Visualization Projects 2016

The projects below demonstrate the work IDEAS fellows and trainees accomplished during their Visualization for Multi-Dimensional Data (FSS Visualizations) course. The images featured in each presentation use a combination of visual tools, including, but not limited to, Matplotlib, ParaView, VTK, and D3.js. Please read the student quotes for a synopsis of each project.

 

Arjun Punjabi: MRI Visualization of Alzheimer’s Patients

This work is a supplement to my current research in the areas of computer vision and machine learning. The goal of my research is to leverage a large database of medical images taken during an Alzheimer’s disease (AD) clinical study in order to develop an algorithm for computer-aided, or even fully automated, diagnosis of the disease. The methodology takes advantage of a class of state-of-the-art machine learning algorithms that fall under the category of artificial neural networks. These algorithms, typically referred to as “deep learning,” have been used in a variety of applications ranging from financial analytics to self-driving cars.

This visualization project examines the idea of hand-engineering features from the current data. AD is a neurodegenerative disease responsible for dementia, so, from a physical point of view, an AD patient will exhibit brain volume shrinkage, especially in the hippocampus and surrounding brain regions. One might therefore think that an algorithm sensitive simply to volume changes in the hippocampal region would be sufficient. However, the MRI visualizations of healthy and AD patients in my study show that this approach would be too simplistic.

I took the MRI scans in each data set and averaged them, yielding two comprehensive visualizations: an “average” healthy brain and an “average” AD brain. The images in my slides are two slices from each of these volumes. One can see that there is no discernible difference between the categories; in fact, there is only a 2.68% difference in voxel intensity between the average volumes. This shows that attempting to hand-engineer data features would be futile, given that humans cannot perceive a visual difference. It also validates the current methodology of applying deep learning to the problem: while it may be impossible for humans to detect and design for the changes that occur with AD, a deep learning algorithm can find minute patterns in the data. In fact, my current algorithm operates at 92% classification accuracy. Hopefully, further analysis will result in a tool that can help doctors and patients deal with this disease.
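As a rough sketch of how the averaged volumes and the voxel-intensity difference could be computed, a minimal NumPy/nibabel example follows; the file locations, the use of nibabel, and the assumption that the scans are already co-registered are illustrative, not a description of the actual pipeline.

```python
import glob
import numpy as np
import nibabel as nib           # assumed library for reading MRI volumes
import matplotlib.pyplot as plt

def average_volume(paths):
    """Load co-registered MRI volumes and return their voxel-wise mean."""
    volumes = [nib.load(p).get_fdata() for p in paths]
    return np.mean(volumes, axis=0)

# Hypothetical directories of pre-registered scans for each group.
healthy_avg = average_volume(sorted(glob.glob("healthy/*.nii")))
ad_avg = average_volume(sorted(glob.glob("ad/*.nii")))

# Mean relative difference in voxel intensity between the two average volumes.
percent_diff = 100 * np.abs(healthy_avg - ad_avg).mean() / healthy_avg.mean()
print(f"Average voxel-intensity difference: {percent_diff:.2f}%")

# A middle axial slice from each averaged volume, as in the slides.
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(healthy_avg[:, :, healthy_avg.shape[2] // 2], cmap="gray")
ax1.set_title("Average healthy brain")
ax2.imshow(ad_avg[:, :, ad_avg.shape[2] // 2], cmap="gray")
ax2.set_title("Average AD brain")
plt.show()
```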

 

Neda Rohani: Automatic Pigment Identification in Hyperspectral Paintings Using Sparse Modeling

Here, I present the results of applying sparse unmixing to the problem of pigment identification in hyperspectral paintings for two datasets. The first is a mock-up painting for which we know the pigment combination, and the second is a Roman-Egyptian portrait for which we have ground-truth information for only six pixels.

Slide 3, Fig. 1. (a) Pigment spectra, (b) average spectra of indigo, hematite, and the mixture, (c) comparison of the estimated similarity coefficients of four algorithms (SAM, SCM, linear unmixing, and sparse unmixing).

Slide 4, Fig. 2. (a) Studied portrait (arrows point to locations where the ground-truth pigment composition is known), (b) coefficient map of indigo, (c) coefficient map of hematite.

I used MATLAB for the visualizations.
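For readers unfamiliar with sparse unmixing, the sketch below shows one common formulation: each pixel's spectrum is modeled as a sparse, non-negative combination of reference pigment spectra. This is only an illustrative Python stand-in (using scikit-learn's Lasso) for the actual MATLAB implementation, and the endmember library and pixel data are synthetic.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_unmix(pixel_spectrum, endmembers, alpha=1e-3):
    """Estimate sparse, non-negative abundance coefficients for one pixel.

    pixel_spectrum: (n_bands,) reflectance spectrum of a single pixel
    endmembers:     (n_bands, n_pigments) library of reference pigment spectra
    """
    model = Lasso(alpha=alpha, positive=True, fit_intercept=False, max_iter=10000)
    model.fit(endmembers, pixel_spectrum)
    return model.coef_  # one coefficient per pigment; most are driven to zero

# Synthetic example: 100 spectral bands, 12 reference pigments.
rng = np.random.default_rng(0)
library = np.abs(rng.normal(size=(100, 12)))
pixel = 0.7 * library[:, 2] + 0.3 * library[:, 7]  # mixture of two pigments
coefficients = sparse_unmix(pixel, library)
print(np.round(coefficients, 2))
```

Repeating this per-pixel estimate over the whole painting is what produces coefficient maps like those in Fig. 2(b) and 2(c).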

 

Niharika Sravan: Quick Estimate of Possible Binary Progenitors of Type IIb SNe 

Click here to visit the interactive visualization webpage.

With the advent of rapid-cadence, big-data telescopes like the LSST and the wealth of data on the transient sky that will be generated every single day, the bottleneck in making ground-breaking discoveries may be how quickly we can find interesting phenomena and characterize them. To prepare for the challenge of finding something when we don’t know what we’re looking for, researchers can vet their algorithms on phenomena that are better understood. Supernovae are one of the leading candidates for this purpose.

My visualization is a small step in this direction. It displays the family of solutions for binary progenitors of a particular class of supernovae, known as Type IIb supernovae, that match observational constraints on their physical properties. It is intended for researchers who want a quick idea of the region of parameter space that matches progenitor properties inferred from observations. This visualization was developed with D3.js.
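The underlying idea, filtering a grid of binary-evolution models down to the family consistent with the observed progenitor properties, could be sketched roughly as follows; the column names and constraint values are purely illustrative and are not those used in the actual tool.

```python
import pandas as pd

# Hypothetical grid of binary progenitor models (one row per model).
models = pd.DataFrame({
    "primary_mass":   [12.0, 14.0, 16.0, 18.0],  # solar masses
    "log_luminosity": [4.9, 5.0, 5.1, 5.2],      # log(L / Lsun)
    "log_teff":       [3.60, 3.70, 3.75, 3.90],  # log(Teff / K)
})

# Illustrative observational constraints on the progenitor.
constraints = {"log_luminosity": (4.95, 5.15), "log_teff": (3.65, 3.80)}

# Keep only the models that satisfy every constraint.
mask = pd.Series(True, index=models.index)
for column, (low, high) in constraints.items():
    mask &= models[column].between(low, high)

print(models[mask])  # the family of solutions consistent with the observations
```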

 

Vivian Tang: Analyzing Seismic Delay-Time Data to Investigate the Earth’s Structure Beneath East Asia

I am interested in exploring the Earth’s velocity structure beneath East Asia. The Earth’s complete structure is still unknown, but we do know that velocity structure affects seismic delay times. Based on this relationship, I measured body-wave delay times from the International Seismological Centre (ISC) for the years 1960 to 1970 and used GMT and Matplotlib to make histograms and density plots of the delay-time data. These plots revealed a great deal of information about tectonic structure, showing that good visualizations can help us reconstruct an image of the Earth’s velocity structure.
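A minimal Matplotlib sketch of the kind of summary plots described above (a histogram and a density plot of delay times) might look like the following; the delay-time and distance arrays are synthetic stand-ins for the ISC measurements.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for body-wave delay times (seconds) from the ISC catalog.
rng = np.random.default_rng(42)
delays = rng.normal(loc=0.5, scale=1.5, size=5000)
distances = rng.uniform(20, 90, size=delays.size)  # epicentral distance, degrees

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram of delay times.
ax1.hist(delays, bins=60, color="steelblue")
ax1.set_xlabel("Delay time (s)")
ax1.set_ylabel("Count")

# 2-D density plot of delay time versus epicentral distance.
ax2.hexbin(distances, delays, gridsize=40, cmap="viridis")
ax2.set_xlabel("Epicentral distance (deg)")
ax2.set_ylabel("Delay time (s)")

plt.tight_layout()
plt.show()
```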

 

Vicky Chuqiao Yang: Visualizing the United States Congress

Click here to visit the interactive visualization webpage.

The Democratic and Republican parties have dominated U.S. politics for most of the country’s history. In the 2016 presidential race, a large portion of the public was not satisfied with either party’s nominee, bringing increasing attention to third-party possibilities. A survey shows that 58% of U.S. adults say a third political party is needed in the U.S., and a majority has typically supported this position since 2007. Recent surveys also find that an increasing number of U.S. voters would consider a generic third-party nominee, 47% for the 2016 presidential election.

In this project, I created a way to visualize U.S. Congress members’ political positions throughout the institution’s history (data are available from govtrack.org), with a focus on distinguishing members of the Democratic and Republican parties from other parties and independents. Since raw roll-call data are hard to visualize, I use the DW-NOMINATE method (Dynamic, Weighted, Nominal Three-Step Estimation), a scaling method that reduces the dimensionality of the data. The method derives a pairwise distance between every two Congress members by measuring how often they vote together or against each other on bills, then projects the result into a two-dimensional space. The first dimension, which explains most of the variance in the data, can be interpreted as position on government intervention in the economy, while the second dimension can be interpreted as position on social issues such as slavery and civil rights. The method also allows for comparison across time, using members who served multiple terms in Congress as “bridges.”
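DW-NOMINATE itself is more involved, but the basic idea, turning roll-call agreement into pairwise distances and projecting them into two dimensions, can be sketched with a simple disagreement matrix and classical multidimensional scaling; the vote matrix below is synthetic, and MDS is used here only as a rough stand-in for the actual method.

```python
import numpy as np
from sklearn.manifold import MDS

# Synthetic roll-call matrix: rows are members, columns are bills,
# entries are +1 (yea) or -1 (nay).
rng = np.random.default_rng(1)
votes = rng.choice([-1, 1], size=(20, 100))

# Pairwise distance: fraction of bills on which two members disagree.
n_members = votes.shape[0]
distance = np.zeros((n_members, n_members))
for i in range(n_members):
    for j in range(n_members):
        distance[i, j] = np.mean(votes[i] != votes[j])

# Project the members into a two-dimensional "ideological" space.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coordinates = mds.fit_transform(distance)
print(coordinates[:5])  # (x, y) position of the first five members
```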

From the visualization, we can see that the U.S. Congress has become more divided along party lines over time. The Democratic and Republican parties have been moving toward the political extremes since the 1950s, and during the period in which these two parties have dominated the political landscape, third-party representation has decreased over time.

 

Michael Zevin: Visualization of LIGO Noise: Gravity Spy

As the most sensitive gravitational-wave experiment ever created, the advanced Laser Interferometer Gravitational-wave Observatory (aLIGO) is highly susceptible to transient, non-cosmic noise artifacts called glitches. Since glitches make it more difficult to detect gravitational waves, the proper characterization and analysis of such noise features is useful for improving aLIGO’s sensitivity to gravitational waves.

Over the past year, I have worked to develop a project that combines crowd-sourcing and machine learning to classify glitches in aLIGO data. These two tools feed off of each other to build a superior classifier: the output of the crowd-sourcing component helps train the machine learning algorithms, and the machine learning algorithms feed the most questionable glitches back to the citizen volunteers for further morphological classification.

This visualization shows machine learning classification results for all of the glitches recorded by aLIGO during its first observing run. The colors indicate the type of glitch that the machine learning algorithms predicted for each event. By selecting and deselecting boxes, one can limit which glitches are shown, making it easier to see trends in the properties of a glitch category and similarities between different categories. This visualization was made using D3.js, a tool for creating web-based visualizations. It is primarily useful for members of the LIGO Scientific Collaboration who wish to analyze the state of the detectors and eliminate problematic noise, improving aLIGO’s sensitivity to the gravitational universe.
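The select/deselect interaction described above amounts to filtering the glitch catalog by predicted class before plotting. A rough Python analogue is sketched below; the column names and class labels are hypothetical and do not necessarily match the actual Gravity Spy data products.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical glitch catalog with machine-learning class predictions.
glitches = pd.DataFrame({
    "gps_time":        [1126259462.4, 1126259500.1, 1126259600.7, 1126259700.2],
    "peak_frequency":  [60.0, 250.0, 35.0, 1200.0],  # Hz
    "snr":             [8.5, 12.1, 7.2, 25.4],
    "predicted_class": ["Scattered_Light", "Blip", "Scattered_Light", "Whistle"],
})

# Classes the user has toggled on (the checkboxes in the D3.js view).
selected = {"Blip", "Whistle"}
subset = glitches[glitches["predicted_class"].isin(selected)]

# Scatter the selected glitches in time-frequency space, colored by class.
for label, group in subset.groupby("predicted_class"):
    plt.scatter(group["gps_time"], group["peak_frequency"], label=label)
plt.xlabel("GPS time (s)")
plt.ylabel("Peak frequency (Hz)")
plt.yscale("log")
plt.legend()
plt.show()
```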