The growth of multi-omics datasets, in which SNP, methylation, miRNA expression, gene expression, and proteomic data are all collected for the same set of samples, provides an exciting avenue for investigating the interplay between genetic, epigenetic, transcriptomic, and metabolic processes in the cell. However, most analysis tools are designed for a single omics modality in isolation, leaving open the question of how to integrate these data. Building on our earliest genomics work, we have recently published a series of approaches that integrate genetic (SNP), epigenetic (miRNA expression), and transcriptomic (gene expression) information to provide insights into gene regulation across these layers.
How Can Integrative Computational Methods Solve Real-World Problems?
Elucidating the function of microRNAs – first discovery of regQTLs
An important application of our integrative analysis strategies comes from a series of studies [Study 1, Study 2, Study 3] devoted to advancing our understanding of how microRNAs (miRNAs) regulate gene expression. miRNAs are short, noncoding RNA molecules approximately 22 nucleotides long that bind mRNA transcripts, thereby preventing their translation or inducing their degradation. Because target recognition relies largely on a short seed sequence, miRNAs bind many transcripts with limited specificity, making their mechanistic role unclear. We hypothesized that this non-specificity may enable miRNAs to exert system-level control. Using a nonlinear dimension reduction strategy to summarize gene activity at the pathway level, we identified miRNAs that appear to regulate global patterns of gene expression in a pathway, as reported in Study 2. We next hypothesized that this system-level control could be compromised by genetic alterations that affect the ability of miRNAs to bind critical components of these pathways, and we systematically searched a large public omics database (The Cancer Genome Atlas, TCGA) for evidence of genetic variants that modify the association between miRNA expression and the expression of other genes. In Study 1, we reported the first discovery of “regQTLs”: genetic variants that alter how miRNAs regulate gene expression. Building on our prior highly cited work in molecular dynamics simulation and theory, we are now carrying out molecular dynamics simulations of miRNA:gene binding energies to investigate the biophysical basis of the statistical associations we discovered. Together, these projects illustrate the innovative computational systems biology research done in our group. They leverage expertise in machine learning (nonlinear dimension reduction for pathway summarization), statistics for “big data” and network analysis, and physics (mechanistic validation through biophysical simulation and analysis).
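To make the regQTL idea concrete, the sketch below shows one common way to test whether a genetic variant modifies the association between miRNA expression and gene expression: a linear model with a miRNA-by-genotype interaction term. This is an illustration under stated assumptions, not necessarily the model used in Study 1. The function name `interaction_test`, the additive 0/1/2 genotype coding, the ordinary least squares fit, and the simulated data are all assumptions introduced here.

```python
import numpy as np

def interaction_test(gene, mirna, genotype):
    """Fit gene ~ 1 + miRNA + genotype + miRNA*genotype by ordinary least squares.

    Hypothetical helper for illustration. Returns the four coefficients and the
    t-statistic of the interaction term; a large |t| suggests that the
    miRNA-gene association depends on genotype.
    """
    gene = np.asarray(gene, dtype=float)
    mirna = np.asarray(mirna, dtype=float)
    genotype = np.asarray(genotype, dtype=float)

    # Design matrix: intercept, miRNA, genotype, miRNA x genotype interaction
    X = np.column_stack([np.ones_like(mirna), mirna, genotype, mirna * genotype])
    beta, _, _, _ = np.linalg.lstsq(X, gene, rcond=None)

    # Standard error of each coefficient from the residual variance
    resid = gene - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stat = beta[3] / np.sqrt(cov[3, 3])
    return beta, t_stat

# Simulated data (assumption): the miRNA represses the gene, and each copy of
# the variant allele weakens that repression.
rng = np.random.default_rng(0)
n = 500
mirna = rng.normal(size=n)
genotype = rng.integers(0, 3, size=n)  # additive coding: 0, 1, or 2 alleles
gene = 1.0 - 0.8 * mirna + 0.3 * mirna * genotype + rng.normal(scale=0.5, size=n)

beta, t_stat = interaction_test(gene, mirna, genotype)
print("interaction coefficient:", beta[3])
print("interaction t-statistic:", t_stat)
```

In a genome-wide search, a test of this kind would be repeated over many miRNA, gene, and variant triples, so multiple-testing correction and adjustment for covariates would be needed; both are omitted here for brevity.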