
Tools & Resources

Lab Developed Methods

Take a Look at Our Lab’s Methods!

Our lab develops methods for systems-level analysis of high-throughput data. Our packages and tools are written primarily in the R programming language and can be downloaded from GitHub or Bioconductor, or installed as R packages! Depending on the dataset size, our methods can be run locally on your machine or may require a high-performance computing cluster, such as Quest at Northwestern. Find links to and descriptions of our methods below!

Please feel free to contact us if you have questions or run into issues using our code!

Network Analysis Methods

 

GeneSurrounder

Network-Based Pathway Analysis

Paper
GitHub Code

GeneSurrounder is an algorithm that ranks genes by the evidence that they are sources of disruption in the network of interacting genes. Because the effects of a “disruptive” source gene would propagate outward through the interaction network, we find these genes by searching for a telltale pattern of attenuating and correlated biological signal in the data.
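To make the "attenuating signal" idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the GeneSurrounder implementation: the toy network, the scores, and the helper names (`bfs_distances`, `kendall_tau`, `decay_score`) are all hypothetical. It measures whether the absolute differential-expression score falls off with network distance from a candidate source gene, using a simple Kendall rank correlation.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adjacency, source):
    """Shortest-path distance (in edges) from source to every reachable gene."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def decay_score(adjacency, scores, source):
    """Rank correlation between distance from source and |score|.

    A strongly negative value means the signal attenuates with distance,
    which is the pattern expected around a disruptive source gene.
    """
    distances = bfs_distances(adjacency, source)
    genes = [g for g in distances if g != source]
    return kendall_tau(
        [distances[g] for g in genes],
        [abs(scores[g]) for g in genes],
    )


# Hypothetical toy data: a chain A - B - C - D with signal strongest at A.
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
toy_scores = {"A": 3.0, "B": 2.0, "C": 1.0, "D": 0.5}

for gene in toy_network:
    print(gene, decay_score(toy_network, toy_scores, gene))
```

In this toy chain, gene A gets the most negative score because the signal shrinks steadily as one moves away from it, while gene D gets a positive score. A real analysis would also need the correlation component and a significance assessment, which this sketch omits.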

 

Time-lagged Ordered Lasso for network inference

Network Inference

Paper
GitHub Code

Accurate gene regulatory networks can be used to explain the emergence of different phenotypes, disease mechanisms, and other biological functions. We adapted the time-lagged Ordered Lasso, a regularized regression method with temporal monotonicity constraints, for de novo reconstruction of these networks. We also developed a semi-supervised method that embeds prior network information into the Ordered Lasso to discover novel regulatory dependencies in existing pathways.
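For readers unfamiliar with the temporal monotonicity constraint, the general time-lagged ordered-lasso problem can be sketched as below. This is the generic form from the ordered-lasso literature, not necessarily the exact objective used in the linked code; the symbols $y_t$, $x_{t-k,j}$, $\beta_{jk}$, $K$, and $\lambda$ are notation chosen for this sketch.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2}\sum_{t}\Bigl(y_t - \beta_0
  - \sum_{j=1}^{p}\sum_{k=1}^{K} x_{t-k,\,j}\,\beta_{jk}\Bigr)^{2}
+ \lambda \sum_{j=1}^{p}\sum_{k=1}^{K} \lvert\beta_{jk}\rvert
\quad\text{subject to}\quad
\lvert\beta_{j1}\rvert \ge \lvert\beta_{j2}\rvert \ge \dots \ge \lvert\beta_{jK}\rvert
\;\;\text{for each } j .
```

Here $y_t$ is the expression of a target gene at time $t$ and $x_{t-k,j}$ is the expression of candidate regulator $j$ at lag $k$. The constraint says that, for each regulator, a more recent time point has at least as much influence as an older one, and the $\lambda$ penalty shrinks weak dependencies to zero.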

 

postPSLR

Network Inference

Paper
GitHub Code

Inferring the structure of gene regulatory networks from high-throughput datasets remains an important and unsolved problem. We developed a semi-supervised network reconstruction algorithm that combines information from partially known networks with time course gene expression data. We adapted partial least squares with variable importance in projection (VIP) for time course data. Reference networks are used to simulate expression data, from which null distributions of VIP scores are generated; these null distributions are then used to estimate edge probabilities for the input expression data.

 

PoDA

Network-Based Pathway Analysis

Paper
Code

  • PoDA.R – Main script to perform the PoDA calculations
  • PoDA-example.R – Example usage of PoDA.R
  • PoDA-example-data.RData – Data used in PoDA-example.R (necessary for example)
  • plotSvals.R – A script to generate the boxplots of the S values, similar to those shown in the paper.

PoDA is a pathway-based, multi-SNP analysis method for GWAS data. The method is based on the hypothesis that if a pathway is related to disease risk, cases will appear more similar to other cases than to controls for the SNPs associated with that pathway. By systematically applying the method to all pathways of potential interest, we can identify those for which the hypothesis holds true, i.e., pathways containing SNPs for which the samples exhibit greater within-class similarity than across-class similarity. PoDA improves on existing single-SNP and SNP-set enrichment analyses in that it does not require the SNPs in a pathway to exhibit independent main effects.
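The within-class versus across-class similarity idea can be illustrated with a short Python sketch. This is not PoDA.R and does not reproduce the statistic from the paper: the Manhattan distance, the leave-one-out scoring, the 0/1/2 genotype coding, and the function names (`manhattan`, `mean_distance`, `similarity_scores`) are assumptions made purely for illustration.

```python
def manhattan(a, b):
    """Manhattan distance between two genotype vectors (coded 0/1/2)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mean_distance(sample, others):
    """Average distance from one sample to a group of other samples."""
    return sum(manhattan(sample, other) for other in others) / len(others)


def similarity_scores(cases, controls):
    """Score each sample as (mean distance to controls) - (mean distance to cases).

    Each sample is left out of its own group before averaging. If the pathway
    SNPs carry signal, cases tend to score positive and controls negative.
    """
    scores = []
    for i, sample in enumerate(cases):
        other_cases = cases[:i] + cases[i + 1:]
        score = mean_distance(sample, controls) - mean_distance(sample, other_cases)
        scores.append(("case", score))
    for i, sample in enumerate(controls):
        other_controls = controls[:i] + controls[i + 1:]
        score = mean_distance(sample, other_controls) - mean_distance(sample, cases)
        scores.append(("control", score))
    return scores


# Hypothetical genotypes for the SNPs in one pathway (3 SNPs per sample).
toy_cases = [(2, 2, 1), (2, 1, 2), (1, 2, 2)]
toy_controls = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]

for label, score in similarity_scores(toy_cases, toy_controls):
    print(label, score)
```

In this toy example every case scores positive and every control scores negative, which is the separation the hypothesis predicts for a disease-related pathway. No individual SNP needs to show a main effect on its own for the joint pattern to appear.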

Circadian Biology Methods

TimeSignature

Circadian Time Prediction

Paper
GitHub Code

TimeSignature is a machine-learning approach to predict physiological time based on gene expression in human blood. A powerful feature is TimeSignature’s generalizability, enabling it to be applied to samples from disparate studies and yield highly accurate results despite systematic differences between the studies. This quality is unique among expression-based predictors and addresses a major challenge in the development of reliable and clinically useful biomarker tests.

Resources

Getting Up To Speed!

Getting up to speed on the ever-growing field of computational biology research can be a challenge. Below are links to a range of coding, biology, and statistics resources we have found useful in our research.

High-Throughput Data Analysis Resources

Network Analysis Resources

Getting Up To Speed with Network Analysis

  1. Coming Soon!

Integrative Omics Analysis Resources

Getting Up To Speed with Integrative Omics Analysis

  1. Coming Soon!

Circadian Biology Resources

Getting Up To Speed with Circadian Biology

  1. Coming Soon!

Coding Resources

Getting Up To Speed with R

  1. Coding Cheat Sheets
    1. RStudio IDE Cheat Sheet
    2. R Markdown Cheat Sheet
    3. ggplot2 Cheat Sheet
    4. Shiny Cheat Sheet
    5. Data Transformation with dplyr Cheat Sheet
    6. Data Import Cheat Sheet
    7. R Package Development Cheat Sheet

Getting Up To Speed with GitHub

  1. GitHub User Guide

Getting Up To Speed with the Northwestern Quest Supercomputer

  1. Quest User Guide

Other Useful Links

  1. Data Visualization – Color Selection
    1. Color Brewer
    2. Color Picker
  2. Textbooks
    1. An Introduction to Statistical Learning with Applications in R – By Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
