Computational and statistical methods are increasingly central to biomedical research as high-throughput genomic assays produce ever-larger datasets. Our research focuses on analytical methods for elucidating the genetic and epigenetic basis of cancer and other diseases from genomic datasets. To this end, we design the preprocessing methods needed to transform raw data from high-throughput genomic assays into reliable measures of the underlying biological processes. We also develop statistical methods for the integrative analysis of multiple genomic and epigenomic data types.

We have applied these preprocessing and integrative analysis methods to mapping and understanding disease-associated epigenetic dysregulation. Epigenetic mechanisms involve chemical and physical changes to the structure of DNA without changes in the underlying sequence. They allow for fine-tuned control of gene expression and are the basic mechanism whereby a single human genome can give rise to over 200 different cell types in the adult organism. Our computational tools for mapping epigenomic landscapes have recently been applied in collaborative projects in cancer, stem cell biology, and development. For example, we found evidence of incomplete epigenetic reprogramming when adult differentiated cells are converted into pluripotent stem cells, highlighting a significant hurdle for regenerative medicine. We are also interested in epidemiology, where we are developing DNA methylation mapping methods to lay the foundations for epigenome-wide association studies of common diseases. This work confronts a central challenge of epigenomics: deciphering the causal relationships among disease, genetic factors, and epigenetic factors.