Integration of Biological Knowledge and Genomic Data to Identify New Disease Genes
One proposed strategy is to combine pathway analysis of GWAS outcomes with epistasis analysis. Applied to cutaneous melanoma, this strategy identified five melanoma-associated pathways (response to light stimulus, regulation of mitotic cell cycle, induction of programmed cell death, cytokine activity, oxidative phosphorylation) and a significant interaction between the TERF1 and AFAP1L2 genes. This finding is biologically relevant given the key role of TERF1 in telomere biology and the emerging role of telomere dysfunction in melanoma development.
Another strategy is to conduct network-based analysis by integrating genome-wide SNP data and protein-protein interaction networks. We proposed a novel algorithm for network analysis that was applied to GWAS outcomes of two large asthma datasets. We identified a sub-network of 91 genes associated with childhood asthma, of which 22 represent novel candidates.
Finally, we integrated DNA variation and epigenomic (DNA methylation) data to gain more insight into the biological mechanisms underlying asthma plus rhinitis. We showed that a CpG site within the MTNR1A gene (which encodes a melatonin receptor) mediates the association between a genetic variant lying close to MTNR1A and the disease.
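To make the notion of mediation concrete, the following is a minimal sketch of a product-of-coefficients mediation analysis, in which the effect of a variant on an outcome is split into a direct part and an indirect part passing through methylation at a CpG site. It is not the method used in this work, which is not detailed here. The function names (`ols`, `mediation_effects`), the synthetic data, the effect sizes, and the continuous outcome are all assumptions chosen for illustration; a real analysis of a binary disease status would typically use logistic models, adjust for covariates, and assess the indirect effect with bootstrap confidence intervals.

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients for y ~ X, with an intercept prepended."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def mediation_effects(snp, cpg, outcome):
    """Split the SNP effect on the outcome into direct and indirect parts.

    Hypothetical helper (product-of-coefficients approach):
      a        : SNP -> CpG methylation
      b        : CpG methylation -> outcome, adjusted for the SNP
      c_direct : SNP -> outcome, adjusted for CpG methylation
      c_total  : SNP -> outcome, unadjusted
    """
    a = ols(cpg, snp)[1]
    c_total = ols(outcome, snp)[1]
    coef = ols(outcome, np.column_stack([snp, cpg]))
    c_direct, b = coef[1], coef[2]
    return {"indirect": a * b, "direct": c_direct, "total": c_total}


# Synthetic data for illustration only (not real genotypes or methylation).
rng = np.random.default_rng(0)
n = 5000
snp = rng.binomial(2, 0.3, size=n).astype(float)  # allele count 0/1/2
cpg = 0.5 * snp + rng.normal(size=n)              # methylation depends on SNP
outcome = 0.8 * cpg + rng.normal(size=n)          # outcome depends only on CpG

effects = mediation_effects(snp, cpg, outcome)
print(effects)
```

In this simulated setting the variant acts on the outcome only through the CpG site, so the estimated direct effect should be close to zero and the indirect effect close to the total effect, which is the pattern described as full mediation.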
In conclusion, integrating different types of data with various sources of biological information can help uncover the complex mechanisms underlying multifactorial diseases and thus enable further progress towards personalized medicine.