Saturday, February 16, 2013
Auditorium/Exhibit Hall C (Hynes Convention Center)
Identifying gene-environment interactions has significant consequences for public health intervention and for understanding complex disease etiology. Knowledge of how genes and environment interact can provide guidance on how a genetically predisposed person can reduce environmental exposure, thereby reducing disease risk. Standard single-marker tests are problematic for large genome-wide association studies because of the large number of tests conducted: multiple testing corrections that do not sufficiently account for the correlation among the tests often result in low power. Existing multi-marker tests also suffer from low power, and they may give unstable results because of the large models fitted. In this paper, we propose a computationally efficient and powerful gene-environment set association test, called GESAT, which allows simultaneous testing of gene-environment interactions under a generalized linear model framework. We first group single nucleotide polymorphisms (SNPs) based on biologically meaningful criteria, and then test the grouped SNPs jointly for gene-environment interactions. GESAT accounts for correlation between the terms tested, leading to reduced degrees of freedom and increased power. The development of GESAT is motivated by a problem in lung cancer, in which we want to investigate whether the effect of variant(s) in the 15q24-25.1 region on lung cancer risk is moderated by smoking. Using simulated SNP data in the 15q24-25.1 region based on the HapMap CEU population, we show that GESAT performs well. Lastly, we apply GESAT to data from the Harvard Lung Cancer Study to investigate the gene-environment interactions between SNPs in the 15q24-25.1 region and smoking.
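
As a rough illustration of what joint testing of a SNP set for gene-environment interaction under a generalized linear model can look like, consider the sketch below. The notation is assumed for illustration and does not appear in this abstract: for subject i, Y_i is the outcome (for example, lung cancer status), X_i a vector of covariates, E_i the environmental exposure (for example, smoking), G_i the vector of genotypes for the p SNPs in the set, g a link function (logit for a binary outcome), and \mu_i the conditional mean of Y_i.

```latex
% Hypothetical notation: a GLM with main effects and a SNP-set-by-environment interaction
g(\mu_i) = X_i^{\top}\alpha_1 + \alpha_2 E_i + G_i^{\top}\alpha_3 + E_i\, G_i^{\top}\beta,
\qquad \mu_i = \mathrm{E}\left[\, Y_i \mid X_i, E_i, G_i \,\right]

% Joint null hypothesis: none of the p SNPs in the set interacts with the exposure
H_0 : \beta = (\beta_1, \dots, \beta_p)^{\top} = 0
```

Under this formulation a single test of H_0 replaces p separate single-marker interaction tests. One way such a test could use fewer than p effective degrees of freedom when the SNPs are correlated is to treat the \beta_j as random effects with a common variance and test whether that variance is zero; this is offered as an assumption about the general approach, since the abstract does not specify how GESAT achieves it.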