New statistical models developed to further understanding of genes that influence complex disorders
Modern DNA sequencing technology produces large amounts of complex genetic data, which makes the statistical analysis of genetic variants, especially rare ones, challenging. Meta-analysis, a method of combining data from different studies, can be useful for genetic data because it can increase the chance of finding true, significant results. Dr. Mei-Ling Lee, professor in the Department of Epidemiology and Biostatistics, and colleagues conducted the first study to use meta-analysis methods to examine pleiotropy, the phenomenon in which a gene affects multiple traits.
In their study, the authors developed statistical models, specifically multivariate functional linear models (MFLM), to examine associations between multiple genetic variants and multiple traits using data from different studies. To assess these associations, the authors examined three approximate F-distributed test statistics, based on the Pillai-Bartlett trace, the Hotelling-Lawley trace, and Wilks’s Lambda. In data simulations, MFLM were found to perform well on two measures: false-positive rates, or how often false associations are detected, and power, or the likelihood that the statistical methods detect true effects that are present.
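To make the three statistics named above more concrete, the following minimal sketch computes them for an ordinary multivariate linear regression of several traits on several variants. It is not the authors' MFLM: it does not treat genetic data as functions, it does not combine studies, and it stops short of the F approximations. The function name `manova_statistics`, the simulated genotypes, and the effect sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def manova_statistics(X, Y):
    """Wilks's Lambda, Pillai-Bartlett trace, and Hotelling-Lawley trace
    for the null hypothesis that no column of X (variants) is associated
    with any column of Y (traits). Illustrative only; not the MFLM."""
    n = X.shape[0]
    ones = np.ones((n, 1))
    full = np.hstack([ones, X])  # intercept plus all variants

    def residual_sscp(design):
        # Least-squares fit of all traits at once, then the residual
        # sums-of-squares-and-cross-products matrix.
        coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
        resid = Y - design @ coef
        return resid.T @ resid

    E = residual_sscp(full)      # error matrix under the full model
    H = residual_sscp(ones) - E  # hypothesis matrix (variants removed)

    # All three statistics are functions of the eigenvalues of E^{-1} H.
    eig = np.linalg.eigvals(np.linalg.solve(E, H)).real
    eig = np.clip(eig, 0.0, None)  # guard against tiny negative round-off

    return {
        "wilks_lambda": float(np.prod(1.0 / (1.0 + eig))),
        "pillai_bartlett": float(np.sum(eig / (1.0 + eig))),
        "hotelling_lawley": float(np.sum(eig)),
    }

# Hypothetical simulated data: 200 individuals, 3 variants, 2 traits.
rng = np.random.default_rng(0)
n, p, q = 200, 3, 2
X = rng.integers(0, 3, size=(n, p)).astype(float)   # genotypes coded 0/1/2
B = np.array([[0.5, 0.3], [0.0, 0.0], [0.0, 0.0]])  # one variant affects both traits
Y = X @ B + rng.normal(size=(n, q))

stats = manova_statistics(X, Y)
print(stats)
```

A Wilks's Lambda close to 1 and traces close to 0 indicate little evidence of association; a smaller Lambda and larger traces indicate more. In practice each statistic is converted to an approximate F statistic to obtain a p-value, a step omitted here.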
The new statistical methods were then applied to analyze four lipid traits using data from eight European cohorts: low-density lipoprotein (LDL) cholesterol or “bad” cholesterol, high-density lipoprotein (HDL) cholesterol or “good” cholesterol, triglycerides, and total cholesterol. The analyses found that assessing multiple studies and traits together performed better than examining individual studies and single traits separately, as indicated by more and stronger associations with different genetic variants.
Some strengths of MFLM are that they can take missing genetic data into account and can simplify complex genetic data by treating it as functions. Moreover, data on the physical positions of genes are available in most studies and can be used directly in MFLM. Because information on the location of genes is assessed, however, MFLM do require individual-level data. Thus, the findings from this study are valuable for future studies that have genetic data on individuals and can inform future research that uses meta-analysis to analyze data on multiple genetic variants and traits.
The development of new statistical methods, like the MFLM in this study, that improve how genetic data is synthesized and analyzed will enable researchers to better understand how differences in genes can impact various traits, such as those related to complex disorders. By enhancing understanding of pleiotropy (the production by a single gene of multiple apparently unrelated effects), this research can potentially contribute to treatments in the future that are tailored to the genetic make-up of individuals.