Open Access
2008 Statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic study for complex diseases
Yulan Liang, Arpad Kelemen
Statist. Surv. 2: 43-60 (2008). DOI: 10.1214/07-SS026

Abstract

Recent advances in information technology in the biomedical sciences and other applied areas have produced numerous large, diverse data sets with high dimensional feature spaces, which offer a tremendous amount of information and new opportunities for improving the quality of human life. They also create great challenges: the continuous arrival of new data requires researchers to convert raw data into scientific knowledge in order to benefit from it. Association studies of complex diseases using SNP data have become increasingly popular in biomedical research in recent years. In this paper, we review recent statistical advances and challenges in analyzing correlated high dimensional SNP data in genomic association studies of complex diseases. The review covers both general feature reduction approaches for high dimensional correlated data and more specific approaches for SNP data, including unsupervised haplotype mapping, tag SNP selection, and supervised SNP selection using statistical testing/scoring, statistical modeling, and machine learning methods, with an emphasis on how to identify interacting loci.

Citation

Yulan Liang, Arpad Kelemen. "Statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic study for complex diseases." Statist. Surv. 2: 43-60, 2008. https://doi.org/10.1214/07-SS026

Information

Published: 2008
First available in Project Euclid: 28 March 2008

zbMATH: 1196.62144
MathSciNet: MR2520980
Digital Object Identifier: 10.1214/07-SS026

Keywords: complex disease, high dimensional data, single nucleotide polymorphism, statistical methods

Rights: Copyright © 2008 The author, under a Creative Commons Attribution License

Vol.2 • 2008