Abstract
Genome-wide association (GWA) studies provide large amounts of high-dimensional data. GWA studies aim to identify variables that increase the risk for a given phenotype. Univariate examinations have provided some insights, but it appears that most diseases are affected by interactions of multiple factors, which can only be identified through a multivariate analysis. However, multivariate analysis on the discrete, high-dimensional and low-sample-size GWA data is made more difficult by the presence of random effects and nonspecific coupling. In this work, we investigate the suitability of three standard techniques (p-values, SVM, PCA) for analyzing GWA data on several simulated datasets. We compare these standard techniques against a sparse coding approach; we demonstrate that sparse coding clearly outperforms the other approaches and can identify interacting factors in far higher-dimensional datasets than the other three approaches.
Original language | English |
---|---|
Title | Artificial Neural Networks – ICANN 2010 |
Editors | Konstantinos Diamantaras, Wlodek Duch, Lazaros S. Iliadis |
Number of pages | 10 |
Volume | 6352 |
Publisher | Springer Verlag |
Publication date | 08.2010 |
Pages | 337-346 |
ISBN (Print) | 978-3-642-15818-6 |
ISBN (electronic) | 978-3-642-15819-3 |
Publication status | Published - 08.2010 |
Event | 20th International Conference Artificial Neural Networks - Thessaloniki, Greece; Duration: 15.09.2010 → 18.09.2010 |