Sparse Coding for Feature Selection on Genome-wide Association Data

Ingrid Brænne, Kai Labusch, Amir Madany Mamlouk


Genome-wide association (GWA) studies provide large amounts of high-dimensional data. GWA studies aim to identify variables that increase the risk for a given phenotype. Univariate examinations have provided some insights, but it appears that most diseases are affected by interactions of multiple factors, which can only be identified through a multivariate analysis. However, multivariate analysis on the discrete, high-dimensional and low-sample-size GWA data is made more difficult by the presence of random effects and nonspecific coupling. In this work, we investigate the suitability of three standard techniques (p-values, SVM, PCA) for analyzing GWA data on several simulated datasets. We compare these standard techniques against a sparse coding approach; we demonstrate that sparse coding clearly outperforms the other approaches and can identify interacting factors in far higher-dimensional datasets than the other three approaches.
Title: Artificial Neural Networks – ICANN 2010
Editors: Konstantinos Diamantaras, Wlodek Duch, Lazaros S. Iliadis
Publisher: Springer Verlag
ISBN (Print): 978-3-642-15818-6
ISBN (electronic): 978-3-642-15819-3
Publication status: Published - 08.2010
Event: 20th International Conference Artificial Neural Networks
- Thessaloniki, Greece
Duration: 15.09.2010 to 18.09.2010

