SNPboost: Interaction analysis and risk prediction on GWA data

Ingrid Brænne, Jeanette Erdmann, Amir Madany Mamlouk

Abstract

Genome-wide association (GWA) studies, which typically aim to identify single nucleotide polymorphisms (SNPs) associated with a disease, yield large amounts of high-dimensional data. GWA studies have been successful in identifying single SNPs associated with complex diseases. However, most of the associations identified so far have only a limited impact on risk prediction. Recent studies applying support vector machines (SVMs) have improved risk prediction for type I and type II diabetes; a drawback, however, is the poor interpretability of the classifier. Training the SVM on only a subset of SNPs requires a preselection, typically by p-value. Especially for complex diseases, this might not be the optimal selection strategy. In this work, we propose an extension of AdaBoost for GWA data, called SNPboost. To improve classification, SNPboost successively selects a subset of SNPs. On real GWA data (German MI family study II), SNPboost outperformed a linear SVM and further improved the performance of a non-linear SVM when used as a preselector. Finally, we argue that the selected SNPs can be put into a biological context.
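
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea: plain discrete AdaBoost whose weak learners each look at a single SNP, so that every boosting round implicitly selects one SNP. It is not the authors' implementation. The 0/1/2 genotype coding, the threshold stumps, and all function names (`fit_stump_boost`, `predict`, `selected_snps`) are assumptions made for illustration.

```python
import math


def stump_output(x, j, thr, pol):
    """Vote of a single-SNP stump: +pol if genotype x[j] exceeds thr, else -pol."""
    return pol if x[j] > thr else -pol


def fit_stump_boost(X, y, n_rounds):
    """Discrete AdaBoost with single-SNP threshold stumps (illustrative only).

    X: list of samples, each a list of genotypes coded 0/1/2 (assumed coding).
    y: list of labels in {-1, +1}.
    Returns a list of (snp_index, threshold, polarity, alpha) tuples.
    """
    n, m = len(X), len(X[0])
    w = [1.0 / n] * n  # start from uniform sample weights
    model = []
    for _ in range(n_rounds):
        best = None
        # Exhaustively search for the stump with the lowest weighted error.
        for j in range(m):
            for thr in (0.5, 1.5):
                for pol in (1, -1):
                    err = sum(w[i] for i in range(n)
                              if stump_output(X[i], j, thr, pol) != y[i])
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol)
        err, j, thr, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((j, thr, pol, alpha))
        # Up-weight misclassified samples, then renormalise.
        w = [w[i] * math.exp(-alpha * y[i] * stump_output(X[i], j, thr, pol))
             for i in range(n)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_output(x, j, thr, pol)
                for j, thr, pol, alpha in model)
    return 1 if score >= 0 else -1


def selected_snps(model):
    """Indices of the SNPs used by at least one stump."""
    return sorted({j for j, _, _, _ in model})


if __name__ == "__main__":
    # Toy data (made up): the label depends only on the SNP at index 1.
    X = [[0, 2, 1], [1, 0, 1], [2, 2, 0], [0, 0, 2], [1, 1, 1], [2, 0, 0]]
    y = [1, -1, 1, -1, 1, -1]
    model = fit_stump_boost(X, y, n_rounds=3)
    print(selected_snps(model))
```

In a sketch like this, the indices returned by `selected_snps` could serve as the preselected subset passed on to another classifier such as an SVM, which is the preselector role the abstract mentions.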
Original language: English
Title of host publication: Artificial Neural Networks and Machine Learning – ICANN 2011
Editors: Timo Honkela, Włodzisław Duch, Mark Girolami, Samuel Kaski
Number of pages: 8
Volume: 6792
Publisher: Springer Verlag
Publication date: 2011
Pages: 111-118
ISBN (Print): 978-3-642-21737-1
ISBN (Electronic): 978-3-642-21738-8
Publication status: Published - 2011
Event: 21st International Conference on Artificial Neural Networks - Espoo, Finland
Duration: 14.06.2011 - 17.06.2011
