A comprehensive evaluation of collapsing methods using simulated and real data: Excellent annotation of functionality and large sample sizes required

Carmen Dering, Inke R. König, Laura B. Ramsey, Mary V. Relling, Wenjian Yang, Andreas Ziegler*

*Corresponding author for this work

Abstract

The advent of next-generation sequencing (NGS) technologies has enabled investigation of the rare variant-common disease hypothesis in unrelated individuals, even at the genome-wide level. Testing this hypothesis requires tailored statistical methods because single-marker tests fail for rare variants. An entire class of statistical methods collapses, i.e., aggregates, the rare variants within a genomic region of interest (ROI). In an extensive simulation study using data from the Genetic Analysis Workshop 17, we compared the performance of 15 collapsing methods across a variety of pre-defined ROIs that differed in minor allele frequency threshold and functionality. Findings of the simulation study were additionally confirmed in a real data set on the association between methotrexate clearance and the SLCO1B1 gene in patients with acute lymphoblastic leukemia. Our analyses showed substantially inflated type I error levels for many of the proposed collapsing methods. Only four approaches yielded valid type I error levels in all considered scenarios. None of the statistical tests detected true associations in a substantial proportion of replicates in the simulated data. Detailed annotation of the functionality of variants is crucial for detecting true associations. These findings were confirmed in the analysis of the real data. Recent theoretical work showed that high power is achieved in gene-based analyses only if sample sizes are large and a substantial proportion of the rare variants included in the analysis are causal. Many of the investigated statistical approaches rely on permutation, which incurs high computational cost. There is a clear need for valid, powerful, and fast-to-calculate test statistics for studies investigating rare variants.
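
To make the idea of collapsing concrete, the following is a minimal sketch of one simple indicator-style collapsing test with a permutation p-value. It is an illustration only and is not claimed to be any of the 15 methods compared in the study. The function names, the 0/1/2 minor-allele genotype coding, the default threshold of 0.01, and the choice of test statistic (difference in carrier proportions between cases and controls) are all assumptions made for this example.

```python
import random


def minor_allele_freq(genotypes, j):
    """MAF of variant j, assuming genotypes count minor alleles (0, 1 or 2)."""
    n = len(genotypes)
    return sum(row[j] for row in genotypes) / (2 * n)


def collapse(genotypes, maf_threshold=0.01):
    """Collapse an ROI into one indicator per individual.

    Returns 1 if the individual carries at least one allele of any variant
    whose MAF is at or below the threshold, else 0. (Hypothetical helper.)
    """
    n_variants = len(genotypes[0])
    rare = [j for j in range(n_variants)
            if 0 < minor_allele_freq(genotypes, j) <= maf_threshold]
    return [int(any(row[j] > 0 for j in rare)) for row in genotypes]


def carrier_difference(carriers, status):
    """Carrier proportion in cases (status 1) minus that in controls (status 0)."""
    cases = [c for c, s in zip(carriers, status) if s == 1]
    controls = [c for c, s in zip(carriers, status) if s == 0]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


def permutation_p_value(carriers, status, n_perm=1000, seed=0):
    """Two-sided permutation p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = abs(carrier_difference(carriers, status))
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(carrier_difference(carriers, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

The permutation loop recomputes the statistic once per shuffle, which shows why permutation-based approaches become computationally expensive when many ROIs and many replicates are analyzed.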

Original language: English
Article number: 323
Journal: Frontiers in Genetics
Volume: 5
Issue number: SEP
ISSN: 1664-8021
Publication status: Published - 2014
