Vy-PER: Eliminating false positive detection of virus integration events in next generation sequencing data

Michael Forster*, Silke Szymczak, David Ellinghaus, Georg Hemmrich, Malte Rühlemann, Lars Kraemer, Sören Mucha, Lars Wienbrandt, Martin Stanulla, Andre Franke

*Corresponding author for this work
31 Citations (Scopus)

Abstract

Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses.

Original language: English
Article number: 11534
Journal: Scientific Reports
Volume: 5
ISSN: 2045-2322
Publication status: Published - 13.07.2015

Funding

Eike Zell administered the childhood acute lymphoblastic leukemia sequencing consortium. Markus Schilhabel, Melanie Friskovec, Catharina von der Lancken, and Melanie Schlapkohl performed whole genome next-generation sequencing. Dr. Eva Ellinghaus helped to organise the project and edited the manuscript. Jan Christian Kässens provided informatics support to FPGA users. Dr. Adam Grundhoff advised on the differences between retroviral and HBV integration. We gratefully acknowledge the feedback received on the Vy-PER presentation at the HiTSeq 2014 conference. This work was supported by the German Federal Office for Radiation Protection (BfS); the German Federal Ministry of Education and Research (BMBF) within the framework of the e:Med research and funding concept (grant number 01ZX1306A, sysINFLAME); the Deutsche Forschungsgemeinschaft (DFG) Cluster of Excellence ‘Inflammation at Interfaces’; and the EU Seventh Framework Programme [FP7/2007-2013, grant number 262055, ESGI].

Research Areas and Centers

  • Research Area: Medical Genetics

DFG Research Classification Scheme

  • 2.22-01 Epidemiology, Medical Biometry/Statistics
