Analysis of high-throughput ancient DNA sequencing data

Martin Kircher*

*Corresponding author for this work

Abstract

Advances in sequencing technologies have dramatically changed the field of ancient DNA (aDNA). It is now possible to generate an enormous quantity of aDNA sequence data both rapidly and inexpensively. As aDNA sequences are generally short, damaged, and at low copy number relative to coextracted environmental DNA, high-throughput approaches offer a tremendous advantage over traditional sequencing approaches in that they enable a complete characterization of an aDNA extract. However, the particular qualities of aDNA also present specific limitations that require careful consideration in data analysis. For example, results of high-throughput analyses of aDNA libraries may include chimeric sequences, sequencing errors and artifacts, damage, and alignment ambiguities due to the short read lengths. Here, I describe typical primary data analysis workflows for high-throughput aDNA sequencing experiments, including (1) separation of individual samples in multiplex experiments; (2) removal of protocol-specific library artifacts; (3) trimming adapter sequences and merging paired-end sequencing data; (4) base quality score filtering or quality score propagation during data analysis; (5) identification of endogenous molecules from an environmental background; (6) quantification of contamination from other DNA sources; and (7) removal of clonal amplification products or the compilation of a consensus from clonal amplification products, and their exploitation for estimation of library complexity.
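
As an illustration of step (7), the following is a minimal sketch of collapsing clonal amplification products into a consensus sequence. It is not the chapter's implementation. It assumes that reads are already aligned, that reads sharing the same chromosome, start, end, and strand are copies of one original molecule, and that they contain no indels, so all sequences in a group have equal length. The record layout and the function names `consensus` and `collapse_duplicates` are hypothetical.

```python
from collections import defaultdict


def consensus(reads):
    """Build a consensus from (sequence, qualities) pairs of equal length.

    At each position, the base with the highest summed Phred quality wins.
    Ties are broken alphabetically so the result is deterministic.
    """
    length = len(reads[0][0])
    bases = []
    for i in range(length):
        support = defaultdict(int)
        for seq, qual in reads:
            support[seq[i]] += qual[i]
        bases.append(max(sorted(support), key=support.get))
    return "".join(bases)


def collapse_duplicates(alignments):
    """Group aligned reads by outer coordinates and return one consensus each.

    `alignments` is an iterable of dicts with the (hypothetical) keys
    chrom, start, end, strand, seq, and qual (a list of Phred scores).
    """
    groups = defaultdict(list)
    for aln in alignments:
        key = (aln["chrom"], aln["start"], aln["end"], aln["strand"])
        groups[key].append((aln["seq"], aln["qual"]))
    return {key: consensus(members) for key, members in groups.items()}


example = [
    {"chrom": "chrM", "start": 100, "end": 104, "strand": "+",
     "seq": "ACGT", "qual": [30, 30, 30, 30]},
    {"chrom": "chrM", "start": 100, "end": 104, "strand": "+",
     "seq": "ACGT", "qual": [30, 30, 30, 30]},
    {"chrom": "chrM", "start": 100, "end": 104, "strand": "+",
     "seq": "ACTT", "qual": [30, 30, 10, 30]},  # low-quality mismatch at position 3
    {"chrom": "chrM", "start": 200, "end": 204, "strand": "-",
     "seq": "GGGG", "qual": [20, 20, 20, 20]},
]
collapsed = collapse_duplicates(example)
```

Comparing the number of input reads with the number of collapsed groups gives the raw duplication counts from which library complexity could be estimated; that estimation is not shown here.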

Original language: English
Title of host publication: Ancient DNA: Methods and Protocols
Editors: Beth Shapiro, Michael Hofreiter
Number of pages: 32
Publication date: 2012
Pages: 197-228
ISBN (Print): 9781617795152
DOIs
Publication status: Published - 2012
