Influence of tree topology restrictions on the complexity of haplotyping with missing data

Michael Elberfeld*, Ilka Schnoor, Till Tantau

*Corresponding author for this work


Haplotyping, also known as haplotype phase prediction, is the problem of predicting likely haplotypes based on genotype data. One fast haplotyping method is based on an evolutionary model in which a perfect phylogenetic tree is sought that explains the observed data. Unfortunately, when data entries are missing, which is often the case in laboratory data, the resulting formal problem ipph, which stands for incomplete perfect phylogeny haplotyping, is NP-complete. Even radically simplified versions, such as the restriction to phylogenetic trees consisting of just two directed paths from a given root, are still NP-complete; but here, at least, a fixed-parameter algorithm is known. Such drastic and ad hoc simplifications turn out to be unnecessary to make ipph tractable: we present the first theoretical analysis of a parameterized algorithm, which we develop in the course of the paper, that works for arbitrary instances of ipph. This tractability result is optimal insofar as we prove ipph to be NP-complete whenever any of the parameters we consider is not fixed, but part of the input.
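As a rough illustration of the two notions the abstract relies on, the following sketch shows what it means for a pair of haplotypes to explain a genotype with missing entries, and how a set of haplotypes can be checked for compatibility with a perfect phylogeny. The encoding is an assumption and is not taken from the paper: haplotypes are strings over 0/1, genotypes use 0 and 1 for homozygous sites and 2 for heterozygous sites, and "?" marks a missing entry. The function names are hypothetical. The compatibility check is the standard four-gamete test for undirected perfect phylogenies; it is not the parameterized algorithm developed in the paper.

```python
from itertools import combinations

MISSING = "?"  # assumed marker for a missing genotype entry


def explains(h1, h2, genotype):
    """Return True if the haplotype pair (h1, h2) explains the genotype.

    Assumed encoding: '0'/'1' are homozygous sites, '2' is a
    heterozygous site, and MISSING places no constraint on the site.
    """
    for a, b, g in zip(h1, h2, genotype):
        if g == MISSING:
            continue  # a missing entry is compatible with any pair of alleles
        if g == "2":
            if a == b:
                return False  # heterozygous site needs two different alleles
        elif not (a == b == g):
            return False  # homozygous site needs both alleles equal to g
    return True


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test for a list of equal-length 0/1 haplotype strings.

    The haplotypes fit an (undirected) perfect phylogeny exactly when no
    pair of sites shows all four combinations 00, 01, 10 and 11.
    """
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True
```

Under this encoding, the genotype "22" is explained both by the pair ("01", "10") and by the pair ("00", "11"), which is the ambiguity that haplotyping has to resolve. Missing entries enlarge the set of admissible explanations further, which is where the hardness discussed in the abstract comes from.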

Original language: English
Journal: Theoretical Computer Science
Pages (from-to): 38-51
Number of pages: 14
Publication status: Published - 11.05.2012

