On the complexity of SNP block partitioning under the perfect phylogeny model

Jens Gramm, Tzvika Hartman, Till Niehoff, Roden Sharan, Till Tantau

Abstract

Recent technologies for typing single nucleotide polymorphisms (SNPs) across a population are producing genome-wide genotype data for tens of thousands of SNP sites. The emergence of such large data sets underscores the importance of algorithms for large-scale haplotyping. Common haplotyping approaches first partition the SNPs into blocks of high linkage-disequilibrium, and then infer haplotypes for each block separately. We investigate an integrated haplotyping approach where a partition of the SNPs into a minimum number of non-contiguous subsets is sought, such that each subset can be haplotyped under the perfect phylogeny model. We show that finding an optimum partition is
-hard even if we are guaranteed that two subsets suffice. On the positive side, we show that a variant of the problem, in which each subset is required to admit a perfect path phylogeny haplotyping, is solvable in polynomial time.
Original languageEnglish
JournalDiscrete Mathematics
Volume2009
Issue number309(18)
Pages (from-to)5610-5617
Publication statusPublished - 2009

Cite this