Aligning protein structures using distance matrices and combinatorial optimization

Inken Wohlers*, Lars Petzold, Francisco S. Domingues, Gunnar W. Klau

*Corresponding author for this work


Structural alignments of proteins are used to identify structural similarities. These similarities can indicate homology or a common or similar function. Several, mostly heuristic, methods are available to compute structural alignments. In this paper, we present a novel algorithm that uses methods from combinatorial optimization to compute provably optimal structural alignments of sparse protein distance matrices. Our algorithm extends an elegant integer linear programming approach proposed by Caprara et al. for the alignment of protein contact maps. We consider two different types of distance matrices, with distances either between Cα atoms or between the two closest atoms of each residue. Via a comprehensive parameter optimization on HOMSTRAD alignments, we determine a scoring function for aligned pairs of distances. We introduce a negative score for non-structural, purely sequence-based parts of the alignment as a means to adjust the locality of the resulting structural alignments. Our approach is implemented in a freely available software tool named PAUL (Protein structural Alignment Using Lagrangian relaxation). On the challenging SISY data set of 130 reference alignments, we compare PAUL to six state-of-the-art structural alignment algorithms: DALI, MATRAS, FATCAT, SHEBA, CA, and CE. Here, PAUL reaches the highest average and median alignment accuracies of all methods and is the most accurate method for more than 30% of the alignments. PAUL is thus a competitive tool for pairwise high-quality structural alignment.
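
As a rough illustration of the input the abstract describes, the following Python sketch builds a sparse distance matrix from one representative point per residue (for instance its Cα atom) by keeping only residue pairs within a distance cutoff. This is not PAUL's implementation: the function name `sparse_distance_matrix`, the cutoff value of 8.0, and the toy coordinates are all assumptions made for the example, and the abstract does not state which threshold or data structure the tool uses.

```python
import math


def sparse_distance_matrix(coords, cutoff):
    """Return {(i, j): distance} for residue pairs i < j within `cutoff`.

    `coords` holds one 3D point per residue (e.g. its C-alpha atom).
    `cutoff` is a hypothetical threshold; the abstract does not give one.
    """
    entries = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                # Only short distances are stored, which keeps the matrix sparse.
                entries[(i, j)] = d
    return entries


# Toy coordinates (made up for illustration): four residues on a straight line.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
]
toy_matrix = sparse_distance_matrix(toy_coords, cutoff=8.0)
```

With these toy points, the pair of first and last residue lies beyond the cutoff and is dropped, while the five closer pairs are kept. A matrix based on the two closest atoms of each residue would follow the same pattern, with the minimum over all atom pairs replacing the single point-to-point distance.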

Original language: English
Pages: 33 - 43
Number of pages: 11
Publication status: Published - 2009
Event: German Conference on Bioinformatics 2009 - Halle-Wittenberg, Germany
Duration: 28.09.2009 - 30.09.2009


Conference: German Conference on Bioinformatics 2009
Abbreviated title: GCB 2009

