TY - JOUR
T1 - Mapping proteins in the presence of paralogs using units of coevolution
AU - El-Kebir, Mohammed
AU - Marschall, Tobias
AU - Wohlers, Inken
AU - Patterson, Murray
AU - Heringa, Jaap
AU - Schönhuth, Alexander
AU - Klau, Gunnar W.
PY - 2013
Y1 - 2013
N2 - We study the problem of mapping proteins between two protein families in the presence of paralogs. This problem arises as a difficult subproblem in coevolution-based computational approaches to protein-protein interaction prediction. Like prior approaches, our method is based on the idea that coevolution implies equal rates of sequence evolution among the interacting proteins, and we provide a first attempt to quantify this notion in a formal statistical manner. We call the units that are central to this quantification scheme the units of coevolution. A unit consists of two mapped protein pairs, and its score quantifies the coevolution of the pairs. This quantification allows us to give a maximum likelihood formulation of the paralog mapping problem and to cast it as a binary quadratic program. CUPID, our software tool based on a Lagrangian relaxation of this formulation, makes it possible, for the first time, to compute state-of-the-art quality pairings in a few minutes of runtime. In summary, we suggest a novel alternative to previously available approaches that is statistically sound and computationally feasible.
UR - http://www.scopus.com/inward/record.url?scp=84901288220&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-14-S15-S18
DO - 10.1186/1471-2105-14-S15-S18
M3 - Journal articles
C2 - 24564758
AN - SCOPUS:84901288220
SN - 1471-2105
VL - 14 Suppl 15
SP - S18
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - S18
ER -