Protein-Protein Interaction Prediction
Protein-protein interaction prediction is a field combining bioinformatics and structural biology in an attempt to identify and catalog interactions between pairs or groups of proteins. Understanding protein-protein interactions
is important in investigating intracellular signaling pathways.
Experimentally, interactions between pairs of proteins are inferred
from yeast two-hybrid systems, from affinity purification/mass spectrometry assays, or from protein microarrays. In parallel to the experimental determination of the interactome, computational methods are being developed.
Methods
Proteins that interact are more likely to co-evolve[1][2][3][4], so
inferences about interactions between pairs of proteins can be drawn
from their phylogenetic distances. It has also
been observed in some cases that pairs of interacting proteins have
fused orthologues in other organisms. In addition, a number of bound
protein complexes have been structurally solved and can be used to
identify the residues that mediate the interaction so that similar
motifs can be located in other organisms.
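The gene-fusion observation above can be turned into a simple search. The following is a minimal sketch, assuming each protein in the query organism has already been assigned to a single family and each protein in the other organism has been annotated with the families its regions match; the function name, the family labels and the toy data are all hypothetical.

```python
from itertools import combinations

def fusion_candidates(query_proteins, other_genome):
    """Propose query protein pairs whose families are fused elsewhere.

    query_proteins: dict mapping protein name -> family label (query organism).
    other_genome:   dict mapping protein name -> list of family labels matched
                    by different regions of that protein (other organism).
    Returns a sorted list of (protein_a, protein_b, fused_protein) tuples.
    """
    # Index the query proteins by family for quick lookup.
    by_family = {}
    for protein, family in query_proteins.items():
        by_family.setdefault(family, []).append(protein)

    candidates = set()
    for fused_protein, families in other_genome.items():
        # Every pair of distinct families found in one protein is a fusion.
        for fam_a, fam_b in combinations(sorted(set(families)), 2):
            for prot_a in by_family.get(fam_a, []):
                for prot_b in by_family.get(fam_b, []):
                    candidates.add((prot_a, prot_b, fused_protein))
    return sorted(candidates)

# Hypothetical toy data: protA and protB are separate in the query organism,
# but their families F1 and F2 occur together in the single protein "fusedX".
query_proteins = {"protA": "F1", "protB": "F2", "protC": "F3"}
other_genome = {"fusedX": ["F2", "F1"], "singleY": ["F3"]}
candidates = fusion_candidates(query_proteins, other_genome)
```

A fused homologue only suggests a functional link between the two query proteins; as with the other methods described here, it is circumstantial evidence rather than proof of physical interaction.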
Phylogenetic profiling
Phylogenetic profiling [5]
finds pairs of protein families with similar patterns of presence or
absence across large numbers of species. This method identifies pairs
likely to act in the same biological process, but does not necessarily
imply physical interaction.
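As an illustration, the comparison of presence/absence patterns can be sketched as follows. This is a minimal sketch, not the published procedure: the use of Jaccard similarity, the 0.8 cutoff, the function names and the toy profiles are all assumptions made for the example.

```python
from itertools import combinations

def jaccard(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles (0/1 tuples)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0

def similar_profile_pairs(profiles, min_similarity=0.8):
    """Return (family_a, family_b, similarity) for pairs at or above the cutoff.

    profiles: dict mapping family name -> tuple of 0/1 values, one per species,
    with the species in the same order for every family.
    """
    hits = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        score = jaccard(prof_a, prof_b)
        if score >= min_similarity:
            hits.append((name_a, name_b, score))
    return hits

# Hypothetical toy data: one column per species, 1 = family present.
profiles = {
    "famA": (1, 0, 1, 1, 0, 1),
    "famB": (1, 0, 1, 1, 0, 1),
    "famC": (0, 1, 0, 0, 1, 0),
    "famD": (1, 0, 1, 0, 0, 1),
}
pairs = similar_profile_pairs(profiles)
```

In this toy data only famA and famB share an identical profile, so they are the only pair reported; famD differs from both in one species and falls below the assumed cutoff.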
Prediction of co-evolved protein pairs based on similar phylogenetic trees
This method[6] involves using a sequence search tool such as BLAST for finding homologues of a pair of proteins, then building multiple sequence alignments with alignment tools such as Clustal.
From these multiple sequence alignments, phylogenetic distance matrices
are calculated for each protein in the hypothesized interacting pair.
If the two matrices are sufficiently similar (as measured by their Pearson correlation coefficient), the proteins are deemed likely to interact.
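The final comparison step can be sketched in a few lines. This is a minimal sketch, assuming the two distance matrices have already been computed from the alignments and that their rows and columns refer to the same species in the same order; the function names and the toy matrices are hypothetical, and no particular correlation cutoff is implied.

```python
from math import sqrt

def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / sqrt(var_x * var_y)

def tree_similarity(dist_a, dist_b):
    """Correlate the pairwise distances of two protein families."""
    if len(dist_a) != len(dist_b):
        raise ValueError("matrices must cover the same set of species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))

# Hypothetical toy matrices over the same four species.
dist_a = [
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.4, 0.5],
    [0.4, 0.4, 0.0, 0.2],
    [0.5, 0.5, 0.2, 0.0],
]
# dist_b evolves twice as fast as dist_a but with the same tree shape.
dist_b = [[2 * value for value in row] for row in dist_a]
# dist_c has an unrelated tree shape.
dist_c = [
    [0.0, 0.5, 0.2, 0.1],
    [0.5, 0.0, 0.3, 0.2],
    [0.2, 0.3, 0.0, 0.5],
    [0.1, 0.2, 0.5, 0.0],
]
r_similar = tree_similarity(dist_a, dist_b)
r_different = tree_similarity(dist_a, dist_c)
```

Because the Pearson coefficient is insensitive to scale, families that evolve at different overall rates but share the same tree shape still correlate strongly, as dist_a and dist_b do here.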
Identification of homologous interacting pairs
This method [7]
checks whether the two query sequences have homologues that form a
complex in a database of known complex structures. Domains are
identified by sequence searches against domain databases such as Pfam
using BLAST.
If more than one complex of Pfam domains is identified, the query
sequences are aligned, using the hidden Markov model tool HMMER,
to the closest identified homologues whose structures are known. The
alignments are then analysed to check whether the contact residues of
the known complex are conserved.
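The last step, checking contact residues in the alignment, can be illustrated as follows. This is a minimal sketch that treats a contact residue as conserved only when the query residue is identical to the template residue; that identity criterion, the function name and the toy alignment are assumptions made for the example, and the published method may score conservation differently.

```python
def contact_conservation(aligned_query, aligned_template, contact_positions):
    """Fraction of template contact residues that are identical in the query.

    aligned_query, aligned_template: equal-length strings from a pairwise
        alignment, with '-' marking gaps.
    contact_positions: 0-based indices into the ungapped template sequence
        of the residues that touch the partner in the known complex.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    contacts = set(contact_positions)
    template_index = -1  # position in the ungapped template
    seen = 0
    conserved = 0
    for q_res, t_res in zip(aligned_query, aligned_template):
        if t_res == "-":
            continue  # insertion in the query: no template residue here
        template_index += 1
        if template_index in contacts:
            seen += 1
            if q_res == t_res:
                conserved += 1
    return conserved / seen if seen else 0.0

# Hypothetical toy alignment; the template's contact residues are K, L, V, G.
aligned_template = "MKT-LVAGE"
aligned_query = "MRTALV-GE"
score = contact_conservation(aligned_query, aligned_template, [1, 3, 4, 6])
```

Here three of the four contact residues (L, V and G) are retained in the query, while the K at template position 1 is replaced by R. The same calculation would be run for each of the two query sequences against its own template chain.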
Identification of structural patterns
This method[8][9] builds a library of known protein-protein interfaces from the PDB, where an interface is defined as a pair of polypeptide fragments whose atoms lie within a distance threshold slightly larger than the sum of the Van der Waals radii
of the atoms involved. The sequences in the library are then clustered
based on structural alignment and redundant sequences are eliminated.
Residues that occur at a given position with high frequency
(generally >50%) are considered hotspots[10].
This library is then used to identify potential interactions between
pairs of targets, provided that they have a known structure (i.e.
are present in the PDB).
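Two of the steps above lend themselves to a short illustration: selecting interface residues by a distance cutoff and flagging hotspot positions by residue frequency. This is a minimal sketch under stated assumptions: the cutoff is taken as the sum of the two atoms' Van der Waals radii plus a 0.5 Å tolerance, the atom records are simplified tuples rather than parsed PDB entries, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import dist

# Commonly used Van der Waals radii in angstroms for a few elements.
VDW_RADIUS = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}

def interface_residues(chain_a, chain_b, tolerance=0.5):
    """Return the residue ids of each chain that contact the other chain.

    Each chain is a list of (residue_id, element, (x, y, z)) atom records.
    Two atoms are in contact when their distance is below the sum of their
    radii plus `tolerance` (the tolerance value is an assumption).
    """
    contacts_a, contacts_b = set(), set()
    for res_a, elem_a, xyz_a in chain_a:
        for res_b, elem_b, xyz_b in chain_b:
            cutoff = VDW_RADIUS[elem_a] + VDW_RADIUS[elem_b] + tolerance
            if dist(xyz_a, xyz_b) < cutoff:
                contacts_a.add(res_a)
                contacts_b.add(res_b)
    return contacts_a, contacts_b

def hotspot_positions(aligned_interfaces, min_fraction=0.5):
    """Map alignment position -> residue found in more than `min_fraction`
    of the aligned interface sequences ('-' marks a gap)."""
    n = len(aligned_interfaces)
    hotspots = {}
    for pos, column in enumerate(zip(*aligned_interfaces)):
        counts = Counter(res for res in column if res != "-")
        if not counts:
            continue  # column contains only gaps
        residue, count = counts.most_common(1)[0]
        if count / n > min_fraction:
            hotspots[pos] = residue
    return hotspots

# Hypothetical toy atoms: only the first atom of each chain is close enough.
chain_a = [(10, "C", (0.0, 0.0, 0.0)), (11, "N", (10.0, 0.0, 0.0))]
chain_b = [(55, "O", (3.5, 0.0, 0.0)), (56, "C", (20.0, 0.0, 0.0))]
interface = interface_residues(chain_a, chain_b)

# Hypothetical cluster of four structurally aligned interface fragments.
cluster = ["WLKD", "WLRD", "WIRE", "FLKD"]
hotspots = hotspot_positions(cluster)
```

In the toy cluster, position 2 is split evenly between K and R, so it does not exceed the 50% threshold and is not reported as a hotspot.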
Bayesian network modelling
Bayesian methods [11]
integrate data from a wide variety of sources, including both
experimental results and prior computational predictions, and use these
features to assess the likelihood that a particular potential protein
interaction is a true positive result. These methods are useful because
experimental procedures, particularly the yeast two-hybrid experiments,
are extremely noisy and produce many false positives, while the
previously mentioned computational methods can only provide
circumstantial evidence that a particular pair of proteins might
interact.
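One simple way to combine such evidence is a naive Bayes formulation, in which each evidence source contributes a likelihood ratio and the sources are assumed to be conditionally independent. The following is a minimal sketch of that idea, not a reproduction of any published model; the function names, the prior odds and the likelihood ratios are hypothetical values chosen for the example.

```python
def likelihood_ratio(p_given_interacting, p_given_non_interacting):
    """How much more often a piece of evidence is seen for true interactions
    than for non-interacting pairs."""
    return p_given_interacting / p_given_non_interacting

def posterior_odds(prior_odds, likelihood_ratios):
    """Multiply the prior odds by each likelihood ratio (naive Bayes,
    assuming the evidence sources are conditionally independent)."""
    odds = prior_odds
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds

def odds_to_probability(odds):
    """Convert odds in favour of an interaction to a probability."""
    return odds / (1.0 + odds)

# Hypothetical numbers: interactions are rare a priori (1 in 500 pairs),
# and three evidence sources each favour an interaction to a different degree.
prior = 1 / 500
ratios = [
    likelihood_ratio(0.20, 0.01),  # e.g. reported by a two-hybrid screen
    likelihood_ratio(0.50, 0.05),  # e.g. correlated expression
    likelihood_ratio(0.25, 0.05),  # e.g. similar phylogenetic profiles
]
odds = posterior_odds(prior, ratios)
probability = odds_to_probability(odds)
```

With these assumed values the combined likelihood ratio is roughly 1000, which raises the odds from 1:500 to about 2:1. No single source would be enough on its own to overcome the low prior, which is why integrating several noisy sources is useful.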
Relationship to docking methods
The field of protein-protein interaction prediction is closely related to the field of protein-protein docking,
which attempts to use geometric and steric considerations to fit two
proteins of known structure into a bound complex. Docking is a useful
mode of inquiry when both proteins in the pair have known
structures and are known (or at least strongly suspected) to interact.
Because so many proteins lack experimentally determined
structures, however, sequence-based interaction prediction methods are
especially useful in conjunction with experimental studies of an
organism's interactome.
References
- Dandekar T., Snel B., Huynen M. and Bork P. (1998) "Conservation of gene order: a fingerprint of proteins that physically interact." Trends Biochem. Sci., 23, 324-328
- Enright A.J., Iliopoulos I., Kyrpides N.C. and Ouzounis C.A. (1999) "Protein interaction maps for complete genomes based on gene fusion events." Nature, 402, 86-90
- Marcotte E.M., Pellegrini M., Ng H.L., Rice D.W., Yeates T.O. and Eisenberg D. (1999) "Detecting protein function and protein-protein interactions from genome sequences." Science, 285, 751-753
- Pazos F. and Valencia A. (2001) "Similarity of phylogenetic trees as indicator of protein-protein interaction." Protein Engineering, 14 (9), 609-614
- Pellegrini M., Marcotte E.M., Thompson M.J., Eisenberg D. and Yeates T.O. (1999) "Assigning protein functions by comparative genome analysis: protein phylogenetic profiles." Proc. Natl. Acad. Sci. U.S.A., 96, 4285-4288
- Tan S.H., Zhang Z. and Ng S.K. (2004) "ADVICE: Automated Detection and Validation of Interaction by Co-Evolution." Nucl. Ac. Res., 32 (Web Server issue), W69-72
- Aloy P. and Russell R.B. (2003) "InterPreTS: Protein Interaction Prediction through Tertiary Structure." Bioinformatics, 19 (1), 161-162
- Aytuna A.S., Keskin O. and Gursoy A. (2005) "Prediction of protein-protein interactions by combining structure and sequence conservation in protein interfaces." Bioinformatics, 21 (12), 2850-2855
- Ogmen U., Keskin O., Aytuna A.S., Nussinov R. and Gursoy A. (2005) "PRISM: protein interactions by structural matching." Nucl. Ac. Res., 33 (Web Server issue), W331-336
- Keskin O., Ma B. and Nussinov R. (2004) "Hot regions in protein-protein interactions: The organization and contribution of structurally conserved hot spot residues." J. Mol. Biol., 345, 1281-1294
- Jansen R., Yu H., Greenbaum D., Kluger Y., Krogan N.J., Chung S., Emili A., Snyder M., Greenblatt J.F. and Gerstein M. (2003) "A Bayesian networks approach for predicting protein-protein interactions from genomic data." Science, 302 (5644), 449-453
This article is licensed under the GNU Free Documentation License. It uses material from Wikipedia Encyclopedia article "Protein-Protein Interaction Prediction"