Protein Threading
Threading is a method for the computational prediction of protein structure from protein sequence.
Protein threading or fold recognition refers to a class of
computational methods for predicting the structure of a protein from
amino acid sequence. The basic idea is that the target sequence (the
protein sequence for which the structure is being predicted) is threaded through the backbone structures of a collection of template proteins (known as the fold library)
and a “goodness of fit” score is calculated for each sequence-structure
alignment. This goodness of fit is often expressed as an
empirical energy function, based on statistics derived from known
protein structures, but many other scoring functions have been proposed
and tried over the years. The most useful scoring functions include
both pairwise terms (interactions between pairs of amino acids) and solvation
terms. Threading methods share some of the characteristics of both
comparative modelling methods (the sequence alignment aspect) and ab initio prediction methods (predicting structure based on identifying low-energy conformations of the target protein).
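As a rough illustration of a scoring function with both pairwise and solvation terms, the following sketch scores one sequence-structure alignment. The energy values, the hydrophobic residue set, and all function names are assumptions made for this example; a real method would use potentials derived from statistics on known structures.

```python
# Toy residue classification (an assumption for illustration only).
HYDROPHOBIC = set("AVLIMFWC")

def pair_energy(a, b):
    """Toy pairwise term: contacts between two hydrophobic residues are favourable."""
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return -1.0
    if (a in HYDROPHOBIC) != (b in HYDROPHOBIC):
        return 0.5
    return 0.0

def solvation_energy(a, buried):
    """Toy solvation term: hydrophobic residues prefer buried positions,
    all other residues prefer exposed positions."""
    return -0.5 if (a in HYDROPHOBIC) == buried else 0.5

def threading_score(sequence, alignment, contacts, buried):
    """Score one threading of `sequence` onto a template (lower is better).

    alignment[i] is the template position assigned to residue i, or None.
    contacts is a list of template position pairs that are close in space.
    buried[p] is True if template position p lies in the protein core.
    """
    placed = {pos: sequence[i] for i, pos in enumerate(alignment) if pos is not None}
    energy = sum(solvation_energy(res, buried[pos]) for pos, res in placed.items())
    for p, q in contacts:
        if p in placed and q in placed:
            energy += pair_energy(placed[p], placed[q])
    return energy

# Hypothetical four-residue template: positions 0 and 3 are buried and in contact.
buried = [True, False, False, True]
contacts = [(0, 3)]
good = threading_score("LKDV", [0, 1, 2, 3], contacts, buried)
bad = threading_score("KLVD", [0, 1, 2, 3], contacts, buried)
```

In this toy setting the first sequence places hydrophobic residues in the buried, contacting positions and so receives the lower (better) score.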
Fold recognition methods can be broadly divided into two types:
(1) methods that derive a 1-D profile for each structure in the fold
library and align the target sequence to these profiles; and (2) methods
that consider the full 3-D structure of the protein template. A simple
example of a profile representation would be to take each amino acid in
the structure and simply label it according to whether it is buried in
the core of the protein or exposed on the surface. More elaborate
profiles might take into account the local secondary structure (e.g.
whether the amino acid is part of an alpha helix)
or even evolutionary information (how conserved the amino acid is). In
the 3-D representation, the structure is modelled as a set of
inter-atomic distances, i.e. the distances are calculated between some
or all of the atom pairs in the structure. This is a much richer and
far more flexible description of the structure, but is much harder to
use in calculating an alignment. The profile-based fold recognition
approach was first described by Bowie, Lüthy and Eisenberg in 1991. The
term threading was first coined by Jones, Taylor and Thornton
in 1992, and originally referred specifically to the use of a full 3-D
atomic representation of the protein template structure in fold
recognition. Today, the terms threading and fold recognition are
frequently (though somewhat incorrectly) used interchangeably.
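The two template representations described above can be sketched as follows. This is a minimal illustration only: it assumes one coordinate per residue and uses a simple neighbour count as a stand-in for burial. The radius, threshold, and function names are assumptions, not part of any published method.

```python
import math

def burial_profile(coords, radius=8.0, min_neighbours=3):
    """1-D representation: label each residue 'B' (buried) or 'E' (exposed).

    Burial is approximated by counting other residues within `radius`;
    both parameters are illustrative assumptions.
    """
    labels = []
    for i, ci in enumerate(coords):
        neighbours = sum(
            1 for j, cj in enumerate(coords)
            if j != i and math.dist(ci, cj) <= radius
        )
        labels.append("B" if neighbours >= min_neighbours else "E")
    return "".join(labels)

def distance_matrix(coords):
    """3-D representation: all pairwise inter-residue distances."""
    return [[math.dist(a, b) for b in coords] for a in coords]

# Hypothetical coordinates: four tightly packed residues and one distant one.
coords = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (50, 0, 0)]
profile = burial_profile(coords)
distances = distance_matrix(coords)
```

The profile is a string of the same length as the template, which is what makes it easy to align against a sequence, whereas the distance matrix keeps the pairwise geometry at the cost of a harder alignment problem.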
Fold recognition methods are widely used and effective because there is
believed to be a strictly limited number of different protein
folds in nature, mostly as a result of evolution but also due to
constraints imposed by the basic physics and chemistry of polypeptide
chains. There is, therefore, a good chance (currently 70-80%) that a
protein which has a similar fold to the target protein has already been
studied by X-ray crystallography or NMR spectroscopy and can be found
in the PDB (Protein Data Bank). Currently there are just over 1100 different protein folds known (see CATH database statistics for latest view), but new folds are still being discovered every year thanks in part to the ongoing structural genomics projects.
Many different algorithms have been proposed for finding the correct
threading of a sequence onto a structure, though many make use of dynamic programming in some form. For full 3-D threading, the problem of identifying the best alignment is very difficult (it is an NP-hard problem) and researchers have made use of many combinatorial optimization methods such as simulated annealing or branch and bound searching to arrive at heuristic solutions.
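For the simpler profile-based case, dynamic programming finds an optimal alignment because each position is scored independently of the others. The sketch below aligns a sequence to a buried/exposed profile using a global alignment with a linear gap penalty. The scoring values, gap penalty, and names are assumptions for illustration. It does not handle the pairwise terms that make full 3-D threading hard.

```python
# Toy residue classification (an assumption for illustration only).
HYDROPHOBIC = set("AVLIMFWC")

def match_score(residue, env):
    """+1 if the residue type suits the environment ('B' buried, 'E' exposed), else -1."""
    return 1 if (residue in HYDROPHOBIC) == (env == "B") else -1

def align_to_profile(sequence, profile, gap=-2):
    """Global alignment of a sequence to a 1-D profile by dynamic programming.

    Returns the best score and a list of (sequence index, profile index) pairs.
    """
    n, m = len(sequence), len(profile)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + match_score(sequence[i - 1], profile[j - 1]),
                score[i - 1][j] + gap,   # sequence residue left unaligned
                score[i][j - 1] + gap,   # profile position left unfilled
            )
    # Trace back from the bottom-right cell to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        diagonal = score[i - 1][j - 1] + match_score(sequence[i - 1], profile[j - 1])
        if score[i][j] == diagonal:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs

full_score, full_pairs = align_to_profile("LKDV", "BEEB")
short_score, short_pairs = align_to_profile("LV", "BEEB")
```

In the second call the two hydrophobic residues are placed on the two buried positions, with the exposed positions in between left as gaps.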
It is instructive to compare threading methods with methods that attempt to align two protein structures (protein structural alignment); indeed, many of the same algorithms have been applied to both problems.
References
JU. Bowie, R. Lüthy, D. Eisenberg (1991) A method to identify
protein sequences that fold into a known three-dimensional structure.
Science. 253:164-170.
DT. Jones, WR. Taylor, JM. Thornton (1992) A new approach to protein fold recognition. Nature. 358:86-89.
RH. Lathrop (1994) The protein threading problem with sequence amino
acid interaction preferences is NP-complete. Protein Eng. 7:1059-1068.
DT. Jones, C. Hadley (2000) Threading methods for protein structure
prediction. (In) Bioinformatics: Sequence, structure and databanks.
Higgins, D. & Taylor, W.R. Eds., pp. 1-13, Springer-Verlag,
Heidelberg.
This article is licensed under the GNU Free Documentation License. It uses material from Wikipedia Encyclopedia article "Protein Threading"