Protein structure predictions.
Today, the sequences of many complete genomes, ranging from microbes to human, are known. To exploit this potential, genome sequence information must be decoded in terms of the molecular and cellular functions of the gene products. For many applications, in particular the development of new drugs, information on function must be complemented by knowledge of protein tertiary structure. Pcons and Pmembr have proven to perform very well at detecting distantly related globular and membrane proteins, respectively. Pcons is a fully automated consensus (or meta) prediction server for fold recognition. Recent CASP evaluations have shown that Pcons ranks among the top automatic predictors.
Large multiple sequence alignments can also be statistically analyzed to reveal coupled mutations within a protein. These evolutionary couplings indicate spatial proximity of residue pairs and can be used as constraints during the protein folding process. PconsC is a meta-predictor for residue contacts. In its latest version, 2.0, it improves on the accuracy of single prediction methods by 30%. Our most recent development, PconsE, combines PconsC 2.0 contact prediction with Rosetta ab initio folding. Given just the amino acid sequence as input, PconsE is able to generate native-like protein structures.
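To illustrate what "coupled mutations" means in practice, the following is a minimal sketch that scores pairs of alignment columns by mutual information. This is a deliberately simple stand-in chosen for illustration and is not the method used by PconsC; the function names and the toy alignment are hypothetical, and real analyses need large alignments plus corrections for phylogenetic and sampling bias.

```python
import math
from collections import Counter


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences (hypothetical input format).
    """
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def top_coupled_pairs(msa, min_separation=1, top=3):
    """Return the `top` column pairs (i, j, score) with the highest mutual information."""
    length = len(msa[0])
    scores = [
        (mutual_information(msa, i, j), i, j)
        for i in range(length)
        for j in range(i + min_separation, length)
    ]
    scores.sort(key=lambda t: t[0], reverse=True)
    return [(i, j, score) for score, i, j in scores[:top]]


# Toy alignment (made up): columns 1 and 3 always change together (K/E vs. R/D),
# while columns 0 and 2 are fully conserved and carry no coupling signal.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
]

print(top_coupled_pairs(toy_msa, top=1))
```

In this toy case the highest-scoring pair is columns 1 and 3, which mutate together. In a contact-prediction setting, high-scoring pairs of this kind would be treated as candidate residue contacts and passed on as distance constraints.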
We have identified the importance of structural features, including interface helices, and have developed improved predictors.
Our work on the evolution of multi-domain proteins has shown that they evolve by the addition of a single domain at the N- or C-terminus. However, proteins with long repeats show a different evolutionary pattern.