Protein structure prediction and classification.pptx
DrVardhanaJ
20 slides
Aug 12, 2024
About This Presentation
Protein structure prediction is the prediction of the three-dimensional structure of a protein from its amino acid sequence — that is, the prediction of its folding and its secondary, tertiary, and quaternary structure from its primary structure
Slide Content
Protein Structure Prediction and Classification
Bioinformatics in Support of Proteomic Research
Dr. J. Vardhana, Assistant Professor, Dept. of Biotechnology, VISTAS
Introduction
Protein structure prediction is the prediction of the three-dimensional structure of a protein from its amino acid sequence, that is, the prediction of its folding and of its secondary, tertiary, and quaternary structure from its primary structure. Structure prediction is fundamentally different from the inverse problem of protein design, which seeks a sequence that will fold into a given structure. Protein structure prediction is one of the most important goals pursued by bioinformatics and theoretical chemistry; it is highly important in medicine (for example, in drug design) and biotechnology (for example, in the design of novel enzymes).
Methods
There are three major theoretical methods for predicting the structure of proteins: homology modelling (also called comparative modelling), fold recognition (also called threading), and ab initio prediction.
Comparative Modelling
Comparative modelling exploits the fact that evolutionarily related proteins with similar sequences have similar structures. Sequence similarity is measured here as the percentage of identical residues at each position, based on an optimal structural superposition. The similarity of structures is very high in the so-called "core regions", which typically consist of a framework of secondary structure elements such as alpha-helices and beta-sheets. Loop regions connect these secondary structures and generally vary even between homologous structures with a high degree of sequence similarity.
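As a rough illustration of the identity measure mentioned above, the sketch below computes percent identity from two sequences that have already been aligned. The function name `percent_identity`, the example sequences, and the choice of denominator (columns where both sequences have a residue) are assumptions made for this example; several denominator conventions are in use, and the alignment itself would come from a separate tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned (non-gap) columns whose residues are identical.

    Assumes both inputs are rows of the same alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Only columns where both sequences carry a residue count as aligned positions.
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned target and template sequences.
target = "MKT-AYIAKQR"
template = "MKTLAYLAK-R"
print(f"{percent_identity(target, template):.1f}% identity")
```

A value well above the thresholds usually quoted for homology modelling suggests the template is a reasonable starting point for the core regions; loop regions may still differ.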
Fold Recognition or "Threading"
Threading uses a database of known three-dimensional structures to match sequences of unknown structure with protein folds. This is accomplished with the aid of a scoring function that assesses the fit of a sequence to a given fold. These functions are usually derived from a database of known structures and generally include pairwise atom contact terms and solvation terms. Threading methods compare a target sequence against a library of structural templates, producing a list of scores.
Fold Recognition or "Threading" (continued)
The scores are then ranked, and the fold with the best score is assumed to be the one adopted by the sequence. The methods used to fit a sequence against a library of folds can be computationally very elaborate, such as those involving double dynamic programming, dynamic programming with a frozen approximation, Gibbs sampling using a database of "threading" cores, and branch-and-bound heuristics, or as "simple" as sophisticated sequence alignment methods such as hidden Markov models.
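The score-then-rank idea can be sketched with a deliberately simplified toy. Everything here is an assumption for illustration: each template is reduced to a string of environments ('B' buried, 'E' exposed), the scoring function only rewards hydrophobic residues in buried positions and polar residues in exposed ones, and the fit is gapless. Real threading scores use pairwise contact and solvation terms and far more elaborate alignment.

```python
HYDROPHOBIC = set("AVLIMFWC")  # illustrative hydrophobic residue set


def position_score(residue: str, environment: str) -> int:
    """+1 when the residue type suits the environment, otherwise -1."""
    buried_ok = residue in HYDROPHOBIC and environment == "B"
    exposed_ok = residue not in HYDROPHOBIC and environment == "E"
    return 1 if (buried_ok or exposed_ok) else -1


def thread_score(sequence: str, environments: str) -> int:
    """Best gapless fit of the sequence onto the template over all offsets."""
    n, m = len(sequence), len(environments)
    if n > m:
        return -n  # sequence does not fit; give the worst possible score
    return max(
        sum(position_score(r, e) for r, e in zip(sequence, environments[off:off + n]))
        for off in range(m - n + 1)
    )


def rank_folds(sequence: str, library: dict) -> list:
    """Score the sequence against every template and sort best-first."""
    scores = {name: thread_score(sequence, env) for name, env in library.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical fold library: one environment string per template.
library = {
    "fold_A": "BEEBBEEBBE",
    "fold_B": "EEEEBBBBEE",
    "fold_C": "BBBBBEEEEE",
}
ranking = rank_folds("LKELLKKLLE", library)
for name, score in ranking:
    print(name, score)
```

The top-ranked entry plays the role of the fold assumed to be adopted by the sequence.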
Ab Initio Prediction
The ab initio approach is a mixture of science and engineering. The science is in understanding how the three-dimensional structure of proteins is attained. The engineering is in deducing the three-dimensional structure given the sequence.
Ab Initio Prediction (continued)
The biggest challenge in the folding problem is ab initio prediction, which can be broken down into two components: devising a scoring function that can distinguish correct (native or native-like) structures from incorrect (non-native) ones, and devising a search method to explore the conformational space. In many ab initio methods, the two components are coupled, such that the search function drives, and is driven by, the scoring function to find native-like structures.
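The two components can be illustrated on a toy problem. The sketch below assumes the two-dimensional HP lattice model (residues are either hydrophobic 'H' or polar 'P'), which is not part of the slides: the scoring function counts contacts between non-bonded H residues, and the search method is a plain exhaustive enumeration of self-avoiding walks. Exhaustive enumeration stands in here for the guided searches (for example Monte Carlo) that practical methods rely on, and it is only feasible for very short chains.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def energy(sequence: str, coords: list) -> int:
    """Scoring function: -1 per pair of H residues that are lattice
    neighbours but not adjacent in the chain."""
    index_at = {pos: i for i, pos in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 skips bonded neighbours and counts each pair once.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return -contacts


def fold(sequence: str):
    """Search method: enumerate every self-avoiding walk, keep the lowest energy."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    best = {"energy": 1, "coords": None}  # any real energy is <= 0

    def extend(coords, occupied):
        if len(coords) == len(sequence):
            e = energy(sequence, coords)
            if e < best["energy"]:
                best["energy"], best["coords"] = e, list(coords)
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                coords.append(nxt)
                occupied.add(nxt)
                extend(coords, occupied)
                occupied.remove(nxt)
                coords.pop()

    extend([(0, 0)], {(0, 0)})
    return best["energy"], best["coords"]


best_energy, best_coords = fold("HPPHPH")  # hypothetical six-residue chain
print(best_energy, best_coords)
```

The number of walks grows exponentially with chain length, which is why realistic methods let the scoring function steer the search rather than enumerating everything.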
Protein Structure Prediction Software
A great many structure prediction programs have been developed for particular protein features, such as disorder prediction, dynamics prediction, and structure conservation prediction. Approaches include homology modeling, protein threading, ab initio methods, secondary structure prediction, and transmembrane helix and signal peptide prediction. Choosing the right method begins with using the primary sequence of the unknown protein to search a protein database.
Secondary Structure Prediction Tools
These tools predict local secondary structures based only on the amino acid sequence of the protein. Predicted structures are then compared with the DSSP assignment, which is calculated from the crystallographic structure of the protein. Prediction methods for secondary structure rely mainly on databases of known protein structures and on modern machine learning methods such as neural networks and support vector machines.
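One common way to make the comparison with DSSP concrete is a three-state accuracy, often called Q3: the percentage of residues whose predicted state (helix, strand, coil) matches the DSSP-derived state. The sketch below assumes one widely used reduction of DSSP codes to three states (H, G, I to helix; E, B to strand; everything else to coil); other reductions exist, and the example strings are made up.

```python
# One common 8-to-3 state reduction of DSSP codes (an assumed convention).
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(dssp: str) -> str:
    """Map DSSP codes to helix (H), strand (E), or coil (C)."""
    return "".join(DSSP_TO_3.get(code, "C") for code in dssp)


def q3(predicted: str, dssp: str) -> float:
    """Percentage of residues whose predicted 3-state label matches
    the reduced DSSP label at the same position."""
    if len(predicted) != len(dssp):
        raise ValueError("prediction and DSSP string must have the same length")
    if not predicted:
        raise ValueError("empty input")
    observed = reduce_dssp(dssp)
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(predicted)


# Hypothetical DSSP string and prediction for a 16-residue fragment.
dssp_string = "-HHHHGGG-EEEETT-"
predicted = "CHHHHHHCCEEECCCC"
print(f"Q3 = {q3(predicted, dssp_string):.1f}%")
```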
Tertiary Structure
All the information about a protein's tertiary structure is encoded in its primary structure (that is, its amino acid sequence). However, an enormous number of conformations are possible for a given sequence, among which only one has the minimal free energy and stability required for the protein to be properly folded. Ab initio protein structure prediction thus requires vast amounts of computational power and time to find the native conformation of a protein, and remains one of the top challenges for modern science. Popular servers include Robetta (using the Rosetta software package), SWISS-MODEL, PEPstr, and QUARK.
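A back-of-the-envelope count, in the spirit of Levinthal's paradox, shows why the number of conformations is described as enormous. The figures used below are illustrative assumptions only: three backbone states per residue and a sampling rate of 10^12 conformations per second.

```python
def conformation_count(n_residues: int, states_per_residue: int = 3) -> int:
    """Number of conformations if every residue independently adopts one of
    `states_per_residue` states (an illustrative simplification)."""
    return states_per_residue ** n_residues


def exhaustive_search_years(
    n_residues: int,
    states_per_residue: int = 3,
    rate_per_second: float = 1e12,
) -> float:
    """Years needed to visit every conformation once at the assumed rate."""
    seconds = conformation_count(n_residues, states_per_residue) / rate_per_second
    return seconds / (365.25 * 24 * 3600)


n = 100  # hypothetical chain length
print(f"{conformation_count(n):.2e} conformations")
print(f"{exhaustive_search_years(n):.2e} years to enumerate")
```

Even under these generous assumptions, enumeration is out of reach for a 100-residue chain, which is why prediction methods rely on scoring functions and guided searches.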
Tertiary Structure (continued)
If a protein of known tertiary structure shares at least 30% sequence identity with a potential homolog of undetermined structure, comparative methods that overlay the putative unknown structure on the known one can be used to predict the likely structure of the unknown. Homology modeling and protein threading are the two main strategies that use prior information on similar proteins to propose a structure for an unknown protein based on its sequence. Homology modeling and protein threading software includes RaptorX, FoldX, HHpred, I-TASSER, and more.
Conclusion
Proteomics methods are essential for studying protein expression, activity, regulation, and modifications. Bioinformatics is an integral part of proteomics research. Recent developments and applications in proteomics include mass spectrometry data analysis and interpretation, analysis and storage of gel images in databases, gel comparison, and advanced methods for studying, for example, protein co-expression, protein–protein interactions, and metabolic and cellular pathways. The significance of informatics in proteomics will gradually increase with the advent of high-throughput methods that rely on powerful data analysis.