CATH is an online tool for biological sequence analysis.
Size: 1.2 MB
Language: en
Added: Nov 02, 2017
Slides: 13 pages
Slide Content
S.Ramya I M.Sc Microbiology CATH
Introduction The CATH database provides a hierarchical classification of protein domains. Domains are obtained from protein structures deposited in the Protein Data Bank. Both domain identification and the subsequent classification use manual as well as automated procedures.
Data Accessibility CATH can be seen as more than a resource for looking up information about individual domains.
Database Construction The data in CATH are obtained from PDB files deposited in the Protein Data Bank. Only structures determined at a resolution of 4 Å or better are included. Furthermore, CATH requires domains to be at least 40 residues long, with side chains resolved for 70% or more of the residues.
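The inclusion criteria on this slide can be written as a simple filter. The following is a minimal sketch, not CATH's actual pipeline: the `DomainCandidate` record, its field names, and the `passes_criteria` function are hypothetical and exist only to illustrate the three thresholds (4 Å, 40 residues, 70% side chains).

```python
from dataclasses import dataclass

# Thresholds taken from the slide text.
MAX_RESOLUTION_ANGSTROM = 4.0   # "4 Å or better" means a value <= 4.0
MIN_RESIDUES = 40
MIN_SIDE_CHAIN_PERCENT = 70


@dataclass
class DomainCandidate:
    """Hypothetical record describing one candidate domain."""
    name: str
    resolution: float          # in Å; lower is better
    n_residues: int
    n_with_side_chains: int    # residues whose side chains are present


def passes_criteria(d: DomainCandidate) -> bool:
    """Return True if the candidate meets all three thresholds."""
    if d.resolution > MAX_RESOLUTION_ANGSTROM:
        return False
    if d.n_residues < MIN_RESIDUES:
        return False
    # Integer arithmetic avoids floating-point rounding at exactly 70%.
    return d.n_with_side_chains * 100 >= MIN_SIDE_CHAIN_PERCENT * d.n_residues


candidates = [
    DomainCandidate("good", 2.1, 120, 110),
    DomainCandidate("too_short", 1.8, 35, 35),
    DomainCandidate("low_resolution", 4.5, 200, 200),
]
accepted = [c.name for c in candidates if passes_criteria(c)]
print(accepted)
```

Only the first candidate passes: the second is shorter than 40 residues and the third has a resolution worse than 4 Å.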
Two main steps are involved in adding new structures to CATH. First, submitted protein chains are chopped to obtain the domains. Second, classifications are assigned to the resulting domains.
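The two steps can be sketched in code. This is an illustration only: it assumes the domain boundaries are already known, whereas in CATH they come from automated and manual procedures. The functions `chop_chain` and `classify_domains`, the chain identifier `1abcA`, the sequence, and the placeholder classifier are all hypothetical.

```python
def chop_chain(chain_id, residues, boundaries):
    """Step 1: split a chain into domains.

    `boundaries` is a list of (start, end) residue positions,
    1-based and inclusive, one pair per domain.
    """
    domains = []
    for number, (start, end) in enumerate(boundaries, start=1):
        domains.append({
            # Chain identifier followed by a two-digit domain number.
            "domain_id": f"{chain_id}{number:02d}",
            "residues": residues[start - 1:end],
        })
    return domains


def classify_domains(domains, assign):
    """Step 2: assign a classification to every domain.

    `assign` is any callable that maps a domain to a label.
    """
    return {d["domain_id"]: assign(d) for d in domains}


# Hypothetical chain of 10 residues split into two domains.
domains = chop_chain("1abcA", "ACDEFGHIKL", [(1, 4), (5, 10)])

# Placeholder classifier: labels a domain by its length only.
labels = classify_domains(domains, lambda d: f"length-{len(d['residues'])}")
print(labels)
```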
Structural classification and comparison Class: the highest level, which places the selected protein into 1 of 4 categories according to its secondary structure content. Architecture: a description of the gross arrangement of secondary structures, independent of topology. Topology: an indication of the overall shape and connectivity of the protein's secondary structures. Homologous superfamily: proteins of known structure that are homologous (share a common ancestor) to the selected protein.
CATH Numbering Scheme for representative structures from the Globin-like fold
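A CATH number is a dot-separated code with one number per level of the hierarchy (Class, Architecture, Topology, Homologous superfamily). The following is a minimal sketch of how such a code can be split into its levels. The function `parse_cath_code` is hypothetical, the class names are the ones CATH publishes rather than anything stated on the slides, and the code `1.10.490.10` is used here as an illustrative globin example.

```python
LEVELS = ("class", "architecture", "topology", "homologous_superfamily")

# Names of the four classes as published by CATH (not from the slides).
CLASS_NAMES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}


def parse_cath_code(code):
    """Split a four-part CATH code into cumulative identifiers per level."""
    parts = code.split(".")
    if len(parts) != len(LEVELS) or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    # Each level is identified by its own number plus those of its parents.
    return {
        level: ".".join(parts[: depth + 1])
        for depth, level in enumerate(LEVELS)
    }


parsed = parse_cath_code("1.10.490.10")
for level, identifier in parsed.items():
    print(f"{level}: {identifier}")
print("class name:", CLASS_NAMES[int(parsed["class"])])
```

Because each identifier carries its parents' numbers, two domains share a level exactly when their codes have the same prefix up to that level.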
The latest release of CATH-Gene3D, from July 2016, consists of: 308,999 structural protein domain entries; 53,479,436 non-structural protein domain entries; 2,737 homologous superfamily entries; 92,882 functional family entries.