Bioinformatic_Databases_2.ppt Bioinformatics

MohamedHasan816582 31 views 28 slides Apr 29, 2024

Biological Databases

Some databases in the field of molecular biology…
AATDB, AceDb, ACUTS, ADB, AFDB, AGIS, AMSdb,
ARR, AsDb, BBDB, BCGD, Beanref, BioImage,
BioMagResBank, BIOMDB, BLOCKS, BovGBASE,
BOVMAP, BSORF, BTKbase, CANSITE, CarbBank,
CARBHYD, CATH, CAZY, CCDC, CD40Lbase, CGAP,
ChickGBASE, Colibri, COPE, CottonDB, CSNDB, CUTG,
CyanoBase, dbCFC, dbEST, dbSTS, DDBJ, DGP, DictyDb,
Dicty_cDB, DIP, DOGS, DOMO, DPD, DPInteract, ECDC,
ECGC, ECO2DBASE, EcoCyc, EcoGene, EMBL, EMD db,
ENZYME, EPD, EpoDB, ESTHER, FlyBase, FlyView,
GCRDB, GDB, GENATLAS, GenBank, GeneCards,
Genline, GenLink, GENOTK, GenProtEC, GIFTS,
GPCRDB, GRAP, GRBase, gRNAsdb, GRR, GSDB,
HAEMB, HAMSTERS, HEART-2DPAGE, HEXAdb, HGMD,
HIDB, HIDC, HIVdb, HotMolecBase, HOVERGEN, HPDB,
HSC-2DPAGE, ICN, ICTVDB, IL2RGbase, IMGT, Kabat,
KDNA, KEGG, Klotho, LGIC, MAD, MaizeDb, MDB,
Medline, Mendel, MEROPS, MGDB, MGI, MHCPEP,
Micado, MitoDat, MITOMAP, MJDB, MmtDB, Mol-R-Us,
MPDB, MRR, MutBase, MycDB, NDB, NRSub, O-GlycBase,
OMIA, OMIM, OPD, ORDB, OWL, PAHdb, PatBase, PDB,
PDD, Pfam, PhosphoBase, PigBASE, PIR, PKR, PMD,
PPDB, PRESAGE, PRINTS, ProDom, Prolysis, PROSITE,
PROTOMAP, RatMAP, RDP, REBASE, RGP, SBASE,
SCOP, SeqAnalRef, SGD, SGP, SheepMap, Soybase,
SPAD, SRNA db, SRPDB, STACK, StyGene, Sub2D,
SubtiList, SWISS-2DPAGE, SWISS-3DIMAGE, SWISS-MODEL
Repository, SWISS-PROT, TelDB, TGN, tmRDB,
TOPS, TRANSFAC, TRR, UniGene, URNADB, V BASE,
VDRR, VectorDB, WDCM, WIT, WormPep, YEPD, YPD,
YPM, etc.

What we expect from a database
•Sequence, functional, structural information,
related bibliography
•Well structured and indexed
•Well cross-referenced (with other databases)
•Periodically updated
•Tools for analysis and visualization

Biological Databases
•Sequence databases
•Structure databases

Sequence databases
•Nucleotide databases
•Protein databases

Sequence databases

Nucleotide databases
•International Nucleotide Sequence Database
Collaboration (INSDC)
–NCBI (GenBank)
–EMBL-EBI (European Nucleotide Archive)
–DDBJ

Standard contents of a sequence database
•Sequences
•Accession number
•References
•Taxonomic data
•Annotation/curation
•Keywords
•Cross-references
•Documentation
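
The contents listed above can be pictured as the fields of a single record. The following is only a minimal sketch: the class name, field names and the sample values are hypothetical and do not correspond to the schema of any particular database.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceRecord:
    """Hypothetical container mirroring the standard contents of a sequence entry."""
    accession: str                                        # stable identifier of the entry
    sequence: str                                         # nucleotide or protein sequence
    organism: str = ""                                    # taxonomic data
    references: list = field(default_factory=list)        # related bibliography
    annotations: dict = field(default_factory=dict)       # curated features
    keywords: list = field(default_factory=list)
    cross_references: dict = field(default_factory=dict)  # database name -> identifier there


# Made-up values, for illustration only.
record = SequenceRecord(
    accession="DEMO0001",
    sequence="ATGGCC",
    organism="Homo sapiens",
    keywords=["example"],
    cross_references={"OtherDB": "XYZ123"},
)
```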

NCBI
•Very comprehensive biological database
•GenBank: the nucleotide sequence database
•Provides 42 different resources
•Provides a simple, easy-to-use web
interface
http://www.ncbi.nlm.nih.gov/

•Sequence submission: done using BankIt or
Sequin
•Search Engine for data retrieval: Entrez
•Retrieves information across all the resources
under NCBI
Example: PubMed, taxonomy, SNP, PubChem
etc.
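
Entrez can also be queried programmatically through NCBI's E-utilities web service. As a rough sketch, the snippet below only builds an ESearch URL and does not send any request; the helper name `build_esearch_url` and the example query are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL asking one Entrez database for matching record IDs."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Illustrative query: nucleotide records for a gene name in one organism.
url = build_esearch_url("nucleotide", "PAX6[Gene Name] AND Homo sapiens[Organism]")
print(url)
```

Changing `db` (for example to `pubmed` or `taxonomy`) points the same query mechanism at a different NCBI resource.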

Tools for analysis
•BLAST
•Primer-BLAST
•ORFfinder
•Genome Workbench

Protein Sequence databases
•UniProt
•Pfam
•PROSITE
•Motif scan

UniProt
•Universal Protein Resource
•Maintained by a consortium of SIB, EBI and PIR
•Formed through the merger of:
–Swiss-Prot
–TrEMBL
–PIR-PSD

•Entry names usually combine a protein or gene
mnemonic with a species code, e.g. PAX6_HUMAN.
•Accession numbers are short alphanumeric
identifiers, e.g. P26367 (the accession of
PAX6_HUMAN).
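
The two identifier styles can be told apart mechanically. The sketch below uses the accession-number regular expression given in the UniProt help pages; the function names are hypothetical, and splitting the entry name on the underscore is a simplification.

```python
import re

# Accession pattern as documented in the UniProt help pages.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_uniprot_accession(text: str) -> bool:
    """True if the whole string has the shape of a UniProt accession number."""
    return ACCESSION_RE.fullmatch(text) is not None


def split_entry_name(entry_name: str) -> tuple:
    """Split an entry name such as PAX6_HUMAN into (mnemonic, species code)."""
    mnemonic, _, species = entry_name.partition("_")
    return mnemonic, species


print(is_uniprot_accession("P26367"))      # accession from the slide
print(is_uniprot_accession("PAX6_HUMAN"))  # an entry name, not an accession
print(split_entry_name("PAX6_HUMAN"))
```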

UniProt features
•BLAST
•Align
•Retrieve
•ID mapping

Pfam
•Proteins contain conserved regions
•Based on the conserved regions, proteins are
classified into families
•Provides links to external databases like PDB,
SCOP, CATH etc.

Pfam: Features
•Sequence search
•View Pfam family
•View a clan
•View a sequence
•View a structure
•Keyword search

Gene Indices
•Project aimed at indexing genes and their
variants in the various genome sequences.
•Creating a catalogue of genes in a wide range
of organisms
•Reduces redundancy

Gene Indices Software Tools
•TGI Clustering tools
•Clview
•SeqClean
•Cdbfasta/cdbyank

Structural databases

•PDB –Protein Data Bank
•CATH
•SCOP –Structural Classification of Proteins

wwPDB
•Contains information about experimentally
determined structures of proteins, nucleic
acids, and complex assemblies
•RCSB-PDB, PDBe, PDBj, BMRB –repositories of
protein structure data
•Files in PDB, mmCIF, PDBML/XML formats
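
In the legacy PDB file format each atom occupies one fixed-width text line, so fields are read by column position rather than by splitting on spaces. The following is a minimal sketch of reading one such record; the `Atom` type, the function name and the sample line are made up for illustration, and only a subset of the columns is extracted.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM record using the fixed column positions of the PDB format."""
    if line[0:6].strip() not in ("ATOM", "HETATM"):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


# Made-up sample record, assembled piece by piece so the column widths stay visible.
SAMPLE_LINE = (
    "ATOM  "      # columns 1-6: record name
    "    1"       # 7-11: atom serial number
    " "           # 12: blank
    " N  "        # 13-16: atom name
    " "           # 17: alternate location indicator
    "MET"         # 18-20: residue name
    " "           # 21: blank
    "A"           # 22: chain identifier
    "   1"        # 23-26: residue sequence number
    "    "        # 27-30: insertion code and blanks
    "  27.340"    # 31-38: x coordinate
    "  24.430"    # 39-46: y coordinate
    "   2.614"    # 47-54: z coordinate
)

atom = parse_atom_line(SAMPLE_LINE)
print(atom)
```

For real work, an established parser that also handles mmCIF is the safer choice; this sketch only shows why the column layout matters.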

•Advanced search –provides comprehensive
information about a protein.
•Sequence info, domain info, sequence
similarity, literature, apart from the details of
the structure.
•Cross-referenced to SCOP and CATH

CATH
•Classification of proteins based on domain
structures
•Each protein chopped into individual domains
and assigned into homologous superfamilies.
•Hierarchical domain classification of PDB
entries.

CATH hierarchy
•Class –derived from secondary structure content;
assigned automatically
•Architecture –describes gross orientation of secondary
structures, independent of connectivity
•Topology –clusters structures according to their
topological connections and numbers of secondary
structures
•Homologous superfamily –this level groups
together protein domains which are thought to
share a common ancestor and can therefore be
described as homologous
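
CATH writes these four levels as a dotted numeric code, one number per level. As a small sketch, the code below splits such a code and counts how many levels two domains share; the type and function names are hypothetical, and the codes are used purely as example strings.

```python
from typing import NamedTuple

LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


class CathId(NamedTuple):
    cath_class: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_id(code: str) -> CathId:
    """Split a dotted C.A.T.H code into its four hierarchy levels."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated numbers")
    return CathId(*(int(p) for p in parts))


def shared_depth(code_a: str, code_b: str) -> int:
    """Count how many levels, from Class downwards, two codes have in common."""
    depth = 0
    for a, b in zip(parse_cath_id(code_a), parse_cath_id(code_b)):
        if a != b:
            break
        depth += 1
    return depth


depth = shared_depth("3.40.50.300", "3.40.50.720")
print(parse_cath_id("3.40.50.300"))
print(depth, "levels shared, down to", LEVELS[depth - 1])
```

Two domains that agree on all four numbers fall in the same homologous superfamily; agreement only on the first three means they share a topology but are not grouped as homologous.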

SCOP
•Description of structural and evolutionary
relationships between all the proteins with
known structures
•Uses the PDB entries
•Search using keywords or PDB identifiers

Hierarchy in SCOP
•Class
•Fold
•Superfamily
•Family
•Species

Thank you