Role of bioinformatics in life sciences research

AnshikaBansal4 4,951 views 39 slides May 26, 2018

About This Presentation

BIOINFORMATICS AS NEW BIOLOGICAL RESEARCH AREA


Slide Content

1
BY-
ANSHIKA BANSAL
135019

4 Feb 2016 Dept. of Zoology 2

Bioinformatics is the field of science in which
biology and computer science merge to solve
biological problems
Application of computational tools to molecular
data
3

4
Biologists
collect molecular data:
DNA & protein sequences,
gene expression, etc.
Computer scientists
(+ mathematicians, statisticians, etc.)
develop tools, software, and algorithms to
store and analyze the data.
Bioinformaticians
study biological questions by analyzing
molecular data.

5
Interest in the field was propelled by the need to create sequence databases.

Bioinformatics makes it possible to analyze large
quantities of complex biological data. It can be used to:
- search biological databases
- compare sequences
- estimate molecular structures
6

7

Gray wolf
Jack Russell
Toy Poodle
Labradoodle
Coyote
English Shepherd
Cocker Spaniel
Red fox
Image Source: Wikipedia Commons

Genetic Data: …TTCACCAACAGGCCCACA…
1. Obtain samples: blood, saliva, hair follicles, feathers, scales.
2. Extract DNA from cells.
3. Sequence DNA.
4. Compare DNA sequences to one another:
TTCAACAACAGGCCCAC
TTCACCAACAGGCCCAC
TTCATCAACAGGCCCAC
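The comparison step can be illustrated with a minimal sketch that reports where the three sequences above differ. The sample names are hypothetical labels, and the approach assumes the sequences are already aligned and of equal length, which holds for this example but not for real data in general.

```python
from itertools import combinations

# The three sequences shown above; the sample names are hypothetical labels.
sequences = {
    "sample_1": "TTCAACAACAGGCCCAC",
    "sample_2": "TTCACCAACAGGCCCAC",
    "sample_3": "TTCATCAACAGGCCCAC",
}


def mismatches(a, b):
    """Return the 0-based positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Compare every pair of samples and report the differing positions.
for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
    diff = mismatches(seq_a, seq_b)
    print(name_a, "vs", name_b, "-", len(diff), "difference(s) at", diff)
```

Sequences of unequal length, or with insertions and deletions, would need a proper alignment before a position-by-position comparison like this makes sense.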

11

12

20/03/2008 Dept. of Pharmaceutics 13

14
IBM 7090 computer
In the 1960s: the birth of bioinformatics
Margaret Oakley Dayhoff created:
The first protein database
The first program for sequence assembly
There is a need for computers and algorithms that allow:
Access, processing, storing, sharing, retrieving, visualizing, annotating…

15
Timeline figure, 1955 to 1985:
Watson and Crick DNA model; Sanger sequences insulin protein; Sanger dideoxy DNA sequencing; PCR (Polymerase Chain Reaction)
ARPANET (early Internet); PDB (Protein Data Bank); sequence alignment; GenBank database; Dayhoff’s Atlas

16

Exponential growth in biological data.
Data (genomic sequences, 3D structures, 2D gel
analysis, Microarrays….) are no longer published
in a conventional manner, but directly submitted
to databases.
Essential tools for biological research. The only
way to publish massive amounts of data without
using all the paper in the world.
17

18

19
Timeline figure, 1990 to 2000:
SWISS-PROT database; NCBI; World Wide Web; BLAST; EBI; Human Genome Initiative; first human genome draft; first bacterial genome; yeast genome

Some databases in the field of molecular biology…
AATDB, AceDb, ACUTS, ADB, AFDB, AGIS, AMSdb,
ARR, AsDb, BBDB, BCGD, Beanref, BioImage,
BioMagResBank, BIOMDB, BLOCKS, BovGBASE,
BOVMAP, BSORF, BTKbase, CANSITE, CarbBank,
CARBHYD, CATH, CAZY, CCDC, CD40Lbase, CGAP,
ChickGBASE, Colibri, COPE, CottonDB, CSNDB, CUTG,
CyanoBase, dbCFC, dbEST, dbSTS, DDBJ, DGP, DictyDb,
Dicty_cDB, DIP, DOGS, DOMO, DPD, DPInteract, ECDC,
ECGC, ECO2DBASE, EcoCyc, EcoGene, EMBL, EMD db,
ENZYME, EPD, EpoDB, ESTHER, FlyBase, FlyView,
GCRDB, GDB, GENATLAS, GenBank, GeneCards,
Genline, GenLink, GENOTK, GenProtEC, GIFTS,
GPCRDB, GRAP, GRBase, gRNAsdb, GRR, GSDB,
HAEMB, HAMSTERS, HEART-2DPAGE, HEXAdb, HGMD,
HIDB, HIDC, HIVdb, HotMolecBase, HOVERGEN, HPDB,
HSC-2DPAGE, ICN, ICTVDB, IL2RGbase, IMGT, Kabat,
KDNA, KEGG, Klotho, LGIC, MAD, MaizeDb, MDB,
Medline, Mendel, MEROPS, MGDB, MGI, MHCPEP,
Micado, MitoDat, MITOMAP, MJDB, MmtDB, Mol-R-Us,
MPDB, MRR, MutBase, MycDB, NDB, NRSub, O-GlycBase,
OMIA, OMIM, OPD, ORDB, OWL, PAHdb, PatBase, PDB,
PDD, Pfam, PhosphoBase, PigBASE, PIR, PKR, PMD,
PPDB, PRESAGE, PRINTS, ProDom, Prolysis, PROSITE,
PROTOMAP, RatMAP, RDP, REBASE, RGP, SBASE,
SCOP, SeqAnalRef, SGD, SGP, SheepMap, Soybase,
SPAD, SRNA db, SRPDB, STACK, StyGene, Sub2D,
SubtiList, SWISS-2DPAGE, SWISS-3DIMAGE,
SWISS-MODEL Repository, SWISS-PROT, TelDB, TGN, tmRDB,
TOPS, TRANSFAC, TRR, UniGene, URNADB, V BASE,
VDRR, VectorDB, WDCM, WIT, WormPep, YEPD, YPD,
YPM, etc.
20

21
NCBI:
http://www.ncbi.nlm.nih.gov
EBI:
http://www.ebi.ac.uk/
DDBJ:
http://www.ddbj.nig.ac.jp/


23

24

Sequences (DNA, protein)
Genomics
Mutation/polymorphism
Protein domain/family
Proteomics (2D gel, Mass Spectrometry)
3D structure
Metabolic networks
Regulatory networks
Bibliography
Expression (Microarrays,…)
Specialized
25

26
Literature Databases:
Bookshelf: A collection of searchable biomedical books linked to
PubMed.
PubMed: Allows searching by author names and journal titles, and
offers a Preview/Index option. The PubMed database provides access to
over 12 million MEDLINE citations back to the mid-1960s. It
includes History and Clipboard options which may enhance your
search session.
PubMed Central: The U.S. National Library of Medicine digital
archive of life science journal literature.
OMIM: Online Mendelian Inheritance in Man is a database of
human genes and genetic disorders (also OMIA).

27
DNA (nucleotide sequences) databases
These are large databases, and searching any one of them should produce
similar results because they exchange information routinely.
- GenBank (NCBI): http://www.ncbi.nlm.nih.gov
- DDBJ (DNA Data Bank of Japan): http://www.ddbj.nig.ac.jp
- Yeast: http://yeastgenome.org
Specialized databases: tissues, species…
- ESTs (Expressed Sequence Tags)
  ~ at NCBI: http://www.ncbi.nlm.nih.gov/dbEST
  ~ at TIGR: http://tigr.org/tdb/tgi
- ...many more!

28
Protein (amino acid) databases
They are big databases too:
- Swiss-Prot (very high level of annotation)
  http://au.expasy.org/
- PIR (Protein Information Resource): the world's most
  comprehensive catalog of information on proteins
  http://www.pir.uniprot.org/
Translated databases:
- TrEMBL (translated EMBL): includes entries that have
  not yet been annotated into Swiss-Prot.
  http://www.ebi.ac.uk/trembl/access.html
- GenPept (translation of coding regions in GenBank)
- PDB (sequences derived from the 3D structures in the
  Brookhaven PDB) http://www.rcsb.org/pdb/

BLAST: Basic Local Alignment Search Tool
CLUSTALW: alignment tool
GRAIL: Gene Recognition and Analysis Internet Link
Primer3: primer design tool
29

30

1. Literature Retrieval
2. Biological Application
   Nucleotide Applications
     Information Retrieval
     Sequence Analysis
     Sequence Translation
   Protein Applications
     Information Retrieval
     Protein Analysis
     Structure Analysis
31

1. Literature Retrieval
Use PubMed to search journals and other literature on any
biological or chemical item of interest.  Full articles are
not provided in this database, only citations and
abstracts are available to view.
2. Biological Application
Once biological data is stored consistently and is easily
available to the scientific community, the requirement
is then to provide methods for extracting the
meaningful information from the mass of data.
32

3. Nucleotide Applications
Information Retrieval
There are numerous databases around the world containing
information useful for computational biologists.  The main
ones are: the National Center for Biotechnology Information
(NCBI), the European Bioinformatics Institute (EBI), and the
DNA Data Bank of Japan (DDBJ).
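As a hedged illustration of programmatic information retrieval, the sketch below builds a request URL for NCBI's E-utilities `efetch` endpoint without sending it. The helper name `efetch_fasta_url` is hypothetical. The accession NC_000962.3 (the Mycobacterium tuberculosis H37Rv reference genome) and the coordinates, which match the rpoB example in this presentation, are used purely for illustration.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, start=None, stop=None):
    """Build (but do not send) an efetch URL for a nucleotide FASTA record.

    This helper is a hypothetical convenience wrapper; only the query
    parameters themselves belong to the E-utilities interface.
    """
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    # Optional 1-based coordinates restrict the record to a subsequence.
    if start is not None and stop is not None:
        params["seq_start"] = start
        params["seq_stop"] = stop
    return EFETCH + "?" + urlencode(params)


url = efetch_fasta_url("NC_000962.3", start=759807, stop=763325)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client. NCBI asks heavy users to respect its rate limits, so automated scripts should throttle their requests.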
33

Sequence Retrieval
Find the nucleotide sequence for a gene of interest,
e.g. the Mtb rpoB gene:
>gi|448814763:759807-763325 Mycobacterium tuberculosis H37Rv, complete genome
TTGGCAGATTCCCGCCAGAGCAAAACAGCCGCTAGTCCTAGTCCGAGTCGCCCGCAAAGTTCCTCGAATA
ACTCCGTACCCGGAGCGCCAAACCGGGTCTCCTTCGCTAAGCTGCGCGAACCACTTGA…
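Retrieved sequences usually arrive in FASTA format, as in the record above: a header line starting with `>` followed by one or more lines of sequence. A minimal parser sketch follows. The function name `parse_fasta` is an assumption for illustration, and the example text contains only the first 70 bases shown on the slide, not the full gene.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Only the start of the record shown on the slide; the rest is elided there.
example = """>gi|448814763:759807-763325 Mycobacterium tuberculosis H37Rv, complete genome
TTGGCAGATTCCCGCCAGAGCAAAACAGCCGCTAGTCCTAG
TCCGAGTCGCCCGCAAAGTTCCTCGAATA
"""

for header, sequence in parse_fasta(example):
    print(header)
    print(len(sequence), "bases, starting with", sequence[:12])
```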
Sequence Identification
Find the function and possible origin of a gene from its sequence.
2.2 Sequence Analysis
With these applications we can align two sequences, align multiple
sequences, and perform phylogenetic analyses. One reason to
do this is to determine which parts of the sequences are conserved
from one species to the next. Another is to see how
much an organism has diverged from other organisms simply by
comparing their DNA sequences.
34

Sequence Translation
Computational biologists need to analyze their nucleotide
sequences, and the best way to do that is to study the protein
product.  The following programs either convert a DNA
sequence into an amino acid (protein) sequence or take
a protein and convert it into its complementary DNA (cDNA)
sequences.  These protein and DNA sequences can then be
analyzed using other applications on this page.
Translation – Converts nucleotide sequences into protein
sequences.
Backtranslation – Converts protein sequences into nucleotide
sequences or complementary DNA (cDNA).
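Translation can be sketched in a few lines using the standard genetic code. The example below is an illustrative sketch, not any particular tool's implementation: the function name `translate` and its `to_stop` parameter are assumptions, it reads only the first reading frame of the given strand, and it marks unrecognized codons with `X`.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, to_stop=True):
    """Translate a DNA (or RNA) coding sequence in reading frame 1.

    Stop codons are written as '*'. With to_stop=True, translation ends
    at the first stop codon, which is not included in the result.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCAAATAA"))
```

Backtranslation is not unique in the same way: because most amino acids are encoded by more than one codon, a protein sequence corresponds to many possible nucleotide sequences, so backtranslation tools must pick codons by some rule such as a codon usage table.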
35

Protein Applications:
Information Retrieval
The numerous information retrieval sites on the Internet can
give very valuable information concerning the sequence and
properties of a protein.  Numerous databases exist and each
database is accessible through convenient search programs.  This
section will introduce useful sites that provide database search
capabilities.
Protein Sequence Retrieval – Allows user to retrieve sequence
from protein name, accession number, or GI identification
number.
Protein Identification – Allows user to retrieve a protein name
or accession and GI numbers from polypeptide sequence.
36

Protein Analysis
After obtaining the identity or sequence of a protein, there are
several valuable tools that allow further analysis of the
protein.  Information can be obtained concerning the
characteristic properties of the protein from its
sequence.  Other valuable tools are sequence alignment
applications, which establish the degree of similarity between two
or more proteins.
Determining Protein Sequence Properties – User can find
molecular weight (MW), isoelectric point (pI), titration curves,
hydrophobicity, etc. for a particular protein.
Protein Sequence Alignment – Align a single sequence to
sequences in a database.
Pair-wise Sequence Alignment – Align two protein sequences to
each other.
Multiple Sequence Alignment – Align three or more sequences to
one another.
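To make pair-wise alignment concrete, here is a minimal sketch of global alignment by dynamic programming (the Needleman-Wunsch approach). The scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions. Real protein alignment tools use substitution matrices such as BLOSUM or PAM and more elaborate gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


best, aligned_a, aligned_b = needleman_wunsch("ACGT", "ACT")
print(best)
print(aligned_a)
print(aligned_b)
```

Database searches and multiple alignments build on the same idea but rely on heuristics to stay fast, since filling a full table for every database entry would be too slow.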
37

Structure Analysis
Several programs have been created that give scientists the
ability to look at the three-dimensional shape of proteins and
nucleotides. Examining a protein in 3D allows for greater
understanding of protein functions, as well as providing
students with a visual understanding that cannot always be
conveyed through still photographs or descriptions.
RasMol: originally developed by Roger Sayle.  The program
must first be downloaded onto your computer before it can be used.
Cn3D: another 3D structure viewer.
38

39