Comparative genomics

JajatiKeshariNayak1 4,026 views 40 slides Oct 05, 2019

About This Presentation

Genome Comparison Techniques, Advanced & Classical Approaches.


Slide Content

SEMINAR ON Genomics
Genome Comparison Techniques, Advanced & Classical Approaches
Presented by Jajati Keshari Nayak, Dept. of Molecular Biology, GB Pant University, PhD 1st yr, ID-55493

GENOMICS
Field of biology that attempts to understand the content, organization, function and evolution of the genetic information contained in the whole genome.
Three Levels of Genome Research

Structural Genomics
Structural genomics seeks to describe the structural features of genes and the 3-dimensional structure of every protein encoded by a given genome.
• Sequence and size in bp
• Map/position
• Repeats
• Motifs/domains, etc.
[Figure: gene model with EXON1, EXON2, EXON3]

Functional Genomics
Use of the vast wealth of data produced by genomic projects (such as genome sequencing projects) to describe gene (and protein) functions and interactions.
• Study of the functional transcripts and protein products encoded by the genes of a genome, and their functional characterization
• Role of the gene
• Time- and tissue-specific expression
• E.g. the gene encoding the florigen hormone in plants

What is Comparative Genomics?
• Analyzing and comparing different genomes to study the gene content, function, organization and evolution of different organisms
• Not a core technique in itself, but an application of various techniques (genome sequencing, mapping, bioinformatics, etc.) with the sole objective of comparing genomes

What to compare?
• Size of the genome: total number of base pairs
• Genome organization: circular, linear, ss/ds genome, extrachromosomal elements, etc.
• Percentage of the genome that is coding
• Total number of predicted ORFs
• Average length of ORFs
• Repetitive DNA/junk DNA
• Functional assignment
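Several of these quantities can be computed directly from a sequence. The following is a minimal sketch, assuming a simplified ORF definition (ATG to the first in-frame stop codon, forward strand only); the function names and the toy sequence are illustrative and not from the slides. Real gene prediction is considerably more involved.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) spans of forward-strand ORFs: ATG ... in-frame stop."""
    seq = seq.upper()
    spans = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    spans.append((start, i + 3))
                start = None                   # close it at the stop codon
    return sorted(spans)


def genome_summary(seq):
    """Summarize size, ORF count, mean ORF length and coding fraction."""
    spans = find_orfs(seq)
    lengths = [end - start for start, end in spans]
    coding = set()
    for start, end in spans:
        coding.update(range(start, end))       # overlapping ORFs counted once
    return {
        "size_bp": len(seq),
        "orf_count": len(spans),
        "mean_orf_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "coding_fraction": len(coding) / len(seq) if seq else 0.0,
    }


toy_genome = "CCATGAAATTTTAAGG"  # toy sequence, not real data
print(genome_summary(toy_genome))
```

Running the same summary on two genomes gives a first, coarse comparison before any gene-level analysis.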

What to compare (contd.): Units of comparison
• Paralogs & orthologs
• Genomic organization & gene location/order
• Gene structure
  – Exon number
  – Exon lengths
  – Intron lengths
  – Sequence similarity
• Gene characteristics
  – Splice sites
  – Codon usage
  – Conserved synteny
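Codon usage is one of the simpler gene characteristics to quantify. A minimal sketch, assuming the input is an in-frame coding sequence; the function name and the toy sequence are illustrative only.

```python
from collections import Counter


def codon_usage(cds):
    """Return the relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3           # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Toy sequence: ATG GCT GCT GCC TAA
usage = codon_usage("ATGGCTGCTGCCTAA")
for codon, freq in sorted(usage.items()):
    print(codon, round(freq, 2))
```

Comparing such tables between two genomes shows, for example, which synonymous codons for alanine (GCT vs GCC here) each organism prefers.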

Basic points of consideration in comparative genomics
• All existing genomes had a common ancestor, and each organism is a combination of that ancestry and the action of evolution.
• New genes are derived from existing sequences.
• Information gained in one organism can have application in other, even distantly related, organisms.
• The existing variation among current forms of living organisms is due to:
  – Selection
  – Speciation
  – Divergence
  – Gene mutations/duplications, etc.

New genes are derived from existing sequences

Purpose or Goals of Comparative Genomics
• Understanding the similarities and differences between genomes that lead to special phenotypes or diseases.
• Identifying genes and discovering their functions by studying their counterparts in other organisms.
• Revealing the evolutionary relationships between different organisms.

“Nothing in biology makes sense except in the light of evolution” (Theodosius Dobzhansky, 1900 – 1975)
• All modern biological processes evolved from related processes.
• Every modern gene evolved from other genes.
• Every gene has an ortholog in related species; most genes have paralogs in the same species.

Patterns of Gene Evolution
[Figure: Case I and Case II; labels: gene orthologs, gene paralogs, homologous genes]

CASE III
• Homologous genes arise from a common ancestor and show structural (sequence) similarity
• G1A & G2A are paralogs
• G1A & G1B, or G2A & G2B, are orthologs
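The naming in Case III can be expressed as a small rule. This is a toy sketch, assuming the duplication that produced copies 1 and 2 predates the split of species A and B; the encoding of each gene as (species, copy) is a convenience for illustration, not something given on the slide.

```python
# Each gene is encoded as (species, duplicated copy). Hypothetical encoding.
GENES = {
    "G1A": ("A", 1),
    "G2A": ("A", 2),
    "G1B": ("B", 1),
    "G2B": ("B", 2),
}


def relationship(x, y):
    """Classify two homologous genes under the toy encoding above."""
    (species_x, copy_x), (species_y, copy_y) = GENES[x], GENES[y]
    if copy_x == copy_y and species_x != species_y:
        return "orthologs"     # same copy, separated by speciation
    if copy_x != copy_y:
        return "paralogs"      # different copies, separated by duplication
    return "same gene"


print(relationship("G1A", "G2A"))  # paralogs
print(relationship("G1A", "G1B"))  # orthologs
```

In practice the species and copy labels are not known in advance and have to be inferred, for example from sequence similarity searches or gene trees.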

How did it all start? The Concept of Model Organisms or Model Genomes
• A great deal of information about an organism's genes can be extracted by examining their counterparts in simpler model organisms.
• Such work is conducted using model organisms.
• Model organisms offer a cost-effective way to follow the inheritance of genes through many generations in a relatively short time.
• Mammals: Homo sapiens, mouse
• Insects: Drosophila melanogaster
• Roundworms: C. elegans
• Fungi: Saccharomyces cerevisiae
• Bacteria: Escherichia coli
• Fish: Zebrafish
• Arabidopsis thaliana: model plant genome
• Rice: model cereal genome
• Medicago truncatula: model legume genome

Genomes sequenced
[Timeline figure, 1995 to 2005; labels in slide order: first bacterial genomes sequenced (H. influenzae and M. genitalium); the yeast genome; E. coli K12; C. elegans; full sequence of chr. 22; D. melanogaster genome and chr. 21; human draft; A. thaliana; mouse; Ciona; rice; Fugu; Anopheles; chimpanzee; human finished; rat; chicken; Xenopus; zebrafish]

Techniques of Comparative Genomics
• Use of molecular markers in plant genome analysis
• Comparative genome maps
• Studying synteny
• Whole genome sequencing
• Bioinformatics tools:
  – Homology search in public databases (BLASTn, BLASTp, etc.)
  – Sequence alignment tools (ClustalW, ClustalX, etc.)

Molecular markers in plant genome analysis: comparative analysis of aromatic rice varieties

Comparative maps for genome comparison
• Involves the use of molecular markers to map the genomes of two species for a common set of markers (loci)
• Used to study genome evolution (how the genome has been rearranged through time) and to make inferences about gene organization, repeated sequences, etc.
Overview of steps
• Construct a map for a species using a set of markers
• Align the map to a reference/published map of a related species
• Work out the common loci/regions
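The last step, working out the common loci, amounts to intersecting the marker sets of the two maps. A minimal sketch, assuming each map is stored as marker name to (linkage group, position in cM); the marker names and positions below are hypothetical, not data from the slides.

```python
def common_loci(map_a, map_b):
    """Return markers present on both maps, ordered by position on map_a.

    Each map is a dict: marker name -> (linkage group, position in cM).
    """
    shared = sorted(set(map_a) & set(map_b), key=lambda marker: map_a[marker])
    return [(marker, map_a[marker], map_b[marker]) for marker in shared]


# Hypothetical marker names and positions, for illustration only.
maize = {"m1": ("1", 10.0), "m2": ("1", 35.5), "m3": ("2", 5.0), "m4": ("3", 12.0)}
sorghum = {"m2": ("A", 8.0), "m3": ("A", 20.0), "m4": ("C", 40.0), "m9": ("D", 3.0)}

for marker, pos_maize, pos_sorghum in common_loci(maize, sorghum):
    print(marker, pos_maize, pos_sorghum)
```

The resulting table of paired positions is what a comparative map draws as lines connecting the two species' linkage groups.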

Comparative Genome Mapping of Sorghum and Maize
[Figure: comparative map, maize vs sorghum; sorghum linkage map produced from maize RFLP genomic probes, with the map locations of the RFLP loci in maize]

Gramene: a versatile tool for comparative mapping

Synteny Maps for Comparative Genomics
• Synteny: defined as the preservation of the order of genes on a chromosome
• Synteny map: a map showing syntenic regions and homologous loci from another species aligned against a map of a target organism
• Can give us information about shared ancestry, evolutionary history, or a key to functional relationships between genes
• Genes present in one species are likely to be present in closely related species
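Given the definition of synteny as preserved gene order, conserved blocks can be found by scanning two gene orders for runs that stay adjacent. This is a simplified sketch, assuming each gene appears once per genome and that orthologs share the same identifier in both lists; real synteny tools also handle duplications, gaps and strand. The function name and gene labels are illustrative.

```python
def syntenic_blocks(order_a, order_b, min_genes=2):
    """Return runs of genes that are consecutive in both orders.

    A run may appear in the same or in reversed orientation in order_b.
    Genes found in only one genome are ignored.
    """
    in_a = set(order_a)
    shared_b = [gene for gene in order_b if gene in in_a]
    pos_b = {gene: index for index, gene in enumerate(shared_b)}
    shared_a = [gene for gene in order_a if gene in pos_b]

    blocks, current, direction = [], [], 0
    for gene in shared_a:
        if current:
            step = pos_b[gene] - pos_b[current[-1]]
            if step in (1, -1) and direction in (0, step):
                current.append(gene)           # extend the current block
                direction = step
                continue
            if len(current) >= min_genes:
                blocks.append(current)         # close the finished block
        current, direction = [gene], 0         # start a new block
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks


species_a = ["a", "b", "c", "d", "e", "f"]
species_b = ["c", "b", "a", "x", "e", "f", "d"]   # a-b-c inverted, d moved
print(syntenic_blocks(species_a, species_b))
```

Here the block a-b-c is conserved in reversed orientation and e-f in the same orientation, while d has moved and forms no block.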

Rice-Maize comparative synteny map (Goff et al., 2002, Science 296: 92-100)
• Rice shows great synteny with other cereals, i.e. genes present in one cereal will almost certainly be present in the same order in another
• Regions of homology between rice and maize of greater than 80%
• Virtually every part of the maize genome finds a homologue in rice

Conservation of synteny between mouse and human genomes (Mural et al., Science, 2002, 296:1661)
• Approximately 99% of mouse genes have a homologue in the human genome.
• For 96%, the homologue lies within a similar conserved syntenic interval in the human genome.
• Mouse chromosome 16 is syntenic with:
  – Chr. 3, 8, 12, 16, 21, 22 of human
  – Chr. 10, 11 of rat

Comparison of mouse chromosome 16 and the human genome
Q: Why more breakpoints in mouse-human than in mouse-rat?
Q: Why more conserved genes in human than in rat?
• The longer the divergence time between two species, the more recombination (chromosomal rearrangement) has occurred
• 100 million years since human-mouse divergence
• 40 million years since rat-mouse divergence
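A breakpoint can be counted as a pair of genes that are neighbours in one genome but not in the other. The sketch below uses this simple unsigned definition on toy gene orders; the function name and the example orders are illustrative, and published breakpoint analyses also account for gene orientation and chromosome ends.

```python
def count_breakpoints(order_a, order_b):
    """Count adjacent gene pairs in order_a that are not adjacent in order_b."""
    adjacent_b = {frozenset(pair) for pair in zip(order_b, order_b[1:])}
    return sum(
        frozenset(pair) not in adjacent_b
        for pair in zip(order_a, order_a[1:])
    )


reference = [1, 2, 3, 4, 5, 6]
one_inversion = [1, 2, 5, 4, 3, 6]      # genes 3..5 inverted
print(count_breakpoints(reference, one_inversion))
```

Each additional rearrangement since divergence tends to add breakpoints, which is the reasoning behind expecting more of them in the older mouse-human comparison than in mouse-rat.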

Bioinformatics Approaches: Homology search in public databases
Case: we have a gene/protein sequence with no idea of its function
• Subject the sequence to a homology/sequence similarity search across a database (BLAST search)
• Look for genes showing significant similarity
• A putative function can be inferred
• Homology in bioinformatics is inferred from sequence similarity (expressed as % similarity/identity)
• BLAST: Basic Local Alignment Search Tool
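The idea of a similarity search can be sketched with a plain local alignment score. Note that this is not BLAST: BLAST is a fast heuristic, whereas the sketch below uses the exhaustive Smith-Waterman dynamic programme that BLAST approximates. The scoring values, the function names and the toy database entries are assumptions for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for char_a in a:
        curr = [0]
        for j, char_b in enumerate(b, start=1):
            diagonal = prev[j - 1] + (match if char_a == char_b else mismatch)
            curr.append(max(0, diagonal, prev[j] + gap, curr[j - 1] + gap))
        best = max(best, max(curr))
        prev = curr
    return best


def best_hit(query, database):
    """Score the query against every database entry and return the top hit."""
    scores = {name: local_alignment_score(query, seq) for name, seq in database.items()}
    return max(scores, key=scores.get), scores


# Hypothetical database entries, for illustration only.
toy_database = {"heat_shock_like": "GGACGTGG", "unrelated": "CCCCCCCC"}
name, scores = best_hit("TTACGTTT", toy_database)
print(name, scores)
```

A real search would also report a statistical significance (E-value) for each hit, which this sketch leaves out.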

An Example
• 5 rice varieties grown under heat stress
• Isolate protein from individual plants
• Analyze on SDS-PAGE [Figure: gel with lanes M, 1, 2, 3, 4, 5]
• Isolate the fragment
• Get it sequenced
• Subject the sequence to BLAST
• Get an idea of the role/function of the novel protein
• Correlate with phenotype and confirm the findings

www.ncbi.nlm.nih.gov

Use of Sequence Alignment tools in comparative genomics
• Sequence alignment is a way of arranging sequences to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences
• Pairwise alignment
• Multiple sequence alignment
• Progressive multiple alignment techniques produce a phylogenetic tree used to work out evolutionary relationships
[Figure: SEQ1 and SEQ2 shown in a pairwise alignment and in a multiple sequence alignment]
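A pairwise alignment can be illustrated with the classic Needleman-Wunsch global alignment. This is a minimal sketch with assumed scores (match +1, mismatch -1, gap -1) and a linear gap penalty; tools such as ClustalW use more elaborate scoring, so this is not their implementation. The function names are illustrative.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same residue in both rows."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


seq1, seq2, total = global_align("GATTACA", "GATACA")
print(seq1)
print(seq2)
print(total, round(percent_identity(seq1, seq2), 1))
```

Progressive multiple alignment builds on the same idea: pairwise scores give a distance between every pair of sequences, a guide tree is built from those distances, and sequences are then aligned in the order the tree suggests.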

What if the target sequence shows no similarity with existing sequences? Dead end?
No! This is the beginning of other, advanced comparative genomics approaches.

Advanced Techniques of Comparative Genomics
Prerequisites:
• Enough data, and tools for processing large amounts of data
• Development of new computational methods
• Advanced statistical tools
• Knowledge of algorithms
• “Informatics” techniques from applied maths, computer science and statistics, adapted to biological sequences

Prediction of functions via the ‘guilt by association’ principle
• Proverbial principle: ‘Show me your friends and I’ll tell you who you are’
• Genes of related function are associated in various ways: enzymes in a pathway, proteins in a complex
• Groups of genes having similar biochemical function tend to remain localized
• E.g. genes required for the synthesis of tryptophan (trp genes) in E. coli and other prokaryotes
• Whatever a gene’s associates do, the gene probably has a similar function
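One simple way to apply guilt by association is a majority vote over the known functions of a gene's associates. The sketch below assumes associations (for example genomic neighbours or co-expressed genes) are already available as a lookup table; the gene `orfX`, the association list and the function labels are hypothetical.

```python
from collections import Counter


def predict_function(gene, associations, annotations):
    """Predict a gene's function as the most common function of its associates.

    Returns None when none of the associates has a known function.
    """
    votes = Counter(
        annotations[partner]
        for partner in associations.get(gene, ())
        if partner in annotations
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical associations and annotations, for illustration only.
associations = {"orfX": ["trpA", "trpB", "lacZ"]}
annotations = {
    "trpA": "tryptophan biosynthesis",
    "trpB": "tryptophan biosynthesis",
    "lacZ": "lactose catabolism",
}
print(predict_function("orfX", associations, annotations))
```

The prediction is only a hypothesis: it suggests which experiments to run rather than replacing them.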

In plants, evidence for such associations comes mainly from post-genomic studies.
Some plant post-genomic resources:
• Microarray analysis
• Organellar targeting prediction
• Proteomics & phenomics databases

Genome OnLine Database

Useful Websites
COMPARATIVE GENOMICS VISUALIZATION TOOL
• VISTA (www-gsd.lbl.gov/vista)
• PipMaker (http://pipmaker.bx.psu.edu/pipmaker/)
WHOLE GENOME ANNOTATION BROWSER
• NCBI Map Viewer (www.ncbi.nlm.nih.gov/projects/mapview)
• UCSC Genome Browser (genome.ucsc.edu)
• Ensembl (www.ensembl.org)
WHOLE GENOME COMPARATIVE GENOME BROWSER
• UCSC Genome Browser (genome.ucsc.edu)
• VISTA Genome Browser (www-gsd.lbl.gov/vista)
• PipMaker (http://pipmaker.bx.psu.edu/pipmaker/)
CUSTOM COMPARISON TO WHOLE GENOME
• Genome VISTA (AVID) (http://pipline.lbl.gov-cgi-bgenomevista)
• UCSC Genome Browser (genome.ucsc.edu)
• Ensembl (SSAHA) (www.ensembl.org)
• NCBI (BLAST) (ncbi.nlm.nih.gov/BLAST)

General Databases Useful for Comparative Genomics
• LocusLink/RefSeq: http://www.ncbi.nih.gov/LocusLink/
• PEDANT (Protein Extraction Description ANalysis Tool): http://pedant.gsf.de/
• MIPS (mammalian Protein Interaction Database): http://mips.gsf.de/
• COGs (Cluster of Orthologous Groups of proteins): http://www.ncbi.nih.gov/COG/
• KEGG (Kyoto Encyclopedia of Genes and Genomes): http://www.genome.ad.jp/kegg/
• MBGD (Microbial Genome Database): http://mbgd.genome.ad.jp/
• GOLD (Genome OnLine Database): http://www.genomesonline.org/
• TOGA (TIGR Orthologous Gene Alignment): http://www.tigr.org/tdb/toga/toga.shtml

What we have learned by comparing genomes
• Comparing genomes helps us understand the genetic basis of diversity in organisms (both speciation and variation) and important aspects of evolutionary biology.
• It provides “first pass” information on the function of a putative gene, based on the existence of conserved protein sequences.
• Comparative genomics provides a powerful way to analyze sequence data.

THANK YOU