Genome sequencing. ppt.pptx

541 views 57 slides Mar 18, 2023

About This Presentation

An insight review assignment on: Progress in Plant Genome Sequencing


Slide Content

Presentation type: Insight review. Name: Gedifew Gebrie. ID No.: BDU1500203. Program: PhD. January, 2023.

1. Introduction: Advances in DNA sequencing have increased biological understanding and the application of biological knowledge. Sanger sequencing in the 20th century produced data of limited quality and volume. Next-generation sequencing (NGS) in the 21st century advanced short-read sequencing supported by physical mapping, then delivered long-read sequences for large genome fragments, improving the quality and volume of the data generated.

2. Diversity of Plant Genomes: Plant genomes vary enormously in size (even within closely related groups of plants), in their content of repetitive sequences, in their level of gene duplication, and in their ploidy and heterozygosity, all of which matter when studying genetic variation. This variability makes genome sequencing and assembly challenging.

3. Applications of Plant Genome Sequencing. 3.1. Model Genomes: Model genomes are used to study related, but more complex, species. Arabidopsis thaliana was the first plant to have a sequenced genome. It was chosen because it is a small plant with a rapid generation time and a very small genome, which makes it an ideal model plant for research.

3.1. Model Genomes (Cont.) Rice (Oryza sativa): The first crop plant with a sequenced genome. It was chosen because it is a major food crop with a relatively small genome, and it became a model for cereal and grass genomes.

3.1. Model Genomes (Cont.) Brachypodium distachyon: Sequenced as a model grass genome, especially relevant for the wheat genome. However, recent advances in genome sequencing have greatly reduced the need for models, as it is now possible to sequence most species easily.

3.2. Crop Plant Genomes: Sequencing crop plant genomes is becoming a key enabling tool for plant improvement. It involves producing a reference genome sequence for a species and re-sequencing many individuals of the crop to define allelic variation within the species.

3.2. Crop Plant Genomes (Cont.) A single reference genome is not always sufficient for crop improvement, so pan-genomes are being produced as breeding platforms. A pan-genome describes a gene pool as a core genome (shared among all individuals) plus a variable or dispensable genome (absent from one or more individuals).
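The core/dispensable split described on this slide can be illustrated with plain set operations. This is only a minimal sketch: the accession names and gene IDs are made up for illustration, and a real pan-genome analysis would derive gene presence or absence from assemblies or read mapping rather than from hand-written sets.

```python
def split_pan_genome(gene_sets):
    """Split a pan-genome into core and dispensable genes.

    gene_sets maps each individual (hypothetical accession name)
    to the set of gene IDs detected in that individual.
    """
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)            # every gene seen in any individual
    core = set.intersection(*all_sets)      # genes shared by all individuals
    dispensable = pan - core                # genes absent from one or more individuals
    return pan, core, dispensable


# Illustrative data only: three hypothetical accessions of one crop species.
accessions = {
    "acc_A": {"g1", "g2", "g3"},
    "acc_B": {"g1", "g2", "g4"},
    "acc_C": {"g1", "g2", "g3", "g5"},
}

pan, core, dispensable = split_pan_genome(accessions)
print(sorted(core))         # ['g1', 'g2']
print(sorted(dispensable))  # ['g3', 'g4', 'g5']
```

Adding more individuals can only shrink the core set and grow the pan set, which is one reason a single reference genome under-represents the gene pool.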

3.3. Sequencing Plant Biodiversity: Awareness of plant diversity within populations is necessary for effective conservation. DNA sequencing coverage of plant orders is high and genome sequencing at the family level is increasing, but sequencing at the genus level is still very low for most plant groups.

3.3. Sequencing Plant Biodiversity (Cont.) Sequencing of plant diversity could be enhanced through a top-down approach, a systematic effort that sequences each plant family first, then each genus and, finally, each species, and through re-sequencing of the diversity within each species.

3.4. Sequencing Rare and Threatened Species: Serves as a tool to aid conservation and the management of endangered wild crop relatives, both in situ and ex situ. Aims at generating life history, demographic and phylogenetic data to conserve critically endangered species.

3.5. Steps of sequencing and assembly of a plant genome: (1) DNA extraction, to produce a DNA sample that is suitable for sequencing. (2) Sequencing of the DNA, which produces long-read sequences. (3) Assembly of the sequence reads into sequence contigs. (4) Assembly of the sequence contigs at the chromosome level using chromatin mapping or genetic mapping.

3.5. Steps of sequencing and assembly of a plant genome (Cont.) Workflow diagram: Sampling -> Plant DNA (DNA extraction) -> Sequence reads (DNA sequencing) -> Sequence contigs (assembly of sequences) -> Chromosomes (chromosome assembly).
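The step from sequence reads to sequence contigs can be sketched with a toy greedy overlap merger. This is a deliberately simplified illustration, not how production assemblers work: it assumes error-free reads from a single strand, uses made-up reads, and the function names and the `min_len` threshold are hypothetical choices for this example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(reads)
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # no remaining overlaps: these are the final contigs
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


# Toy, error-free reads (illustrative only).
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
contigs = greedy_assemble(reads)
print(contigs)  # ['ATGGCGTACCTGA']
```

Repetitive sequence breaks this kind of merging, because a repeat longer than the reads gives several equally good overlaps. That is the practical reason the later steps rely on long reads and on chromatin or genetic mapping.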

4. Sequencing Technology. 4.1. DNA Isolation: A minimum amount of pure DNA is needed for efficient sequencing and the generation of large volumes of data. Long-read sequencing needs more DNA than short-read sequencing, and the DNA must not be degraded. Pure DNA may be obtained from isolated nuclei, though this is difficult in plants because of the cell wall.

4.2. Short Read Sequences: Provide large volumes of short DNA sequence reads. Their accuracy and data volume advanced with the development of different sequencing technologies. Drawbacks: they introduce biases into the samples, fail to resolve highly complex and repetitive genomes, and demand high computing power, novel algorithms, and partial complementation and verification with high-quality sequencing technologies.

4.3. Long Read Sequences: Long-read technologies include single-molecule real-time (SMRT) sequencing, which produces continuous long reads (CLR). They allow much longer sequences to be generated, which simplifies genome assembly, and their accuracy has improved. They can sequence single molecules without DNA amplification, which separates them from all previous technologies. They are called third-generation sequencing technologies.

4.3.1. Pacific Biosciences (PacBio), 2011: PacBio is an American biotechnology company that commercializes SMRT sequencing technology. The method measures nucleotide incorporation in real time: light is emitted as the polymerase incorporates each nucleotide, which identifies the base. PacBio has further developed its long-read platform by replacing single long reads with HiFi reads, which provide a consensus sequence based on multiple sequencing passes over a long fragment of DNA.
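The idea of a consensus built from multiple passes over the same fragment can be sketched as a per-position majority vote. This is a simplified illustration rather than PacBio's actual algorithm: it assumes the passes are already aligned and of equal length with substitution errors only, and the example sequences are invented.

```python
from collections import Counter


def consensus(passes):
    """Majority-vote consensus over pre-aligned, equal-length passes of one fragment."""
    if len({len(p) for p in passes}) != 1:
        raise ValueError("passes must be pre-aligned to equal length")
    # zip(*passes) yields one column of bases per position; keep the most common base.
    # On a tie, the base seen first in that column wins.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


# Four noisy passes over the same (made-up) fragment, each with at most one error.
passes = [
    "ACGTTAGC",
    "ACGTAAGC",  # error at position 4
    "ACCTTAGC",  # error at position 2
    "ACGTTAGC",
]
hifi_like = consensus(passes)
print(hifi_like)  # ACGTTAGC
```

Because the errors in individual passes are largely independent, they rarely agree at the same position, so the vote recovers the underlying sequence even when no single pass is error-free.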

4.3.2. Oxford Nanopore Technologies (ONT), 2014: Determines the sequence by measuring the changes in electrical current as the DNA passes through a nanopore. It does not need PCR or chemical labeling. It has portable instruments and is widely applied to very rapid sequencing. Combined with other methods, it achieves chromosome-level assemblies of plant genomes. Has a lower rate when compared with SMRT sequencing.

4.3.2. Oxford Nanopore Technologies (ONT) (Cont.) Uniquely, it enables direct, real-time analysis of short to ultra-long DNA fragments in fully scalable formats. Sample preparation is inexpensive and requires minimal chemistry or enzyme-dependent amplification. It eliminates the need for nucleotides and polymerases or ligases during readout.

4.3.3. Other long read sequence technologies: Some technologies create pseudo-long reads by linking short reads. They produce long reads at lower cost, but the resulting reads often do not match the accuracy of current long-read methods.

4.3.4. Advances in long read sequences: More contiguous assemblies have slightly reduced the size of assembled genomes through the joining of homologous contig ends. The assembly of long-read sequences has improved with the availability of better software. Long contigs of highly accurate sequence can now be generated, reducing reliance on other techniques for high-level assembly. Earlier genome sequences can be improved in quality using the latest long-read DNA sequencing technology.
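The slide speaks of "more contiguous assemblies" without naming a measure. One widely used contiguity statistic is N50, the contig length at which contigs of that length or longer contain at least half of the assembled bases. The sketch below is an illustration under that assumption (the slides do not mention N50), and the contig lengths are invented.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:  # reached at least half of the assembled bases
            return length
    raise ValueError("no contigs given")


# Made-up contig lengths (in kb) before and after two contigs are joined.
fragmented = [80, 70, 50, 40, 30, 20]
joined = [150, 50, 40, 30, 20]  # the 80 kb and 70 kb contigs merged into one

print(n50(fragmented))  # 70
print(n50(joined))      # 150
```

The total assembled length is the same in both lists, yet N50 more than doubles after one join, which is why the statistic is commonly reported alongside assembly size.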

4.4. Chromosome-Level Assembly: The assembly of the sequence contigs, generated from the sequencing reads, into chromosomes. With recent advances in sequencing technology, many full-length chromosomes can be generated from the sequence data alone. Otherwise it is achieved using genetic mapping, chromatin mapping (Hi-C) or optical mapping.

4.5. Haplotype-Resolved (true diploid) Genomes: A haplotype is a combination of alleles at multiple loci that are transmitted together on the same chromosome. Haplotype-resolved genome sequencing enables (NHGRI, 2023): the accurate interpretation of desired genetic variation, deep readings regarding population history, and non-invasive prediction of undesired (fatal) genomes.

4.5. Haplotype-Resolved Genomes (Cont.) With advances in both sequencing technology and sequence assembly tools, it has become possible to assemble each haplotype separately. Most genomes can now be sequenced at the haplotype level, replacing the reporting of collapsed genomes with the sequences of the two haplotypes. Haplotype-resolved genomes can be used to better interpret allele-specific expression in diploid transcriptome profiles.

4.6. Pan-Genomes: The pan-genome is the complete set of genes found within a population, including all of the variation. Pan-genome sequencing offers a wider understanding of the genetic diversity of crop gene pools and is therefore useful for crop improvement. Integrated with QTL/GWAS analysis and genome re-sequencing, it is used to identify important genes and alleles for an effective breeding strategy.

4.7. Transcriptomes: The transcriptome is the complete set of RNA transcripts expressed from an organism's genome. It includes messenger RNA (mRNA), which represents the protein-coding part of the genome, transfer RNA (tRNA), ribosomal RNA (rRNA), and other non-coding RNA molecules.

4.7. Transcriptomes (Cont.) Transcriptome sequencing is the analysis of the expressed regions of a genome at the cell and tissue level. It allows the discovery of the genes that control plant traits and is important for finding genes for selection in plant breeding. However, it has had limited application in plants because it is difficult to isolate specific plant cells without disrupting expression.

4.7.1. RNA Sequencing: Examines the quantity and sequences of RNA in a sample using next-generation sequencing (NGS). It analyzes the transcriptome, indicating which of the genes encoded in the DNA are turned on or off and to what extent. It has largely replaced earlier array-based or gene-by-gene analysis tools because it provides a more unbiased analysis of the whole transcriptome.

4.7.1. RNA Sequencing (Cont.) Examples: In the highly polyploid sugarcane genome, RNA sequencing revealed that different alleles of most genes are expressed in direct proportion to their abundance in the genome, while some genes show highly biased patterns of expression. Subgenome-specific responses to diseases have also been reported in hexaploid wheat.

4.7.2. Long Read Transcriptomes: Reveal the diversity of full-length transcripts and define their variation (splicing and intron retention). They avoid the challenge of assembling many closely related transcripts from short reads, which reduces the risk of incorrectly combining the ends of different transcripts. Examples: coffee (4x), wheat (6x), strawberry (8x) and sugarcane (12x).

4.8. Organelle Genome Sequencing: Plant cells usually contain a single nucleus and many organelles. Organelle genome sequencing is complicated by the transfer of genes between different organelles. It is difficult to distinguish organellar gene sequences from organellar inserts in the nuclear genome, and earlier approaches relied on PCR amplification or organelle separation.

4.8.1. Chloroplast Genomes: Chloroplasts are found in all green plants, in high copy numbers in the cell, which simplifies the detection of chloroplast sequences. They are widely used in plant identification. Earlier, there was confusion caused by copies of chloroplast sequences in the nuclear and mitochondrial genomes. Software tools now allow the efficient extraction of accurate whole chloroplast genome sequences. Rare chloroplast transfers between species result in “chloroplast capture” by closely related species.

4.8.2. Plant Mitochondrial Genomes: The mitochondrial genomes of plants are much larger and less conserved than the chloroplast genomes and, like the nuclear genome, may include sequences derived from the chloroplast. They are therefore much less studied than chloroplast genomes, and their sequences are more likely to be confused with chloroplast genome sequences than chloroplast sequences are with nuclear genome sequences.

5. Biological Understanding: Sequencing plant genomes provides an enhanced understanding of the biology of plant species, informs better conservation of biodiversity, and assists in improving the management of crops, since their genomes explain their response to the environment.

5.1. Whole Genome Duplications: Whole-genome duplication (WGD) is a process of genome doubling that supplies raw genetic material and increases genome complexity. It facilitates phenotypic evolution, is advantageous for adaptation, and aids in the determination of evolutionary relationships. The selection of key genomes for sequencing allows evolutionary relationships to be defined.

5.2. Polyploidy Challenges: Sequencing a polyploid plant genome is more complicated than sequencing a diploid one, because the many similar sequences can be difficult to assemble. Example: a genome sequence for the diploid Robusta coffee (2n = 2x = 22) was reported in 2014, but sequencing of the tetraploid Arabica coffee (2n = 4x = 44) has been more difficult.

5.3. Genomics of Plants with Diverse Reproductive Biology: Large differences have been found between male and female sex chromosomes in dioecious and apomictic plants, due to the presence of many sex-specific genes, as observed in jojoba (coffeeberry). Resolving these differences requires chromosome-level assembly of male and female genomes, which is not possible with more limited genomic information, together with sequencing of transcriptomes (the full range of mRNA).

5.4. Evolutionary Insights: The evolutionary relationships between plants are determined by comparing the genome sequences of plant species. Comparing the sequences of many conserved single-copy genes is one approach. It has been used to define relationships between species in the rice and Sorghum genera. Sequencing also provides an opportunity to better understand the process of domestication.

5.5. Maternal Genome Inheritance: In most plants, chloroplasts and mitochondria are maternally inherited. In some rare cases, such as cucumber, chloroplasts are inherited maternally while the mitochondria are paternally inherited. The maternal inheritance of organellar DNA and the biparental inheritance of nuclear DNA result in disagreeing patterns of evolution. To better determine relationships through phylogenetic analysis, whole-genome sequencing data are preferable; failing that, nuclear genome sequencing data should be used, because nuclear genomes show so much variation at the sequence level.

5.6. Importance of Genome Size: Plant genomes vary widely in size, with more than 1000-fold variation among flowering plants and more than 100-fold variation within a single plant family. The biological significance of this variation remains poorly understood. Understanding could be enhanced by improvements in genome sequencing technology.

6. Enabling Plant Breeding: The availability of plant genomes facilitates the breeding and selection of plants, the management of plant genetic resources in seed banks, and the conservation of wild crop relatives. Genomes provide an understanding of genetic resources, of plant biology, and of the molecular and genetic basis of plant traits, facilitating rapid identification and selection of the key genes that determine traits of great importance in food crops.

6.1. Molecular Markers and Plant Selection: Sequencing has changed the approaches to marker development and application in plant breeding. Using sequence-based markers [104] for the genomic differences that are actually responsible for the traits under selection increases the reliability of selection. Because sequencing captures all of the possibilities, decisions on which genetic markers to analyze no longer need to be made in advance.

6.2. Genetic Manipulation: Genome sequencing is the ultimate method for the characterization of genetically modified plants, revealing the exact changes in the genomes produced. Example (gene editing): the sequencing of transgenic genotypes defines both the exact point of insertion and the sequence of the added genes.

6.3. Editing Plant Genomes: Sequencing of the genomes of crop species has become a key enabling tool for plant improvement. Genome sequencing aids genome editing by supporting better targeting of the edits needed to make the required change in a phenotype, helping to avoid off-target effects, and confirming that the intended changes have been made and no unintended changes have occurred.

6.4. Biotechnology Applications (Food, Medicinal and Industrial Crops): Genome sequencing is a key tool that enables the rapid production of improved plant varieties. It allows the identification of novel genotypes with the desired traits, supports genome editing to produce plants with novel traits, and helps ensure compliance with the regulations that govern plant genetic manipulation.

7. IP Issues: The Convention on Biological Diversity empowered countries to claim ownership of their biological resources. Intellectual property (IP) is difficult to establish for historical collections because prior informed consent cannot be obtained for samples that have already been collected. Genome sequencing data support the protection of plant varieties through plant breeders' rights (PBRs) by confirming the identity of genotypes and by helping to establish the distinctness of genotypes.

8. Future Prospects: Reducing the cost and improving the quality of sequencing data. Advancing sequencing technology and making data storage and handling more effective. Developing more general methods that can be applied to a wide range of plant samples. Sequencing all plant species. Sequencing rare and endangered species.

Thanks for your attention!

Timelines of Genome sequencing: 1869: Isolation of DNA by Friedrich Miescher from the nuclei of human pus cells. 1953: Structure of DNA by Watson, Crick and Franklin. 1965: The first sequenced tRNA, and the mapping of its structure, by Robert Holley. 1972: Sequencing of a complete gene of the RNA bacteriophage MS2 via electrophoresis/chromatography by Walter Fiers.

Timelines of Genome sequencing (Cont.) 1977: Start of first-generation sequencing with the “chain termination method” (a DNA sequencing method utilizing radiolabelled, partially digested fragments) by Frederick Sanger. 1977: A method for DNA sequencing based on chemical modification of DNA by Maxam and Gilbert, which uses chemicals that break the DNA at specific bases, whereas Sanger sequencing uses DNA polymerase. Both Sanger sequencing and Maxam–Gilbert sequencing lacked automation and were time consuming and tiring.

Timelines of Genome sequencing (Cont.) The need for automation then drove further work. 1984: Fritz Pohl established the first sequencing technology platform that did not rely on radioactive labelling, the direct blotting electrophoresis system GATC1500. 1987: Automation of the Sanger sequencing process by Leroy Hood and Michael Hunkapiller, with two major improvements:

Timelines of Genome sequencing (Cont.) Firstly, DNA fragments were labelled with fluorescent dyes instead of radioactive molecules. Secondly, data acquisition and analysis were made possible on the computer. They developed the sequencing instrument called the ABI 370.

Timelines of Genome sequencing (Cont.) 1998: A new sequencing-by-synthesis method that utilizes fluorescent dyes by Shankar Balasubramanian and David Klenerman, and the foundation of the company “Solexa”. 2005: Pyrosequencing technology in an automated system by Jonathan Rothberg and colleagues, implemented as the 454 system, the first next-generation sequencing platform to come to market.

Timelines of Genome sequencing (Cont.) 2007: Illumina acquired the company Solexa. They set out to provide the most widely used NGS technology in the world, and they are the NGS platform market leader to this day. SOLiD system: “sequencing-by-ligation”.

Timelines of Genome sequencing (Cont.) 2010: Single-molecule real-time (SMRT) sequencing technologies, using DNA polymerase as the sequencing engine (Heather and Chain). Pacific Biosciences, Inc. (PacBio) is the pioneer of third-generation sequencing technology. The method involves the incorporation of a single nucleotide at a time, with each nucleotide labelled with a different fluorescent dye.

Timelines of Genome sequencing (Cont.) 2011: Ion Torrent by Life Technologies, a “sequencing-by-synthesis” technology that detects hydrogen ions when new DNA is synthesized. 2012: Portable handheld system for RNA and DNA sequencing, for reads of more than 2 Mb. Oxford Nanopore Technologies’ systems and Illumina’s NovaSeq platforms.