Metagenomics

18,251 views 53 slides Apr 24, 2022


Slide Content

By Assist.Prof Dr. Berciyal Golda.P VICAS

The term metagenomics was first used by Jo Handelsman, Jon Clardy and Robert M. Goodman, and first appeared in publication in 1998. Metagenomics is defined as “the genomic analysis of microorganisms by direct extraction and cloning of DNA from an assemblage of microorganisms.” In Greek, meta means “transcendent” (a combination of separate analyses). Genomics refers to the study of the genome. Pictured: Jo Handelsman.

Metagenomics is the study of the metagenome, the genetic material recovered directly from environmental samples such as soil, water or faeces. It is based on the genomic analysis of microbial DNA taken directly from the communities present in those samples. Metagenomic technology, that is, genomics on a large scale, will probably lead to great advances in medicine, agriculture, energy production and bioremediation. Metagenomics can unlock the massive uncultured microbial diversity present in the environment as a source of new molecules for therapeutic and biotechnological applications. Metagenomic studies have identified many novel microbial genes coding for metabolic pathways, such as energy acquisition and carbon and nitrogen metabolism, in natural environments that were previously considered to lack such metabolism.

The science of metagenomics, only a few years old, will make it possible to investigate microbes in their natural environments, the complex communities in which they normally live. It will bring about a transformation in biology, medicine, ecology and biotechnology that may be as profound as that initiated by the invention of the microscope. All plants and animals have closely associated microbial communities that make necessary nutrients (carbon, nitrogen, oxygen and sulfur), metals and vitamins available to their hosts. We depend on microbes to remediate toxins in the environment, both those that are produced naturally and those that are byproducts of human activities, such as oil and chemical spills.

In 1985, Pace and coworkers introduced the idea of cloning DNA directly from environmental samples. In 1991, Schmidt and coworkers cloned DNA from picoplankton in a phage vector for subsequent 16S rRNA gene sequence analysis. In 1995, Healy reported the first successful function-driven screening of metagenomic libraries, which were termed zoolibraries. In 2002, Mya Breitbart and Forest Rohwer used shotgun sequencing to show that 200 liters of seawater contain over 5000 different viruses.

The science of metagenomics makes it possible to investigate microbial communities as a resource for the development of novel genes, enzymes and chemical compounds for use in biotechnology. Microbes, as communities, are key players in maintaining environmental stability. Metagenomics investigates microbes in their natural environment, the complex communities in which they normally live, and enables high-throughput gene-level studies of those communities.

Sample processing is the first and most crucial step in metagenomics. The extracted DNA should be representative of all cells present in the sample, and sufficient amounts of high-quality nucleic acids must be obtained for subsequent library production and sequencing. Sample fractionation steps should be checked to ensure that sufficient enrichment of the target is achieved and that contamination with non-target material is minimal. Physical separation and isolation of cells from the sample might also be important to maximize DNA yield or to avoid co-extraction of enzymatic inhibitors that might interfere with subsequent processing. Direct lysis of cells, compared with indirect lysis, introduces a quantifiable bias in terms of microbial diversity, DNA yield and resulting sequence fragment length. Some types of sample, such as biopsies or groundwater, often yield very small amounts of DNA, whereas library production for most sequencing technologies requires large amounts (nanograms or micrograms), so amplification of the starting material might be required. Multiple displacement amplification (MDA) using random hexamers and phage phi29 polymerase is one option for increasing DNA yields; this method has been widely used in single-cell genomics and, to a certain extent, in metagenomics.

Dispersing misconceptions and identifying opportunities for the use of 'omics' in soil microbial ecology James I. Prosser Nature Reviews Microbiology 13, 439–446 (2015)

Volcano; heavy metal composition

There are two basic types of metagenomics studies. Sequence-based metagenomics involves sequencing and analysis of DNA from environmental samples. Function-based metagenomics involves screening for a particular function or activity.

Sequence-based metagenomics studies can be used to assemble genomes, identify genes, find complete metabolic pathways, and compare organisms of different communities. Genome assembly requires lots of computer power but it can lead to a better understanding of how certain genes help organisms survive in a particular environment. Sequence-based metagenomics can also be used to establish the degree of diversity and the number of different bacterial species existing in a particular sample. Analyzing microbial diversity is less costly and less computer intensive than assembling genomes and it can provide valuable information about the ecology of microbes in a sample.
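As a rough illustration of what "degree of diversity" and "number of different species" can mean in practice, the sketch below computes species richness and the Shannon diversity index from a list of per-read taxon labels. The Shannon index is one common diversity measure and is an assumption here, since the slides do not name a specific metric; the function names and the taxon labels are hypothetical.

```python
import math
from collections import Counter


def richness(taxa):
    """Number of distinct taxa observed in the sample."""
    return len(set(taxa))


def shannon_index(taxa):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxon proportions p_i."""
    counts = Counter(taxa)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical taxon labels, one per classified read.
    sample = ["taxon_A"] * 4 + ["taxon_B"] * 4
    print(richness(sample), shannon_index(sample))
```

A sample dominated by a single taxon gives a Shannon index near zero, while an evenly mixed sample gives a higher value, so the two numbers together describe both how many taxa are present and how evenly reads are spread across them.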

Whole genome sequencing was developed by J. Craig Venter and Hamilton Smith in 1995. It can help to reconstruct large fragments, or even complete genomes, from organisms in a community without previous isolation, allowing the characterization of a large number of coding and non-coding sequences that can be used as phylogenetic markers. Whole genome sequencing provides information both about which organisms are present and about what metabolic processes are possible in the community. Pictured: J. Craig Venter.

WHOLE GENOME SEQUENCING
Whole genome sequencing is based on four basic steps: (1) library construction, (2) random sequencing, (3) fragment alignment and gap closure, and (4) editing.

Carl Woese and coworkers started to analyze and sequence the 16S rDNA genes of various bacteria, using DNA sequencing, a state-of-the-art technology at that time, and used the sequences for phylogenetic studies. 16S rRNA is a part of the ribosomal RNA of the prokaryotic cell and is about 1,542 nucleotides long. The 16S gene contains regions that are highly conserved between species and also variable regions that are species specific. I. Conserved regions provide excellent amplification targets. II. Variable regions are highly informative for taxonomic classification. Thus it is a powerful tool for classification and genome analysis.
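The distinction between conserved and variable regions can be made concrete with a small sketch that scores each column of a multiple sequence alignment by the fraction of sequences sharing the most common base. This is only an illustration under stated assumptions: the toy alignment is invented, the function names are hypothetical, and real 16S analyses work on much longer alignments with gap handling.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Per-column fraction of sequences that carry the most common base."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for i in range(length):
        column = [s[i] for s in aligned_seqs]
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(aligned_seqs))
    return scores


def split_positions(scores, threshold=1.0):
    """Split column indices into conserved (>= threshold) and variable."""
    conserved = [i for i, s in enumerate(scores) if s >= threshold]
    variable = [i for i, s in enumerate(scores) if s < threshold]
    return conserved, variable


if __name__ == "__main__":
    # Invented toy alignment: only column 3 differs between the sequences.
    alignment = ["ACGTAC", "ACGAAC", "ACGCAC"]
    print(split_positions(column_conservation(alignment)))
```

In this picture, primers would be placed in runs of fully conserved columns, while the variable columns in between carry the signal used to tell taxa apart.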

Evidence for horizontal gene transfer: the exchange of genetic material between two genomes without a parental relationship.

DNA sequencing is one of the most important platforms for the study of biological systems today (Ronaghi, 2001). A. Next-generation DNA sequencing: 454 Life Sciences (pyrosequencing); Solexa/Illumina; sequencing by ligation (SOLiD technology); Ion Torrent (PGM).

Sequence determination is most commonly performed using dideoxy chain termination technology, also known as Sanger sequencing, which was developed by Frederick Sanger and colleagues (Sanger et al., 1977). Pyrosequencing is a novel DNA sequencing technology and the first alternative to the conventional Sanger method for de novo DNA sequencing (Md. Fakruddin et al., 2012). Pyrosequencing has the potential advantages of accuracy, flexibility and parallel processing, and it can be easily automated (Md. Fakruddin et al., 2012).

Pyrosequencing is a DNA sequencing technique that relies on detection of pyrophosphate release upon nucleotide incorporation rather than on chain termination with dideoxynucleotides. In pyrosequencing (Nyren and Skarpnack, 2001), the sequencing primer is hybridized to a single-stranded, biotin-labeled DNA template and mixed with the enzymes DNA polymerase, ATP sulfurylase, luciferase and apyrase, and with the substrates adenosine 5′ phosphosulfate (APS) and luciferin (Gharizadeh et al., 2007). The four deoxynucleotide triphosphates (dNTPs) are added to the reaction mixture separately, one at a time, in repeated cycles. The cascade starts with a nucleic acid polymerization reaction in which inorganic pyrophosphate (PPi) is released as a result of nucleotide incorporation by the polymerase.

Each nucleotide incorporation event is followed by release of inorganic pyrophosphate (PPi) in a quantity equimolar to the amount of incorporated nucleotide. The released PPi is quantitatively converted to ATP by ATP sulfurylase in the presence of APS. The generated ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin, producing visible light in an amount proportional to the amount of ATP. The light from the luciferase-catalyzed reaction, with a wavelength maximum of 560 nm, is then detected by a photon detection device such as a charge-coupled device (CCD) camera or a photomultiplier. Apyrase is a nucleotide-degrading enzyme that continuously degrades ATP and non-incorporated dNTPs in the reaction mixture.
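The proportionality between light signal and incorporated nucleotides can be sketched as a toy simulation. The code below assumes an idealized, noise-free reaction and a fixed dispensation order of A, C, G, T; it takes the strand being synthesized as input and records, for each dispensed nucleotide, how many bases were incorporated (the idealized peak height). The function names are hypothetical, and real instruments must additionally deal with signal noise and long homopolymer runs.

```python
def simulate_pyrogram(synthesized, dispensation_order="ACGT"):
    """Return (nucleotide, signal) pairs for an idealized pyrosequencing run.

    `synthesized` is the strand built by the polymerase, 5' to 3'.
    The signal is the number of bases incorporated for that dispensation,
    standing in for the light intensity, which is proportional to the PPi released.
    """
    synthesized = synthesized.upper()
    unknown = set(synthesized) - set(dispensation_order)
    if unknown:
        raise ValueError(f"bases not in dispensation order: {sorted(unknown)}")
    signals = []
    pos = 0
    while pos < len(synthesized):
        for nt in dispensation_order:
            if pos >= len(synthesized):
                break
            run = 0
            # A homopolymer run is incorporated in one dispensation: taller peak.
            while pos < len(synthesized) and synthesized[pos] == nt:
                run += 1
                pos += 1
            signals.append((nt, run))
    return signals


def decode_pyrogram(signals):
    """Rebuild the sequence from (nucleotide, signal) pairs."""
    return "".join(nt * run for nt, run in signals)


if __name__ == "__main__":
    pyrogram = simulate_pyrogram("GGATTC")
    print(pyrogram)
    print(decode_pyrogram(pyrogram))
```

A dispensation with signal 0 corresponds to a nucleotide that was not incorporated and was simply degraded by apyrase before the next one was added.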

http://www.molecularecologist.com/next-gen-fieldguide-2014/ (Accessed Aug 17, 2015).

Binning is the process of grouping reads or contigs into individual genomes and assigning each group to a specific species, subspecies or genus. More innovative binning approaches include co-abundance gene segregation across a series of metagenomic samples, which facilitates the assembly of microbial genomes without the need for reference sequences. Important considerations when using any binning algorithm are the type of input data available and the existence of a suitable training dataset or reference genomes. Binning methods fall into two classes, depending on the information they use from a given DNA sequence: composition-based binning, and similarity- or homology-based binning.

Composition-based binning is based on the observation that individual genomes have a characteristic distribution of k-mer sequences, known as a genomic signature. By making use of this conserved, species-specific nucleotide composition (such as GC content), these methods are capable of grouping sequences into their respective genomes. Composition-based binning algorithms include PhyloPythia and its successor PhyloPythiaS, S-GSOM, PCAHIER, TACOA, TETRA, ESOM and ClaMS. Composition-based binning is not reliable for short reads, as they do not contain enough information.
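A minimal sketch of the genomic-signature idea follows: it computes GC content and a normalized tetranucleotide (4-mer) frequency profile for each sequence, then compares profiles with Euclidean distance. This is not the method of any tool listed above; the choice of k = 4, the distance measure and all function names are assumptions made for illustration, and the repeat sequences are invented.

```python
import math
from itertools import product


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_profile(seq, k=4):
    """Normalized k-mer frequency vector in a fixed (alphabetical) k-mer order."""
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skips k-mers containing N or other symbols
            counts[kmer] += 1
    total = sum(counts.values())
    return [c / total if total else 0.0 for c in counts.values()]


def profile_distance(p, q):
    """Euclidean distance between two k-mer profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    # Invented contigs: two AT-rich repeats and one GC-rich repeat.
    contig_a, contig_b, contig_c = "ATAT" * 5, "TATA" * 5, "GCGC" * 5
    pa, pb, pc = (kmer_profile(s) for s in (contig_a, contig_b, contig_c))
    print(profile_distance(pa, pb), profile_distance(pa, pc))
```

Contigs with similar profiles would be placed in the same bin. With short reads only a handful of k-mers are observed, so the profile is mostly noise, which is one way to see why composition-based methods are unreliable for short sequences.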

Similarity-based binning refers to the process of using alignment algorithms such as BLAST, or profile hidden Markov models (pHMMs), to obtain similarity information about specific sequences or genes from publicly available databases. Similarity-based binning algorithms include IMG/M, MG-RAST, MEGAN, CARMA, SOrt-ITEMS and MetaPhyler. Similarity-based binning fails to assign sequences accurately when the reads are short or when the metagenome under consideration consists of numerous closely related species.
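The idea of assigning a read to the reference it most resembles can be sketched without an aligner. The code below uses the fraction of a read's k-mers that also occur in each reference as a crude stand-in for an alignment score; it is not BLAST, a pHMM search, or the algorithm of any tool named above. The reference names, sequences, k-mer size and threshold are all hypothetical.

```python
def kmer_set(seq, k):
    """Set of all k-mers in the sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_read(read, references, k=8, min_shared=0.5):
    """Assign a read to the reference sharing the largest fraction of its k-mers.

    Returns the reference name, or None when no reference reaches `min_shared`
    (the read stays unassigned, as happens with no database match).
    """
    read_kmers = kmer_set(read.upper(), k)
    if not read_kmers:
        return None
    best_name, best_score = None, 0.0
    for name, ref_seq in references.items():
        shared = len(read_kmers & kmer_set(ref_seq.upper(), k)) / len(read_kmers)
        if shared > best_score:
            best_name, best_score = name, shared
    return best_name if best_score >= min_shared else None


if __name__ == "__main__":
    # Hypothetical reference database with two invented sequences.
    refs = {
        "taxon_A": "ACGTACGTTAGCTAGCTAGGATCCGATCGATTACG",
        "taxon_B": "TTTTGGGGCCCCAAAATTTTGGGGCCCCAAAA",
    }
    print(assign_read("CGTTAGCTAGCTAGGATCCG", refs))
    print(assign_read("AAAAAAAAAAAA", refs))
```

The sketch also hints at the limitations noted above: a short read yields few k-mers, and two closely related references would share most of them, so the best match becomes ambiguous.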

Annotation is the process of assigning functional, positional and species-of-origin information to the genes in a database. Annotation of metagenomes is specifically designed to work with mixtures of genomes and with contigs of varying length. Annotation is preceded by four preprocessing steps: (1) trimming of low-quality reads using platform-specific tools such as FASTX-Toolkit, SolexaQA and Lucy2, with thresholds that depend on the sequencing technology; (2) masking of low-complexity reads using tools such as DUST; (3) a de-replication step that removes sequences that are more than 95% identical; and (4) a screening step, in which the pipeline provides the option of removing reads that are near-exact matches to the genomes of a handful of model organisms, including fly, mouse, cow and human. These steps are followed by the identification of genes within the reads or assembled contigs, a process often denoted as “gene calling”.
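The first three preprocessing steps can be sketched in a few lines. This is a simplified illustration and not a reimplementation of FASTX-Toolkit, SolexaQA, Lucy2 or DUST: trimming here only removes low-quality bases from the 3' end, low-complexity reads are dropped by a crude single-base dominance test instead of being masked, and de-replication removes exact duplicates only, not sequences that are more than 95% identical. All thresholds and function names are assumptions.

```python
def trim_3prime(seq, quals, min_q=20):
    """Remove trailing bases whose quality score is below `min_q`."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def is_low_complexity(seq, max_fraction=0.8):
    """Crude complexity test: True if one base dominates the read."""
    if not seq:
        return True
    most_common = max(seq.count(base) for base in set(seq))
    return most_common / len(seq) > max_fraction


def dereplicate(seqs):
    """Drop exact duplicates while keeping the first occurrence order."""
    seen, unique = set(), []
    for s in seqs:
        if s not in seen:
            seen.add(s)
            unique.append(s)
    return unique


def preprocess(reads, min_q=20, min_len=5):
    """Trim, filter and de-replicate a list of (sequence, qualities) pairs."""
    kept = []
    for seq, quals in reads:
        trimmed, _ = trim_3prime(seq, quals, min_q)
        if len(trimmed) < min_len or is_low_complexity(trimmed):
            continue
        kept.append(trimmed)
    return dereplicate(kept)


if __name__ == "__main__":
    # Invented reads with Phred-style quality scores.
    reads = [
        ("ACGTACGT", [30, 30, 30, 30, 30, 30, 10, 5]),
        ("ACGTAC", [30] * 6),
        ("AAAAAAAA", [30] * 8),
    ]
    print(preprocess(reads))
```

The order of the steps matters: trimming first means that two reads differing only in a noisy tail collapse into one sequence during de-replication.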

Genes are labeled as coding DNA sequences (CDS) and are identified using a number of tools, including MetaGeneMark, MetaGene, Orphelia and FragGeneScan, all of which utilize ab initio gene prediction algorithms. Non-coding RNAs such as tRNAs are predicted using programs like tRNAscan. Ribosomal RNA (rRNA) genes (5S, 16S and 23S) are predicted using internally developed rRNA models in IMG/MER, whereas MG-RAST predicts rRNA by similarity searches against known databases. The annotation pipeline then involves functional assignment to the predicted protein-coding genes. This is currently achieved by homology-based searches of query sequences against databases containing known functional and/or taxonomic information. Both IMG/MER and MG-RAST are widely used data management repositories and comparative genomics environments. They are fully automated pipelines that provide quality control, gene prediction and functional annotation.
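To show what a gene caller has to find, the sketch below scans the three forward reading frames of a sequence for open reading frames that start with ATG and end at a stop codon. This is a deliberately naive stand-in: the ab initio tools named above use statistical models of coding sequence and handle both strands and partial genes, none of which is attempted here. The function name and the minimum length are assumptions.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, orf_sequence) for ORFs on the forward strand.

    `end` is exclusive and includes the stop codon. Only complete ORFs
    (start codon through stop codon) of at least `min_codons` codons are kept.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None
    return orfs


if __name__ == "__main__":
    # Invented sequence with one short ORF in the third reading frame.
    print(find_orfs("CCATGAAATTTTAGGG"))
```

Because metagenomic reads are short, many real genes run off one or both ends of a read, which is why dedicated metagenomic gene callers accept fragments that a complete-ORF scan like this one would miss.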

ADVANTAGES OF ASSEMBLING METAGENOMES
The advantages are: the possibility of analysing the genome context (i.e., operons); an increased probability of reconstructing complete genes and genomes, which raises the confidence of sequence annotation; and simpler analysis, because long contigs are mapped instead of short reads (Thomas et al., 2012; Luo et al., 2013; Segata et al., 2013).

NCBI is mandated to store all metagenomic data; however, the sheer volume of data being generated means there is an urgent need for appropriate ways of storing vast amounts of sequence. Tools such as IMG/MER, CAMERA, MG-RAST and EBI metagenomics (which also incorporates QIIME) provide an integrated environment for the analysis, management, storage and sharing of metagenome projects. The Genomic Standards Consortium (GSC) is currently investing heavily in a widely accepted language that shares ontologies and nomenclatures, thereby providing a common standard for the exchange of data derived from the analysis of metagenomic projects. A suite of standard languages for metadata is currently provided by the Minimum Information about any (x) Sequence (MIxS) checklists. MIxS is an umbrella term covering MIMS (Minimum Information about a Metagenome Sequence) and MIMARKS (Minimum Information about a MARKer Sequence), which together provide a scheme of standard languages for metadata annotation.

Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries, which are physical libraries that contain DNA cloned from environmental metagenomes. Functional metagenomics begins with the construction of a metagenomic library. Cosmid- or fosmid-based libraries are often preferred because of their large and consistent insert size and high cloning efficiency. The information obtained from functional metagenomics can help in the future annotation of gene function and serves as a complement to sequence-based metagenomics. This function-based approach allows the discovery of novel enzymes whose functions would not be predicted from DNA sequence alone.

Total metagenomic DNA is extracted from a microbial community sample, sheared and ligated into an expression vector, and is subsequently transformed into a suitable library host to create a metagenomic library. The library is then plated on media containing antibiotics inhibitory to the wild-type host to select for metagenomic fragments conferring antibiotic resistance. Metagenomic fragments present in colonies growing on the antibiotic selection media are then PCR amplified and sequenced using either traditional Sanger sequencing or next-generation sequencing methods. Finally, reads are assembled and annotated in order to identify the causative antibiotic resistance genes.

FUNCTIONAL METAGENOME ANALYSIS
Reconstruction of metabolic pathways from enzyme-coding genes is a relevant matter in metagenome analysis. There are two options for performing functional annotation from shotgun sequences: one uses the sequencing reads directly, and the other uses assembled reads.

GENE PREDICTION
Gene prediction determines which metagenomic reads contain coding sequences. Metagenome assembly, gene prediction and annotation follow a framework similar to that used in whole genome characterization (Yandell and Ence, 2012; Richardson and Watson, 2013). Gene prediction can be done in three ways: gene fragment recruitment, protein family classification, and de novo gene prediction.

METABOLIC PATHWAY RECONSTRUCTION
Pathway reconstruction from metagenome data is one of the goals of annotation, and the terms “inter-organismic meta-routes” or “meta-pathways” have been proposed for this kind of analysis (De Filippo et al., 2012). In microbial ecology, the concept of a metabolic pathway should be understood as the flow of information through different species. Functional annotation has to be used to place each gene in an appropriate metabolic context, fill in missing enzymes in pathways and find optimal metabolic states in order to perform the best pathway reconstructions. Available programs include MinPath (Ye and Doak, 2009) and MetaPath (Liu and Pop, 2010). Both use information deposited in the KEGG (Ogata et al., 1999) and MetaCyc (Caspi et al., 2014) repositories. Metabolic pathway reconstruction can be completed with information provided by the data context, such as gene function interactions, synteny and copy number of annotated genes, to integrate the metabolic potential of the consortium.
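One way to think about pathway reconstruction is as a parsimony problem: given the set of enzyme functions annotated in a sample, find a small set of pathways that accounts for all of them, instead of reporting every pathway that contains at least one observed enzyme. The sketch below uses a greedy set-cover heuristic to illustrate that idea. It is an assumption-laden toy and not the algorithm of MinPath or MetaPath; the pathway names, enzyme identifiers and function name are hypothetical, and no KEGG or MetaCyc data are used.

```python
def minimal_pathways(observed_functions, pathways):
    """Greedily pick pathways until every explainable observed function is covered.

    `pathways` maps a pathway name to the set of functions it contains.
    Functions that belong to no pathway are ignored. Ties are broken by
    alphabetical pathway name so the result is deterministic.
    """
    uncovered = {
        f for f in observed_functions
        if any(f in members for members in pathways.values())
    }
    chosen = []
    while uncovered:
        best = max(sorted(pathways), key=lambda name: len(pathways[name] & uncovered))
        chosen.append(best)
        uncovered -= pathways[best]
    return chosen


if __name__ == "__main__":
    # Hypothetical pathway definitions and observed enzyme annotations.
    pathway_db = {
        "pathway_1": {"E1", "E2", "E3"},
        "pathway_2": {"E3", "E4"},
        "pathway_3": {"E2"},
    }
    print(minimal_pathways({"E1", "E2", "E3"}, pathway_db))
```

In the example, a naive approach would report all three pathways because each contains an observed enzyme, whereas the parsimonious answer is the single pathway that explains every observation.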

METABOLIC PATHWAY RECONSTRUCTION

TOOLS USED IN METAGENOMICS

APPLICATION OF METAGENOMICS
Metagenomics has the potential to advance knowledge in a wide variety of fields. I. Medicine II. Engineering III. Agriculture IV. Ecology V. Biotechnology.

APPLICATION
Metagenomics can improve strategies for monitoring the impact of pollutants on ecosystems and for cleaning up contaminated environments. An increased understanding of these microbial communities raises the chances that bioaugmentation or biostimulation trials will succeed. Recent progress in mining the rich genetic resource of nonculturable microbes has led to the discovery of new genes, enzymes and natural products. The impact of metagenomics is witnessed in the development of commodity and fine chemicals, agrochemicals and pharmaceuticals, where the benefit of enzyme-catalyzed chiral synthesis is increasingly recognized. Metagenomic libraries are, indeed, an essential tool for the discovery of new enzymatic activities, facilitating genetic tracking for all biotechnological applications of interest for the future. Metagenomic sequencing is being used to characterize microbial communities as part of the human microbiome initiative, whose primary goals are to determine whether there is a core human microbiome, to understand the changes in the human microbiome that can be correlated with human health, and to develop new technological and bioinformatics tools to support these goals. It is well known that the vast majority of microbes have not been cultivated. Functional metagenomics strategies are being used to explore the interactions between plants and microbes through cultivation-independent study of the microbial communities.

New enzymes, antibiotics and other reagents can be identified. More exotic habitats can be intently studied. The field can only progress as library technology, including sequencing technology, progresses. Improved bioinformatics will quicken analysis for library profiling. Ancient DNA remnants can be investigated. Discoveries such as phylogenetic tags (rRNA genes, etc.) will give momentum to the growing field. Learning novel pathways will lead to knowledge about currently nonculturable bacteria, which can then be used to culture these systems.

LIMITATIONS
Too much data. Most genes are not identifiable. Contamination and chimeric clone sequences. Extraction problems. Requires proteomics or expression studies to demonstrate phenotypic characteristics. Needs a standard method for annotating genomes. Can only progress as library technology, including sequencing technology, progresses. Requires high-throughput instrumentation not readily available to most institutions.

CONCLUSION
Metagenomics has benefited in the past few years from many visionary investments in both financial and intellectual terms. The science of metagenomics is currently in its pioneering stages of development as a field, and many tools and technologies are undergoing rapid evolution. Metagenomics is best used as a tool to address fundamental questions of microbial ecology, evolution and diversity, and to derive and test new hypotheses. As datasets become increasingly complex and comprehensive, novel tools for analysis, storage and visualization will be required.

Metagenomics allows us to discover new genes and proteins or even the complete genomes of non-cultivable organisms in less time and with better accuracy than classical microbiology or molecular methods. In addition to the phenotypic dimension of human biology, such as gene expression profiling, proteomics, and metabolomics, perhaps we need to extend our concept of the human genome to include the more comprehensive and plastic human metagenome in laboratory medicine.

REFERENCES
Jo Handelsman (2004) Metagenomics: Application of Genomics to Uncultured Microorganisms. Microbiology and Molecular Biology Reviews, 68: 4, 669-685.
Alejandra Escobar-Zepeda, Arturo Vera-Ponce de León and Alejandro Sanchez-Flores (2015) The Road to Metagenomics: From Microbiology to DNA Sequencing Technologies and Bioinformatics. Frontiers in Genetics, 6, 348: 1-15.
Asiya Nazir (2016) Review on Metagenomics and its Applications. Imperial Journal of Interdisciplinary Research, 2: 3, 277-286.
Thomas J. Sharpton (2014) An introduction to the analysis of shotgun metagenomic data. Frontiers in Plant Science, 5, 209: 1-14.
Torsten Thomas, Jack Gilbert and Folker Meyer (2012) Metagenomics: a guide from sampling to data analysis. Microb Inform Exp, 2: 3, 1-15.
Anastasis Oulas et al. (2015) Metagenomics: Tools and Insights for Analyzing Next Generation Sequencing Data Derived from Biodiversity Studies. Bioinform Biol Insights, 9: 75–88.
Joseph F. Petrosino et al. (2009) Metagenomic Pyrosequencing and Microbial Identification. Clinical Chemistry, 55: 5, 856–866.

REFERENCES (CONT.)
Patake R. S. and Patake G. R. (2011) A Mini Review on Metagenomics and its Implications in Ecological and Environmental Biotechnology. Universal Journal of Environmental Research and Technology, 1: 1-6.
Wolfgang R. Streit and Ruth A. Schmitz (2004) Metagenomics – the key to the uncultured microbes. Current Opinion in Microbiology, 7: 492–498.
Jianping Xu (2006) Microbial ecology in the age of genomics and metagenomics: concepts, tools, and recent advances. Molecular Ecology, 15: 1713–1731.
R. D. Sleator, C. Shortall and C. Hill (2008) Metagenomics. Letters in Applied Microbiology, 47: 361–366.
Md. Fakruddin (2012) Pyrosequencing: principles and applications. International Journal of Life Science & Pharma Research, 2: 2, 65-76.
Ludmila Chistoserdova (2009) Functional Metagenomics: Recent Advances and Future Challenges. Biotechnology and Genetic Engineering Reviews, 26: 335-352.