A brief overview of gene sequencing with elaboration on gene sequencing technologies.
Size: 11.96 MB
Language: en
Added: Jan 05, 2020
Slides: 72 pages
Slide Content
GENE SEQUENCING Presented by: Group 1
What is gene sequencing? Sequencing means to determine the primary structure of an unbranched biopolymer. Gene sequencing is the technique that allows researchers to read the genetic information found in DNA. Sequencing involves determining the order of bases.
Importance of gene sequencing: Sequencing a gene is an important step toward understanding it. A genome sequence contains clues about where genes are located. Gene sequencing helps us understand how the genome as a whole works: how genes work together to direct the growth, development and maintenance of an entire organism. It also helps scientists study the parts of the genome outside the genes, such as regulatory regions.
History: The first genome sequenced was that of bacteriophage φX174, in 1977 (5,375 bp), by Fred Sanger. 1995: Haemophilus influenzae (1,830,137 bp) and Mycoplasma genitalium (580,000 bp). 1996: Saccharomyces cerevisiae (12,068,000 bp). 1997: Escherichia coli (4,639,221 bp). 1999: Human chromosome (53,000,000 bp). 2000: Drosophila melanogaster (180,000,000 bp). 2001: Human, working draft (3,200,000,000 bp).
2002: Plasmodium falciparum (23,000,000 bp), Anopheles gambiae (278,000,000 bp) and Mus musculus (2,500,000,000 bp). 2003: Human, finished sequence (3,200,000,000 bp). 2005: Oryza sativa (489,000,000 bp). 2006: Populus trichocarpa (485,000,000 bp).
Characteristics of an ideal nucleic acid sequencing technology: An ideal nucleic acid sequencing technology should: sequence natural DNA or RNA fragments of various sizes fully, with minimal or no error; generate no sequence-dependent or other biases, for optimal quantification and genotyping performance; require minimum quantities of input material, ideally a single cell; have high or complete efficiency in all of the sequencing steps; and detect modified nucleotides directly.
It should also operate in laboratories and low-resource settings without special environmental requirements such as temperature and humidity control, and be sufficiently affordable and economically viable to allow worldwide adoption and operation in research and clinical settings. With the introduction of second-generation sequencing technologies (SGSTs) and the subsequent rapid improvement in their capabilities, we are now closer to, but still far from, the ultimate goal of rapid, comprehensive and unbiased sequencing of DNA or RNA molecules.
GENE SEQUENCING TECHNOLOGIES
First Generation Sequencing
Sanger Sequencing Introduction: The Sanger method is based on in vitro DNA replication. The technique was developed by the British biochemist Frederick Sanger and his team in 1977. Sanger received the Nobel Prize in 1980 for the development of this sequencing method.
In Sanger dideoxy terminator sequencing, the sample DNA is used as a template for a DNA polymerase. Four polymerase reactions are carried out, each containing enzyme, primer and sample DNA, along with dNTPs. Each reaction also contains one of the four dideoxy NTPs (ddNTPs). When a ddNTP is incorporated, chain elongation terminates, because ddNTPs lack the 3' hydroxyl group needed to form the next phosphodiester bond. Since each reaction contains only one of the four bases as a ddNTP, each reaction yields fragments terminating at that base. The four reactions therefore produce four collections of fragments whose lengths reflect the sequence positions of each of the four respective bases.
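The chain-termination logic described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: `new_strand` stands for the strand the polymerase would synthesize, `dd_fraction` is a made-up probability that a ddNTP rather than a dNTP is incorporated, and sorting fragment lengths stands in for gel or capillary separation. The function names are hypothetical.

```python
import random

def chain_termination(new_strand, dd_base, rng, n_molecules=2000, dd_fraction=0.2):
    """Simulate one of the four reactions and return the fragment lengths seen.

    new_strand  : strand the polymerase would synthesize, written 5'->3'
    dd_base     : the single base supplied as a ddNTP in this reaction
    dd_fraction : assumed chance that a ddNTP (not a dNTP) is incorporated
    """
    lengths = set()
    for _ in range(n_molecules):
        for position, base in enumerate(new_strand, start=1):
            if base == dd_base and rng.random() < dd_fraction:
                lengths.add(position)  # chain ends here: no 3' hydroxyl group
                break
    return lengths

def read_sequence(fragments_by_base):
    """Order all fragments by length and read off the terminating base."""
    base_at_length = {}
    for base, lengths in fragments_by_base.items():
        for length in lengths:
            base_at_length[length] = base
    return "".join(base_at_length[n] for n in sorted(base_at_length))

def sanger_sequence(new_strand, seed=0):
    """Run the four reactions (A, C, G, T) and reconstruct the sequence."""
    rng = random.Random(seed)
    reactions = {b: chain_termination(new_strand, b, rng) for b in "ACGT"}
    return read_sequence(reactions)

print(sanger_sequence("ATGCGTAC"))
```

Each reaction only ever reports positions of its own base, so the sequence emerges from merging the four length ladders.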
Maxam-Gilbert Sequencing: The Maxam-Gilbert sequencing technique was developed by Allan Maxam and Walter Gilbert in 1976.
Second Generation Sequencing
Introduction: The era of first-generation sequencing, dominated by Sanger sequencing, came to an end with the introduction of the Roche/454 platform in 2005. It was followed by the Solexa/Illumina system in 2006, the Applied Biosystems SOLiD system in 2007, a human genome sequencing service in 2008 and the Ion Torrent system in 2010.
These SGSTs generate hundreds of thousands to billions of reads, each 25–800 nucleotides long, within days and at low cost compared to Sanger sequencing. Following a shotgun approach, several of these technologies can today re-sequence a human genome at 10–30× coverage for under $5,000 in reagent costs in 8–10 days. SGSTs have enjoyed a warm welcome from academic and industrial scientists.
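The coverage figure quoted above follows from simple arithmetic: mean coverage is the total number of sequenced bases divided by the genome size. The sketch below uses the 3,200,000,000 bp human genome size from the History slide; the 100-nucleotide read length is an assumed value within the 25–800 range, and the function names are illustrative.

```python
import math

def mean_coverage(n_reads, read_length, genome_size):
    """Average number of times each base of the genome is sequenced."""
    return n_reads * read_length / genome_size

def reads_needed(coverage, read_length, genome_size):
    """Smallest number of reads that reaches the requested mean coverage."""
    return math.ceil(coverage * genome_size / read_length)

HUMAN_GENOME_BP = 3_200_000_000   # size listed in the History slide
READ_LENGTH = 100                 # assumed read length in nucleotides

for target in (10, 30):
    n = reads_needed(target, READ_LENGTH, HUMAN_GENOME_BP)
    print(f"{target}x coverage needs about {n:,} reads of {READ_LENGTH} nt")
```

Mean coverage is only an average; individual regions can be covered more or less deeply than this number suggests.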
Applications of SGSTs: In drug discovery, SGSTs have primarily been applied to whole-genome sequencing, targeted (e.g. exome) sequencing, digital gene expression, DNA methylation sequencing and copy-number variation analysis. Their use has been expanding to chromatin immunoprecipitation sequencing (ChIP-Seq) and RNA immunoprecipitation sequencing (RIP-Seq).
Applications of SGSTs (continued): Further applications include whole-transcriptome analyses, single-cell analyses, nucleic acid structure determination, chromatin conformation (Hi-C) analysis and many others. SGSTs have also been adapted beyond nucleic acid sequencing, for instance to protein-DNA affinity measurements, which may find applications in drug discovery in the future. The details of each of these applications are beyond the scope of this review.
NGS versus Sanger sequencing. Sanger sequencing: Sample preparation is slow. It cannot be started directly from a gDNA or cDNA library. It is time-intensive and expensive. Next-generation sequencing (NGS): Sample preparation is faster and more straightforward. It can be started directly from a gDNA or cDNA library. More than one billion short reads can be produced in a single run. It is a fast and inexpensive way to obtain accurate genomic information.
Limitations of NGS: Although NGS is cheaper and faster than traditional Sanger sequencing, it is still too expensive for small labs or individuals. NGS data analysis is time-consuming and requires sufficient knowledge of bioinformatics to extract accurate information from the sequence data. NGS produces short reads, which makes highly repetitive sequences difficult to resolve. Short read length is one of the major shortcomings limiting its application, especially in de novo sequencing. Data processing is another major bottleneck for the implementation and exploitation of NGS technology.
Methodology of Next Generation Sequencing: Basic principle: The principle on which NGS works is similar to that of traditional Sanger sequencing methods involving capillary electrophoresis. 1. Template preparation: Double-stranded DNA is the starting material, but the source from which it is derived may vary: genomic DNA, immunoprecipitated DNA, or reverse-transcribed RNA (cDNA).
2. Library preparation: Sequencing library preparation involves the common steps of fragmentation, size selection and adapter ligation. Fragmentation and size selection break the DNA template into smaller fragments of a length the chosen platform can sequence. Adapter ligation then adds platform-specific synthetic DNA to the ends of the fragments in the library to facilitate the sequencing reactions.
3. Library amplification: The library is amplified to produce a signal strong enough to detect nucleotide addition. Each DNA fragment is attached either to a microbead or to a glass slide and is then amplified by PCR-based techniques.
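The library preparation step above (fragmentation, size selection, adapter ligation) can be sketched in a few lines. This is a toy model, not any platform's protocol: cuts are placed uniformly at random, the size window is arbitrary, and the adapter sequences are made-up placeholders rather than real platform adapters.

```python
import random

def fragment(dna, rng, mean_len=200):
    """Cut the template at random positions into pieces of about mean_len."""
    n_cuts = max(len(dna) // mean_len - 1, 0)
    cuts = sorted(rng.sample(range(1, len(dna)), n_cuts))
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]

def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length fits the platform's window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]

def ligate_adapters(fragments, adapter_5, adapter_3):
    """Add synthetic adapter sequences to both ends of every fragment."""
    return [adapter_5 + f + adapter_3 for f in fragments]

rng = random.Random(42)
template = "".join(rng.choice("ACGT") for _ in range(2000))  # stand-in template

pieces = fragment(template, rng)
selected = size_select(pieces, min_len=100, max_len=400)     # assumed window
library = ligate_adapters(selected, "AAAACCCC", "GGGGTTTT")  # placeholder adapters

print(len(pieces), "fragments,", len(library), "kept for the library")
```

Size selection discards material, which is one reason real protocols need enough input DNA to begin with.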
Overview of NGS
Major platforms in NGS. Illumina Sequencing: One of the major platforms used in NGS. This technology typically produces reads of 100-150 bp. Comparatively longer fragments from the template library are used for adapter ligation.
Ion Torrent PGM (Personal Genome Machine): This sequencing platform does not use optical signals. Instead, it exploits the fact that the addition of a dNTP to a growing DNA strand releases an H+ ion. Template DNA or RNA is fragmented into pieces of ~200 bp, which are amplified by emulsion PCR. The H+ ions released when dNTPs are added to the DNA strand lower the pH. The pH changes are detected and recorded for each well, which makes it possible to determine the type of base and how many bases were incorporated in that well.
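The idea of reading bases from pH changes can be sketched as follows. In this simplified model one nucleotide type is supplied per flow, and the signal is just the integer number of bases incorporated (standing in for the measured pH drop, which grows with the number of H+ ions released). The flow order `TACG` and the function names are assumptions made for illustration.

```python
def flow_signals(new_strand, flow_order="TACG", max_flows=400):
    """Return (base, incorporations) for each nucleotide flow over one well.

    new_strand is the strand being synthesized, written 5'->3'.
    A homopolymer run is incorporated within a single flow, so its
    signal is proportionally larger; a non-matching flow gives 0.
    """
    signals = []
    position = 0
    flow = 0
    while position < len(new_strand) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        count = 0
        while position < len(new_strand) and new_strand[position] == base:
            count += 1          # one H+ released per incorporated base
            position += 1
        signals.append((base, count))
        flow += 1
    return signals

def call_bases(signals):
    """Translate per-flow signals back into a base sequence."""
    return "".join(base * count for base, count in signals)

signals = flow_signals("TTACGGGA")
print(signals)
print(call_bases(signals))
```

Because a run of identical bases produces one larger signal rather than several separate ones, estimating the exact length of long homopolymers from a noisy signal is the hard part in practice.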
Pyrosequencing
Methodology Of Pyrosequencing
Applications Of Pyrosequencing:
Whole Genome Sequencing
What is Whole Genome Sequencing? The NCI defines whole-genome sequencing in humans as “a laboratory process that is used to determine nearly all of the approximately 3 billion nucleotides of an individual’s complete DNA sequence, including non-coding sequence.”
Whole-genome sequencing was originally performed for the human genome using Sanger sequencing; it took more than a decade and cost more than $1 billion. Today, we use newer technology referred to as “next-generation sequencing”, “massively parallel sequencing” or “high-throughput sequencing.” These techniques can sequence both DNA and RNA faster and more cheaply than traditional Sanger sequencing and typically take a few days to perform, at a cost of around $1,000.
Whole Genome Sequencing approaches. Hierarchical shotgun approach: Overlapping regions between BAC clones are identified by restriction mapping or STS analysis. Whole-genome shotgun approach: DNA is cut randomly into smaller fragments, which are cloned and then sequenced.
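In the shotgun approach the sequenced fragments are put back together by finding overlaps between reads. The sketch below shows one simple way to do this, a greedy merge of the best-overlapping pair; it is a toy illustration with hypothetical function names, not the algorithm used by any particular assembler, and it assumes error-free reads from one strand.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a matching a prefix of b, or 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    # Drop reads that are fully contained in another read.
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

reads = ["GAAGTCA", "ATGCGTAC", "TACCTGAA"]  # overlapping toy reads
print(greedy_assemble(reads))
```

Greedy merging can go wrong when a repeat is longer than the reads, which is the assembly difficulty that motivates longer reads.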
Advantages: One lab method for all bacteria and all typing needs. Many different analyses are possible, so different approaches can be used depending on the organism and the need. Much more economical and faster. Labor-saving, because the sequencing reactions are virtually fully automated and the sequences are assembled by computer programs.
Third Generation Sequencing
NEED FOR THIRD GENERATION SEQUENCING: Genomes are very complex, with many repetitive regions that second-generation sequencing (SGS) technologies cannot resolve, and the relatively short reads make genome assembly more difficult. In second-generation DNA sequencers, the template DNA must be amplified before the sequencing reaction, and errors may occur during amplification.
Third-generation DNA sequencers use a single DNA molecule as the template, without needing to amplify the DNA. Third-generation sequencing (TGS) technologies can offer low sequencing cost and easy sample preparation, with run times significantly shorter than those of SGS technologies. In addition, TGS technologies produce long reads exceeding several kilobases, which helps solve the assembly problem and resolve the repetitive regions of complex genomes.
Two Widely Used Platforms In TGS: Pacific Biosciences single-molecule real-time (SMRT) sequencing (2010) and Oxford Nanopore Technologies MinION (2014).
Pacific Biosciences single-molecule real-time (SMRT) sequencing: Pacific Biosciences uses the same fluorescent labeling as the other technologies, but it detects the signals in real time, as they are emitted when nucleotides are incorporated. It uses a structure composed of many SMRT cells. Each cell contains microfabricated nanostructures called zero-mode waveguides (ZMWs), which are wells tens of nanometers in diameter, fabricated in a metal film that is in turn deposited on a glass substrate. ZMWs exploit the fact that light cannot propagate through an opening whose diameter is smaller than its wavelength.
Because of the small diameter, the light intensity decays along the wells and only the bottom of each well is illuminated. Each ZMW contains a DNA polymerase attached to its bottom, together with the target DNA fragment to be sequenced. During the sequencing reaction, the DNA polymerase incorporates fluorescently labeled nucleotides (a different color for each base) along the DNA fragment. Whenever a nucleotide is incorporated, it emits a light signal that is recorded by sensors. Detecting the labeled nucleotides makes it possible to determine the DNA sequence.
Oxford Nanopore Technologies MinION
In this sequencing technology, the first strand of a DNA molecule is linked by a hairpin to its complementary strand. The DNA fragment is passed through a protein nanopore (a nanopore is a nanoscale hole made of proteins or synthetic materials). As the DNA fragment is translocated through the pore by a motor protein attached to the pore, it causes variations in an ionic current that depend on the nucleotides occupying the pore. This variation in ionic current is recorded progressively on a graphic model and then interpreted to identify the sequence. Sequencing starts on the direct strand, generating the "template read"; the hairpin structure is then read, followed by the inverse strand, generating the "complement read". Each of these reads is called "1D". If the "template" and "complement" reads are combined, the resulting consensus sequence is called a "two-direction read" or "2D".
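Combining a template read and a complement read into a consensus can be sketched as below. This is a strongly simplified illustration, not how nanopore basecalling software works: it assumes the two reads have equal length with substitution errors only (no insertions or deletions), and the per-base quality values are hypothetical inputs used to break disagreements.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def consensus_2d(template_read, template_qual, complement_read, complement_qual):
    """Combine two 1D reads of the same molecule into one consensus.

    The complement read comes from the opposite strand, so it is first
    converted to the template orientation. Where the reads disagree,
    the base with the higher (assumed) quality value is kept.
    """
    comp_as_template = reverse_complement(complement_read)
    comp_qual = complement_qual[::-1]
    if len(comp_as_template) != len(template_read):
        raise ValueError("this sketch only handles reads of equal length")
    consensus = []
    for t, tq, c, cq in zip(template_read, template_qual, comp_as_template, comp_qual):
        consensus.append(t if t == c or tq >= cq else c)
    return "".join(consensus)

# The template read carries a low-quality error at its fourth base.
template_read = "GATCACA"
template_qual = [30, 30, 30, 5, 30, 30, 30]
complement_read = "TGTAATC"
complement_qual = [20] * 7

print(consensus_2d(template_read, template_qual, complement_read, complement_qual))
```

Reading both strands gives two independent observations of every position, which is why the combined read can be more accurate than either 1D read alone.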
Applications of gene sequencing
DNA Forensics: DNA sequencing is applied in forensic science to identify a particular individual, because every individual has a unique DNA sequence. It is used in particular to identify criminals from evidence found at the crime scene in the form of hair, nail, skin or blood samples.
Agriculture: DNA sequencing has played a vital role in the field of agriculture. The mapping and sequencing of whole genomes of microorganisms has allowed agriculturists to make them useful for crops and food plants.
Molecular biology: Sequencing is used in molecular biology to study genomes and the proteins they encode. Information obtained by sequencing allows researchers to identify changes in genes, their association with diseases and phenotypes, and potential drug targets. Evolutionary biology: Because DNA is the informative macromolecule transmitted from one generation to the next, DNA sequencing is used in evolutionary biology to study how different organisms are related and how they evolved.
Metagenomics: The study of genetic material recovered directly from environmental samples. The field of metagenomics involves the identification of organisms present in a body of water, sewage, dirt, debris filtered from the air, or swab samples from organisms. Example: sequencing enables researchers to determine which types of microbes may be present in a microbiome.
Medicine: In medical research, DNA sequencing can be used to detect genes associated with hereditary diseases. Scientists use different genetic engineering techniques to identify defective genes and replace them with healthy ones. Cancer: With the help of comparative DNA sequence studies, mutations can be detected.
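The comparison of a sample sequence against a reference can be sketched as a position-by-position check. This minimal example assumes the two sequences are already aligned and of equal length, so it reports substitutions only; the sequences and the function name are made up for illustration.

```python
def find_substitutions(reference, sample):
    """List (position, reference_base, sample_base) for every mismatch.

    Positions are 1-based. Both sequences must already be aligned and
    have the same length; insertions and deletions are not handled.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]

reference = "ATGGCCATT"  # made-up reference sequence
sample = "ATGGTCATT"     # made-up sample with one substitution

for position, ref_base, sample_base in find_substitutions(reference, sample):
    print(f"position {position}: {ref_base} -> {sample_base}")
```

Real variant detection works on many overlapping reads per position, so that a true mutation can be told apart from a sequencing error.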
Epigenetics: The study of changes in organisms caused by modification of gene expression rather than alteration of the genetic code itself. Lifestyle, nutrition and environmental factors can lead to epigenetic changes. The mechanisms that produce these changes include DNA methylation and histone modification.
References: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4727787/ ; https://www.omicsonline.org/open-access/generations-of-sequencing-technologies-from-first-to-next-generation-0974-8369-1000395.php?aid=87862 ; http://genome.nig.ac.jp/english/glossary_e/third-generation-and-subsequent-types-of-dna-sequencers.html ; http://blogs.nature.com/naturejobs/2017/10/16/techblog-the-nanopore-toolbox/ ; Allison LA, 2007. Fundamental Molecular Biology. Blackwell Publishing, USA ; https://arxiv.org/pdf/1606.05254.pdf/