DNA sequencing is the general laboratory technique for determining the exact sequence of nucleotides, or bases, in a DNA molecule.
merzifarooq
27 views
48 slides
Jul 02, 2024
Size: 1.13 MB
Language: en
Added: Jul 02, 2024
Slides: 48 pages
Slide Content
DNA Sequencing
By: Abdur Rehman, Lecturer, IPMS, KMU
Introduction: Probably the most important technique available to the molecular biologist is DNA sequencing, by which the precise order of nucleotides in a piece of DNA can be determined. DNA sequencing methods have been around for 50 years, and rapid and efficient sequencing has been possible since the mid-1970s. The techniques in use today can be divided into two categories:
1. The chain-termination method, which was devised by Fred Sanger and colleagues in the mid-1970s and has been extensively used ever since.
2. Next-generation sequencing, which is a collection of different methods, all of which involve a massively parallel strategy that enables millions of sequences to be generated at the same time.
Chain-termination DNA sequencing: This method is based on the principle that single-stranded DNA molecules that differ in length by just a single nucleotide can be separated from one another by polyacrylamide gel electrophoresis. If the electrophoresis is carried out in a capillary system, it is possible to resolve a family of molecules representing all lengths up to 1500 nucleotides, with the single-stranded molecules emerging one after another from the end of the capillary (Figure 10.1).
Chain-termination sequencing is carried out with a DNA polymerase, which makes copies of the DNA molecule being sequenced. To start the sequencing experiment, a short oligonucleotide is annealed to the template DNA. This oligonucleotide then acts as the primer for synthesis of a new DNA strand that is complementary to the template (Figure 10.2a).
The strand synthesis reaction, which requires the four deoxyribonucleotide triphosphates (dNTPs: dATP, dCTP, dGTP, and dTTP) as substrates, would normally continue until several thousand nucleotides had been polymerized. This does not occur in a chain-termination sequencing experiment because, as well as the four deoxynucleotides, a small amount of each of the four dideoxynucleotides (ddNTPs: ddATP, ddCTP, ddGTP, and ddTTP) is added to the reaction. Each of these dideoxynucleotides is labelled with a different fluorescent marker.
The polymerase enzyme does not discriminate between deoxy- and dideoxynucleotides, but once incorporated, a dideoxynucleotide blocks further elongation because it lacks the 3′-hydroxyl group needed to form a connection with the next nucleotide.
Because the normal deoxynucleotides are also present, and in larger amounts than the dideoxynucleotides, strand synthesis does not always terminate close to the primer, and several hundred nucleotides may be polymerized before a dideoxynucleotide is eventually incorporated. The result is a set of new molecules, all of different lengths, each ending in a dideoxynucleotide whose identity indicates the nucleotide (A, C, G, or T) that is present at the equivalent position in the template DNA (Figure 10.2c).
To work out the DNA sequence, all that we have to do is identify the dideoxynucleotide at the end of each chain-terminated molecule. This is where the polyacrylamide gel comes into play. The mixture is loaded into the capillary gel and electrophoresis is carried out to separate the molecules according to their lengths. After separation, the molecules are run past a fluorescence detector capable of discriminating the labels attached to the dideoxynucleotides.
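The read-out described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names, the per-position termination probability `ddntp_fraction`, the number of simulated molecules, and the example template are all hypothetical and are not taken from the slides. The template is written 3′→5′ so that the new strand can be built left to right.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def chain_terminated_fragments(template_3_to_5, ddntp_fraction=0.05,
                               n_molecules=20000, seed=0):
    """Simulate new strands that stop when a dideoxynucleotide is incorporated.

    ddntp_fraction is an assumed probability that the nucleotide added at any
    position is a ddNTP rather than a dNTP.
    """
    rng = random.Random(seed)
    full_strand = "".join(COMPLEMENT[base] for base in template_3_to_5)
    fragments = set()
    for _ in range(n_molecules):
        for length, base in enumerate(full_strand, start=1):
            if rng.random() < ddntp_fraction:
                # Termination: keep the fragment length and its terminal label.
                fragments.add((length, base))
                break
        # Molecules that reach the end without a ddNTP carry no label
        # and are ignored in this sketch.
    return fragments


def read_sequence(fragments):
    """Order fragments by length (the electrophoresis step) and read the labels."""
    return "".join(label for _, label in sorted(fragments))


template_3_to_5 = "TACGGATCCA"   # hypothetical template
fragments = chain_terminated_fragments(template_3_to_5)
new_strand = read_sequence(fragments)
print(new_strand)
```

Sorting by length plays the role of the capillary gel, and reading the last base of each fragment plays the role of the fluorescence detector. The sequence obtained is that of the newly synthesized strand, which is complementary to the template.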
Not all DNA polymerases can be used for sequencing: Many DNA polymerases have a mixed enzymatic activity, being able to degrade as well as synthesize DNA. Degradation can occur in either the 5′→3′ or 3′→5′ direction (Figure 10.4), but it is the latter that is detrimental to chain-termination sequencing. This is because the 3′→5′ activity prevents chain termination from occurring, by removing the dideoxynucleotide immediately after it has been added at the 3′ end of the strand that is being synthesized.
In the original method for chain-termination sequencing, the Klenow polymerase was used as the sequencing enzyme. The version of the enzyme used for sequencing had been further modified by mutation to remove its 3′→5′ exonuclease activity. However, the Klenow polymerase has low processivity, which means that it can synthesize only a relatively short DNA strand before it spontaneously dissociates from the template. This limits the length of sequence that can be obtained from a single experiment to about 250 bp. To avoid this problem, most sequencing today makes use of the Taq DNA polymerase, which has high processivity and no 3′→5′ exonuclease activity and so is ideal for chain-termination sequencing, enabling sequences of 750 bp and longer to be obtained in a single experiment.
Chain-termination sequencing with Taq polymerase: The chain-termination method that uses Taq polymerase is called thermal cycle sequencing. It is carried out in a similar way to PCR, but just one primer is used and the reaction mixture includes the four dideoxynucleotides (Figure 10.5). Because there is only one primer, only one of the strands of the starting molecule is copied, and the product accumulates in a linear fashion, not exponentially as is the case in a real PCR. The presence of the dideoxynucleotides in the reaction mixture causes chain termination, and the family of resulting strands is then separated by polyacrylamide gel electrophoresis.
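The difference between linear and exponential accumulation can be made concrete with a short calculation. This is an idealized sketch that assumes every template strand is copied once per cycle with perfect efficiency; the function names are hypothetical.

```python
def thermal_cycle_products(cycles, templates=1):
    """One primer: each cycle copies only the original template strands,
    so new strands accumulate linearly."""
    return templates * cycles


def pcr_molecules(cycles, templates=1):
    """Two primers: in an idealized PCR the number of molecules doubles
    every cycle, so accumulation is exponential."""
    return templates * 2 ** cycles


for cycles in (1, 10, 25):
    print(cycles, thermal_cycle_products(cycles), pcr_molecules(cycles))
```

After 25 cycles the single-primer reaction has produced 25 new strands per template, whereas an idealized PCR would have produced tens of millions of molecules. This is why thermal cycle sequencing needs a reasonable amount of starting template.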
If a PCR product is being sequenced, then one of the primers from the original PCR can be used in the sequencing reaction. If two reactions are carried out, one with each of the two PCR primers, then forward and reverse sequences are obtained (Figure 10.6a). This is an advantage if the PCR product is more than 750 bp and hence too long to be sequenced completely in one experiment. Alternatively, it is possible to extend the sequence in one direction by synthesizing a new primer, designed to anneal at a position within the PCR product (Figure 10.6b).
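Combining a forward and a reverse read can be sketched as follows. The slides do not describe how the two reads are joined, so this is an assumed, simplified approach: the reverse read is converted to its reverse complement and joined to the forward read at the longest exact overlap. The function names, the `min_overlap` parameter, and the example sequence are hypothetical, and real reads would need error-tolerant alignment.

```python
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT_TABLE)[::-1]


def merge_reads(forward, reverse, min_overlap=5):
    """Join a forward read and a reverse read that overlap in the middle
    of the PCR product. Returns None if no exact overlap is found."""
    reverse_as_forward = reverse_complement(reverse)
    longest = min(len(forward), len(reverse_as_forward))
    for overlap in range(longest, min_overlap - 1, -1):
        if forward[-overlap:] == reverse_as_forward[:overlap]:
            return forward + reverse_as_forward[overlap:]
    return None


product = "ATGCCTAGGTTACGATCGGA"               # hypothetical PCR product
forward_read = product[:14]                    # read from the forward primer
reverse_read = reverse_complement(product[8:]) # read from the reverse primer
print(merge_reads(forward_read, reverse_read))
```

Neither read covers the whole product on its own, but the two together reconstruct it.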
If cloned DNA is being sequenced, then a universal primer can be used. This is a primer that is complementary to the part of the vector DNA immediately adjacent to the point into which new DNA is ligated (Figure 10.7). The 3′ end of the primer points toward the inserted DNA, so the sequence that is obtained starts with a short stretch of the vector and then progresses into the cloned DNA fragment.
Limitations of chain-termination sequencing: With the most up-to-date versions of the chain-termination method, it is possible to obtain over 750 bp of sequence per experiment. Chain-termination sequencing is therefore the method of choice for sequencing genes and other DNA fragments obtained by cloning or PCR. The chain-termination method was also used to obtain the first complete genome sequences. This was a much more challenging task, because even the smallest bacterial genomes are over 1 Mb in length, and the human genome, which initially was sequenced by the chain-termination method, is 3200 Mb (Table 10.1).
Furthermore, as no sequencing method is entirely accurate, it is necessary to sequence each region of a genome multiple times in order to identify errors present in individual sequence reads (Figure 10.8). With the chain-termination method, at least a fivefold sequence depth, or coverage, is required to ensure that errors are identified, which means that every nucleotide is present in five different reads.
So, for the human genome, a total of 5 × 3200 = 16 000 Mb of sequence would be required. This is equivalent to over 21 million chain-termination sequences averaging 750 bp in length. When the human genome was sequenced during the 1990s and early 2000s, chain-termination sequencing was the only method available, so this challenge had to be met. In fact it was exceeded, as the Human Genome Project had generated 23 147 Mb of sequence by the time that the first draft of the genome was published in 2001. This capacity was achieved by automated sequencing machines, capable of generating 384 sequences in parallel in a single run, each run taking about one hour, corresponding to an output of almost 7 Mb per day under optimal working conditions.
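The arithmetic behind these figures can be checked with a few lines of code. The genome size, coverage, read length, and reads per run come from the text; the assumption of 24 one-hour runs per day and the derived number of machine-days are illustrative only.

```python
GENOME_MB = 3200        # human genome size, from the text
COVERAGE = 5            # minimum fivefold depth, from the text
READ_LENGTH_BP = 750    # average chain-termination read, from the text
READS_PER_RUN = 384     # parallel sequences per run, from the text
RUNS_PER_DAY = 24       # assumption: one-hour runs around the clock

total_mb = COVERAGE * GENOME_MB
reads_needed = total_mb * 1_000_000 / READ_LENGTH_BP
mb_per_machine_day = READS_PER_RUN * READ_LENGTH_BP * RUNS_PER_DAY / 1_000_000
machine_days = total_mb / mb_per_machine_day

print(f"Sequence required: {total_mb} Mb")
print(f"Reads required: {reads_needed / 1e6:.1f} million")
print(f"Output per machine: {mb_per_machine_day:.3f} Mb per day")
print(f"Machine-days at optimal output: {machine_days:.0f}")
```

Under these assumptions a single machine would need several years of continuous operation to reach fivefold coverage, which helps explain why the project depended on many machines working in parallel.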
In order for personalized medicine to become a reality, cost-effective and rapid ways of sequencing individual genomes are needed. It was this requirement that stimulated the development of next-generation sequencing methods, aimed at generating increasingly large amounts of sequence data not only more quickly but also at less cost.
Next-generation sequencing: With a next-generation method, a large library made up of thousands or millions of DNA fragments is sequenced in a single experiment (Figure 10.9). Often, these fragments represent an entire genome, and in most projects the sequencing experiment is preceded only by DNA extraction.
Preparation of a next-generation sequencing library: Next-generation sequencing requires the preparation of a library of DNA fragments that have been immobilized on a solid support in such a way that the individual sequencing reactions can be carried out side by side in an array format (Figure 10.10). There are three steps in the preparation of this library:
1. Breakage of the starting DNA into fragments of sizes that are suitable for the sequencing method being used.
2. Immobilization of the fragments on to the solid support.
3. Amplification of the immobilized fragments.
DNA fragmentation: Library preparation begins with purification of the DNA to be sequenced. The standard fragmentation method is sonication, in which ultrasound is used to cause breaks in the DNA molecules (Figure 10.11). The sonicated DNA is then fractionated by agarose gel electrophoresis, and fragments of the desired size are purified from the gel. Sonication is the preferred fragmentation method because it causes breaks at random positions in DNA molecules. If a restriction endonuclease were used to break up the DNA, some fragments might be too long to be sequenced entirely from their ends. With next-generation methods it is not possible to direct the sequencing towards the middle of a fragment, as can be done by designing an internal primer for the chain-termination method.
Immobilization of the library: The DNA fragments that make up the library cannot be directly immobilized on to a solid support. First, adaptors must be attached to the ends of the fragments. The exact role of these adaptors in library immobilization depends on the particular sequencing method being used. Two examples are as follows:
In the first method, the solid support is a glass slide that has been coated with many copies of a short oligonucleotide (Figure 10.12). The sequence of this oligonucleotide is complementary to the ends of the adaptors. The fragments are denatured into single-stranded DNA, and the single-stranded molecules then attach to the glass slide by base pairing between their adaptor sequences and the immobilized oligonucleotides.
In the second method, the solid support is provided by small metallic beads that are coated with streptavidin. One of the adaptors has a biotin label attached to its 5′ end. DNA fragments therefore become attached to the beads by biotin–streptavidin linkages (Figure 10.13a). The ratio of DNA fragments to beads is set so that, on average, just one fragment becomes attached to each bead. The beads are then shaken in an oil–water mixture, with the conditions again carefully controlled so that there is just one bead in each aqueous droplet within the emulsion (Figure 10.13b). These aqueous droplets are transferred into wells on a plastic strip, one droplet per well, with thousands of wells in total.
Amplification of the library: The final step in library preparation is amplification of the immobilized fragments by PCR, to produce a sufficient number of identical copies to be sequenced. The adaptors now play a second role, as they provide the annealing sites for the primers for this PCR. With the glass slide method, the PCR products become attached to oligonucleotides adjacent to the template fragment, with each template being converted into a cluster of identical fragments (Figure 10.14a). If metallic beads are used, the PCR is carried out in the oil emulsion, with the products of each PCR being retained within their own water droplet, prior to deposition of the droplets into their wells (Figure 10.14b).
Next-generation sequencing methods: Once the fragment library has been immobilized on to its solid support, the actual sequencing phase of the next-generation project can begin. Two methods are described here:
1. Reversible terminator sequencing, which is usually performed with fragment clusters on a slide support.
2. Pyrosequencing, which is usually carried out with fragments immobilized on metallic beads.
Reversible terminator sequencing: This method, like chain-termination sequencing, makes use of modified nucleotides that block further strand synthesis once one has been incorporated at the end of the growing polynucleotide. There is, however, a critical difference. As the name ‘reversible terminator’ implies, the termination step is reversible because the chemical group that has been attached at the 3′ position of the terminator nucleotide can be removed, converting the position to a 3′-hydroxyl group, which allows further extension to occur (Figure 10.15a). The terminator nucleotides are labelled, each with a different fluorophore, so the nucleotide that is added at each step can be identified. Once the nucleotide has been identified, the 3′ blocking group and the fluorophore are detached, leaving a standard nucleotide (Figure 10.15b).
The next step in the strand synthesis reaction can now take place, enabling the next nucleotide in the sequence to be identified. The reversible terminator method with libraries immobilized on a slide support generates relatively short sequence reads, with a maximum length of 300 bp, but is so massively parallel that up to 2000 Mb of sequence can be obtained per run. This technology is usually referred to as Illumina sequencing.
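The cycle of incorporation, detection, and unblocking can be sketched in code. This is a conceptual model only: the colour assigned to each base, the function name, and the example template are hypothetical, and a real instrument images millions of clusters at once rather than a single template.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Illustrative colours only; the slides do not say which fluorophore labels which base.
FLUOROPHORE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_COLOUR = {colour: base for base, colour in FLUOROPHORE.items()}


def reversible_terminator_read(template_3_to_5, max_cycles=300):
    """Read one base per cycle: incorporate, detect, then unblock."""
    read = []
    for template_base in template_3_to_5[:max_cycles]:
        # Incorporation: only one blocked, labelled nucleotide can be added.
        incoming = COMPLEMENT[template_base]
        # Detection: the fluorophore colour identifies the added nucleotide.
        colour = FLUOROPHORE[incoming]
        read.append(BASE_FOR_COLOUR[colour])
        # Unblocking: the 3' blocking group and fluorophore are removed,
        # which is what allows the next loop iteration to extend the strand.
    return "".join(read)


print(reversible_terminator_read("TACGGATCCA"))
```

The `max_cycles` limit mirrors the read-length ceiling: the read can never be longer than the number of cycles performed.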
Pyrosequencing: This method takes a radically different approach. The individual nucleotides are not labelled; instead, the sequence is read by detecting flashes of chemiluminescence. The chemiluminescence is generated by the enzyme sulphurylase from the molecule of pyrophosphate released each time DNA polymerase adds a deoxynucleotide to the 3′ end of the growing strand. If all four deoxynucleotides were added at once, flashes of light would be seen continually and no useful sequence information would be obtained. Consequently, each deoxynucleotide is added separately, one after the other, with a nucleotidase enzyme also present in the reaction mixture. In this way, if a deoxynucleotide is not incorporated into the polynucleotide, it is rapidly degraded before the next one is added (Figure 10.16).
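The one-nucleotide-at-a-time logic can be illustrated with a small model. The sketch makes two assumptions that are not stated in the slides: nucleotides are presented in a fixed repeating order (`flow_order`), and the light signal is proportional to the number of identical nucleotides incorporated in a single addition. The function names and the example template are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrosequence(template_3_to_5, flow_order="TACG", max_flows=200):
    """Return a list of (nucleotide, signal) pairs, one per nucleotide addition."""
    new_strand = "".join(COMPLEMENT[base] for base in template_3_to_5)
    position = 0
    flowgram = []
    for flow in range(max_flows):
        if position >= len(new_strand):
            break
        nucleotide = flow_order[flow % len(flow_order)]
        signal = 0
        # The polymerase adds this nucleotide for as long as it is the next
        # one required; each addition releases one pyrophosphate molecule.
        while position < len(new_strand) and new_strand[position] == nucleotide:
            signal += 1
            position += 1
        # A signal of 0 means no incorporation: the nucleotide is degraded
        # by the nucleotidase before the next one is presented.
        flowgram.append((nucleotide, signal))
    return flowgram


def call_bases(flowgram):
    """Convert the flowgram back into a sequence."""
    return "".join(nucleotide * signal for nucleotide, signal in flowgram)


flowgram = pyrosequence("TACGGATCCA")
print(flowgram)
print(call_bases(flowgram))
```

Most additions in the flowgram give no signal, which is a reminder that presenting the nucleotides one at a time costs time.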
This method is termed 454 sequencing, named after the company that initially developed this particular technology. In its most advanced format, 454 sequencing can give sequence reads up to 1000 bp in length, with enough reads to generate a total of 700 Mb of DNA sequence per run.
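The scale of parallelism implied by these figures can be estimated with simple arithmetic. The read lengths and per-run outputs are those quoted in the slides; treating every read as having the maximum length is a simplifying assumption, so the results are lower bounds on the number of reads per run. The function name is hypothetical.

```python
# Maximum read length (bp) and output per run (Mb), as quoted in the slides.
platforms = {
    "reversible terminator": {"max_read_bp": 300, "mb_per_run": 2000},
    "pyrosequencing (454)": {"max_read_bp": 1000, "mb_per_run": 700},
}


def minimum_reads_per_run(max_read_bp, mb_per_run):
    """Fewest reads that could supply the stated output,
    assuming every read reaches the maximum length."""
    return mb_per_run * 1_000_000 / max_read_bp


for name, spec in platforms.items():
    reads = minimum_reads_per_run(**spec)
    print(f"{name}: at least {reads / 1e6:.1f} million reads per run")
```

Both figures are far beyond the 384 parallel reads of an automated chain-termination machine, which is the sense in which these methods are massively parallel.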
Third-generation sequencing: Both pyrosequencing and reversible terminator sequencing involve delays. In reversible terminator sequencing, the delay is caused by the need to remove the 3′ blocking group and the fluorophore after each nucleotide addition, while in pyrosequencing it arises because each nucleotide is presented individually to the polymerase. Third-generation sequencing aims to improve on the next-generation methods by carrying out sequencing in real time. One of the most promising third-generation methods is single molecule real-time sequencing. A sophisticated optical system called a zero-mode waveguide is used to observe the copying of a single DNA template, with each nucleotide addition identified from its attached fluorescent label (Figure 10.17). Because the optical system is so precise, there is no need to block the 3′ carbon of the nucleotide. Read lengths of up to 20 000 bp have been reported with this method.