Genetic information is stored in DNA by means of a triplet code that is nearly universal to all living things on Earth.
The genetic code is initially transferred from DNA to RNA, in the process of transcription.
Added: Apr 28, 2020
Slides: 61 pages
Slide Content
Genetic Code and Transcription. Prepared by: Anfal ALKATEEB
General information: Genetic information is stored in DNA by means of a triplet code that is nearly universal to all living things on Earth. The genetic code is initially transferred from DNA to RNA in the process of transcription. Once transferred to RNA, the genetic code exists as triplet codons, which are sets of three nucleotides in which each nucleotide is one of the four kinds of ribonucleotides composing RNA. RNA’s four ribonucleotides, analogous to an alphabet of four “letters,” can be arranged into 64 different three-letter sequences. Most of the triplets in RNA encode one of the 20 amino acids present in proteins, which are the end products of most genes. Several codons act as signals that initiate or terminate protein synthesis. In eukaryotes, the process of transcription is similar to, but more complex than, that in prokaryotes and in the bacteriophages that infect them.
Gene code: In the first step, information present on one of the two strands of DNA (the template strand) is transferred into an RNA complement through the process of transcription. Once synthesized, this RNA acts as a “messenger” molecule, transporting the coded information out of the nucleus; hence its name, messenger RNA (mRNA). The mRNAs then associate with ribosomes, where decoding into proteins occurs.
Genetic code: There are two major questions. How is genetic information encoded? How does the transfer from DNA to RNA occur, thus explaining the process of transcription?
The Genetic Code Uses Ribonucleotide Bases as “Letters”: General features of the genetic code:
1. The genetic code is written in linear form, using as “letters” the ribonucleotide bases that compose mRNA molecules. The ribonucleotide sequence is derived from the complementary nucleotide bases in DNA.
2. Each “word” within the mRNA consists of three ribonucleotide letters, thus referred to as a triplet code. With several exceptions, each group of three ribonucleotides, called a codon, specifies one amino acid.
3. The code is unambiguous: each triplet specifies only a single amino acid.
4. The code is degenerate, meaning that a given amino acid can be specified by more than one triplet codon. This is the case for 18 of the 20 amino acids.
5. The code contains one “start” and three “stop” signals, triplets that initiate and terminate translation, respectively.
6. No internal punctuation (analogous, for example, to a comma) is used in the code. Thus, the code is said to be commaless. Once translation of mRNA begins, the codons are read one after the other with no breaks between them (until a stop signal is reached).
7. The code is nonoverlapping. After translation commences, any single ribonucleotide within the mRNA is part of only one triplet.
8. The sequence of codons in a gene is colinear with the sequence of amino acids making up the encoded protein.
9. The code is nearly universal. With only minor exceptions, a single coding dictionary is used by almost all viruses, prokaryotes, archaea, and eukaryotes.
Triplet Nature of the Code: Experimental work by Francis Crick and his colleagues provided the first solid evidence for a triplet code. These researchers induced insertion and deletion mutations in the rII locus of phage T4. Wild-type phage cause lysis and plaque formation in strains B and K12 of E. coli. However, rII mutations prevent this infection process in strain K12, while allowing infection in strain B. Crick and his colleagues used the acridine dye proflavin to induce the mutations. This mutagenic agent intercalates within the double helix of DNA, causing the insertion or deletion of one or more nucleotides during replication.
The Triplet Nature of the Code: An insertion of a single nucleotide causes the reading frame to shift, changing all subsequent downstream codons. Such mutations are referred to as frameshift mutations. The figure shows the effect of frameshift mutations on a DNA sequence repeating the triplet sequence GAG: (a) the insertion of a single nucleotide shifts all subsequent triplets out of the reading frame. Upon translation, the amino acid sequence of the encoded protein will be altered radically starting at the point where the mutation occurred. A second change might result in a revertant phage, which would display wild-type behavior and successfully infect E. coli K12. For example, if the original mutant contained an insertion (+), a second event causing a deletion (−) close to the insertion would restore the original reading frame.
When two pluses or two minuses occurred in the same sequence, the correct reading frame was not reestablished. This argued against a doublet (two-letter) code. However, when three pluses or three minuses were present together, the original frame was reestablished. These observations strongly supported the triplet nature of the code.
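The plus/minus reasoning can be replayed on the repeating GAG sequence from the slide. The following is only an illustrative sketch: the helper names (codons, insert_base, delete_base), the inserted base C, and the chosen positions are assumptions made for the example, not details of Crick's experiments.

```python
def codons(seq):
    """Split a sequence into consecutive triplets, dropping an incomplete tail."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

def insert_base(seq, pos, base):
    """Return seq with one base inserted before index pos (a '+' mutation)."""
    return seq[:pos] + base + seq[pos:]

def delete_base(seq, pos):
    """Return seq with the base at index pos removed (a '-' mutation)."""
    return seq[:pos] + seq[pos + 1:]

wild_type = "GAG" * 6                      # GAG GAG GAG GAG GAG GAG

plus = insert_base(wild_type, 4, "C")      # one insertion: frame shifts
plus_minus = delete_base(plus, 8)          # nearby deletion: frame restored
two_plus = insert_base(plus, 10, "C")      # two insertions: still shifted
three_plus = insert_base(two_plus, 14, "C")  # three insertions: frame restored

for label, seq in [("wild type", wild_type), ("+", plus), ("+ -", plus_minus),
                   ("+ +", two_plus), ("+ + +", three_plus)]:
    print(f"{label:10s}", " ".join(codons(seq)))
```

In this toy sequence, only the triplets lying between the first and last mutation stay altered in the (+ −) and (+ + +) cases; downstream of them the GAG triplets reappear, which is the behavior expected of a triplet code.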
The Nonoverlapping Nature of the Code: Work by Sydney Brenner and others soon established that the code could not be overlapping for any one transcript. If the nucleotide sequence were, say, GTACA, and parts of the central codon, TAC, were shared by the outer codons, GTA and ACA, then only certain amino acids (those with codons ending in TA or beginning with AC) should be found adjacent to the one encoded by the central codon. For example, when any particular central amino acid is considered, only 16 combinations of three amino acids (tripeptide sequences) would theoretically be possible. Looking at the amino acid sequences of proteins that were known at that time, Brenner failed to find such restrictions in tripeptide sequences. For any central amino acid, he found many more than 16 different tripeptides. This observation led him to conclude that the code is not overlapping.
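The figure of 16 follows from simple counting, which the short sketch below reproduces for the central codon TAC. The function name flanking_codon_pairs is hypothetical, and the code models only the two-base overlap described in the GTACA example.

```python
from itertools import product

BASES = "ACGT"

def flanking_codon_pairs(central):
    """List (left, right) codon pairs allowed around `central` if adjacent
    codons shared two bases, as in the hypothetical overlapping code."""
    left = [b + central[:2] for b in BASES]    # must end with the first two bases
    right = [central[1:] + b for b in BASES]   # must begin with the last two bases
    return list(product(left, right))

pairs = flanking_codon_pairs("TAC")
print(len(pairs))                # 4 left codons x 4 right codons
print(("GTA", "ACA") in pairs)   # the pair used in the GTACA example
```

Four possible left neighbours times four possible right neighbours gives at most 16 flanking codon pairs, and therefore at most 16 tripeptides for a given central codon.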
A second major argument against an overlapping code concerned the effect of a single-nucleotide change. With an overlapping code, two adjacent amino acids would be affected by such a point mutation. However, mutations in the genes coding for the protein coat of tobacco mosaic virus (TMV), human hemoglobin, and the bacterial enzyme tryptophan synthetase invariably revealed only single amino acid changes. The third argument against an overlapping code was presented by Francis Crick in 1957, when he predicted that DNA would not serve as a direct template for the formation of proteins. Crick reasoned that any affinity between nucleotides and an amino acid would require hydrogen bonding. Chemically, however, such specific affinities seemed unlikely. Crick’s and Brenner’s arguments, taken together, strongly suggested that, during translation, the genetic code is nonoverlapping. Without exception, this concept has been upheld.
The Commaless and Degenerate Nature of the Code: Crick hypothesized, on the basis of genetic evidence, that the code would be commaless; that is, he believed no internal punctuation would occur along the reading frame. Crick also speculated that only 20 of the 64 possible codons would specify an amino acid and that the remaining 44 would carry no coding assignment. Was Crick wrong with respect to the 44 “blank” codes? Or is the code degenerate, meaning that more than one codon specifies the same amino acid?
To answer this question, consider the frameshift revertants: the stretch of nucleotides between the compensating insertions and deletions is still read out of frame. If 44 of the 64 triplets were blank, one of these blank (nonsense) codons would very likely occur in that stretch. If such a nonsense codon were encountered during protein synthesis, the process would probably stop or be terminated at that point. If so, the product of the rII locus would not be made, and restoration would not occur. Because the various mutant combinations were able to reproduce on E. coli K12, Crick and his colleagues concluded that, in all likelihood, most, if not all, of the remaining 44 triplets were not blank. It followed that the genetic code was degenerate. As we shall see, this reasoning proved to be correct.
Synthesizing Polypeptides in a Cell-Free System: In the cell-free protein-synthesizing system, amino acids are incorporated into polypeptide chains. The process begins with an in vitro mixture containing all the essential factors for protein synthesis in the cell: ribosomes, tRNAs, amino acids, and other molecules essential to translation. In 1961, mRNA had yet to be isolated. However, use of the enzyme polynucleotide phosphorylase allowed the artificial synthesis of RNA templates, which could be added to the cell-free system. This enzyme, isolated from bacteria, catalyzes the reaction shown in the figure.
The enzyme functions metabolically in bacterial cells to degrade RNA. However, in vitro, in the presence of high concentrations of ribonucleoside diphosphates, the reaction can be “forced” in the opposite direction, to synthesize RNA, as illustrated in the figure.
Homopolymer Codes: Researchers synthesized RNA homopolymers, RNA molecules containing only one type of ribonucleotide, and used them for synthesizing polypeptides in vitro. In other words, the mRNA they used in their cell-free protein-synthesizing system was either UUUUUU . . . , AAAAAA . . . , CCCCCC . . . , or GGGGGG . . . . They tested each of these types of mRNA to see which, if any, amino acids were consequently incorporated into newly synthesized proteins.
Note that the specific triplet codon assignments were possible only because homopolymers were used. In this method, only the general nucleotide composition of the template is known, not the specific order of the nucleotides in each triplet, but since three identical letters can have only one possible sequence (e.g., UUU), three of the actual codons for phenylalanine, lysine, and proline could be identified.
Mixed Copolymers: With the initial success of these techniques, researchers turned to the use of RNA heteropolymers. In their next experiments, two or more different ribonucleoside diphosphates were used in combination to form the artificial message. Suppose that only A and C are used for synthesizing the mRNA, in a ratio of 1A:5C. The insertion of a ribonucleotide at any position along the RNA molecule during its synthesis is determined by the ratio of A:C. Therefore, there is a 1/6 possibility for an A and a 5/6 chance for a C to occupy each position. On this basis, we can calculate the frequency of any given triplet appearing in the message.
By examining the percentages of the different amino acids incorporated into the polypeptide synthesized under the direction of this message, we can propose probable base compositions for each of those amino acids. Because proline appears 69 percent of the time, we could propose that proline is encoded by CCC (57.9 percent) and also by one of the codons consisting of 2C:1A (11.6 percent).
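The triplet frequencies quoted for the 1A:5C message can be checked with a few lines of Python. This is a minimal sketch that assumes each position is filled independently according to the A:C ratio, as the slide states; the function name triplet_frequencies is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

def triplet_frequencies(ratio):
    """Expected frequency of every triplet when each position is filled
    independently, with base probabilities proportional to `ratio`."""
    total = sum(ratio.values())
    p = {base: Fraction(n, total) for base, n in ratio.items()}
    return {"".join(t): p[t[0]] * p[t[1]] * p[t[2]]
            for t in product(sorted(p), repeat=3)}

freqs = triplet_frequencies({"A": 1, "C": 5})

# List the eight possible triplets from most to least frequent.
for codon, f in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(codon, f, f"{float(f):.1%}")
```

CCC comes out at 125/216 (about 57.9 percent) and each 2C:1A triplet (ACC, CAC, CCA) at 25/216 (about 11.6 percent), matching the figures used for proline above.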
The Triplet-Binding Assay: This technique took advantage of the observation that ribosomes, when presented in vitro with an RNA sequence as short as three ribonucleotides, will bind to it and form a complex similar to what is found in vivo. The triplet RNA sequence acts like a codon in mRNA, attracting a tRNA molecule containing a complementary sequence. Such a triplet sequence in tRNA, one that is complementary to a codon of mRNA, is known as an anticodon.
Table 13.2 shows 26 triplet codons assigned to ten different amino acids. Eventually, about 50 of the 64 triplets were assigned. These specific assignments led to two major conclusions. First, the genetic code is degenerate; that is, one amino acid may be specified by more than one triplet. Second, the code is also unambiguous; that is, a single codon specifies only one amino acid. The triplet-binding technique was a major innovation in the effort to decipher the genetic code.
Repeating Copolymers: Here a short synthetic sequence is repeated many times to form a long RNA. A dinucleotide made in this way is converted to an mRNA with two repeating triplet codons. A trinucleotide is converted into an mRNA that can be read as containing one of three potential repeating triplets, depending on the point at which initiation occurs. Finally, a tetranucleotide creates a message with four repeating triplet sequences.
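The codon patterns produced by repeating di-, tri-, and tetranucleotides can be listed mechanically. In the sketch below, the repeat units UG, UUG, and UAUC are arbitrary examples chosen for illustration, and reading_frames and distinct_codons are hypothetical helper names.

```python
def reading_frames(repeat_unit, n_codons=4):
    """Build a long message from `repeat_unit` and return the first
    `n_codons` triplets in each of the three possible reading frames."""
    message = repeat_unit * (3 * n_codons + 2)   # long enough for every frame
    return [[message[i:i + 3] for i in range(start, start + 3 * n_codons, 3)]
            for start in range(3)]

def distinct_codons(repeat_unit):
    """Set of all triplets that occur in any reading frame of the message."""
    return {c for frame in reading_frames(repeat_unit) for c in frame}

for unit in ("UG", "UUG", "UAUC"):
    print(unit)
    for start, frame in enumerate(reading_frames(unit)):
        print("  frame", start, " ".join(frame))
```

The dinucleotide gives two alternating triplets in every frame, the trinucleotide gives a single repeated triplet whose identity depends on the frame, and the tetranucleotide gives a cycle of four triplets.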
The Coding Dictionary Reveals Several Interesting Patterns among the 64 Codons: The various techniques applied to decipher the genetic code have yielded a dictionary of 61 triplet codons assigned to amino acids. The remaining three codons are termination signals, not specifying any amino acid.
Degeneracy and the Wobble Hypothesis: Almost all amino acids are specified by two, three, or four different codons. Three amino acids (serine, arginine, and leucine) are each encoded by six different codons. Only tryptophan and methionine are encoded by single codons. Most often, sets of codons specifying the same amino acid are grouped, such that the first two letters are the same, with only the third differing. Crick’s hypothesis predicted that the initial two ribonucleotides of triplet codes are often more critical than the third member in attracting the correct tRNA.
He postulated that hydrogen bonding at the third position of the codon-anticodon interaction would be less spatially constrained and need not adhere as strictly to the established base-pairing rules. Consistent with the wobble hypothesis and the degeneracy of the code, U at the first position (the 5′ end) of the tRNA anticodon may pair with A or G at the third position (the 3′ end) of the mRNA codon, and G may likewise pair with U or C. Inosine (I), one of the modified bases found in tRNA, may pair with C, U, or A. As a result of these wobble rules, only about 30 different tRNA species are necessary to accommodate the 61 codons specifying an amino acid.
The wobble hypothesis proposes a more flexible set of base-pairing rules at the third position of the codon. Current estimates are that 30 to 40 tRNA species are present in bacteria and up to 50 tRNA species in animal and plant cells.
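The wobble rules can be written as a small lookup table. The sketch below is illustrative only: the U, G, and I entries are the wobble pairings stated above, while the C and A entries are an added assumption that these bases pair in the standard Watson-Crick way; codons_read_by is a hypothetical function name.

```python
# Standard Watson-Crick pairing, used for codon positions 1 and 2.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Pairing allowed for the first (5') anticodon base with the third codon base.
# U, G and I follow the wobble rules in the text; C and A are assumed to pair
# only with their standard partners.
WOBBLE = {"U": "AG", "G": "UC", "I": "CUA", "C": "G", "A": "U"}

def codons_read_by(anticodon):
    """Codons (5'->3') that an anticodon (5'->3') can read.
    Pairing is antiparallel: anticodon base 1 faces codon base 3."""
    first, second, third = anticodon
    stem = WATSON_CRICK[third] + WATSON_CRICK[second]
    return sorted(stem + base for base in WOBBLE[first])

print(codons_read_by("GAA"))   # anticodon with G in the wobble position
print(codons_read_by("IGC"))   # anticodon with inosine in the wobble position
```

An anticodon with G at its 5′ end reads two codons, and one with inosine reads three, which is why fewer tRNA species than codons are needed.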
The Ordered Nature of the Code: The genetic code is ordered. Chemically similar amino acids often share one or two “middle” bases in the different triplets encoding them. For example, either U or C is often present in the second position of triplets that specify hydrophobic amino acids, including valine and alanine, among others. Two codons (AAA and AAG) specify the positively charged amino acid lysine. If only the middle letter of these codons is changed from A to G (AGA and AGG), the positively charged amino acid arginine is specified. Hydrophilic amino acids, such as serine or threonine, are specified by triplet codons with G or C in the second position. The end result of an “ordered” code is that it buffers the potential effect of mutation on protein function. While many mutations of the second base of triplet codons result in a change of one amino acid to another, the change is often to an amino acid with similar chemical properties. In such cases, protein function may not be noticeably altered.
Initiation, Termination, and Suppression: Initiation of protein synthesis in vivo is a highly specific process. In bacteria, the initial amino acid inserted into all polypeptide chains is a modified form of methionine, N-formylmethionine (fmet). Only one codon, AUG, codes for methionine, and it is sometimes called the initiator codon. In bacteria, either the formyl group is removed from the initial methionine upon completion of synthesis of a protein, or the entire formylmethionine residue is removed. In eukaryotes, unformylated methionine is the initial amino acid of polypeptide synthesis. As in bacteria, this initial methionine residue may be cleaved from the polypeptide.
As mentioned in the preceding section, three other codons (UAG, UAA, and UGA) serve as termination codons, punctuation signals that do not code for any amino acids. They are not recognized by a tRNA molecule, and translation terminates when they are encountered. Mutations that produce any of the three codons internally in a gene also result in termination. In that case, only a partial polypeptide is synthesized, since it is prematurely released from the ribosome. When such a change occurs in the DNA, it is called a nonsense mutation.
The Genetic Code Has Been Confirmed in Studies of Phage MS2: MS2 is a bacteriophage that infects E. coli. Its nucleic acid (RNA) contains only about 3500 ribonucleotides, making up only three genes. These genes specify a coat protein, an RNA-directed replicase, and a maturation protein (the A protein). The linear sequence of triplet codons formed by the nucleotides corresponds precisely with the linear sequence of amino acids in the protein. Furthermore, the codon for the first amino acid is AUG, the common initiator codon; the codon for the last amino acid is followed by two consecutive termination codons, UAA and UAG. The analysis of MS2 genes clearly showed that the genetic code in this virus was identical to that established in bacterial systems. Other evidence suggests that the code is also identical in eukaryotes, thus providing confirmation of what seemed to be a universal code.
The Genetic Code Is Nearly Universal: It was long assumed that the genetic code would be found to be universal, applying equally to viruses, bacteria, archaea, and eukaryotes. Certainly, the nature of mRNA and the translation machinery seemed to be very similar in these organisms. However, the coding properties of DNA derived from mitochondria (mtDNA) of yeast and humans undermined the hypothesis of the universality of the genetic language. Since then, mtDNA has been examined in many other organisms. Cloned mtDNA fragments were sequenced and compared with the amino acid sequences of various mitochondrial proteins, revealing several exceptions to the coding dictionary.
Most surprising was that the codon UGA, normally specifying termination, specifies the insertion of tryptophan during translation in yeast and human mitochondria. In human mitochondria, AUA, which normally specifies isoleucine, directs the internal insertion of methionine. In yeast mitochondria, threonine is inserted instead of leucine when CUA is encountered in mRNA.
Different Initiation Points Create Overlapping Genes: Earlier we stated that the genetic code is nonoverlapping, meaning that each ribonucleotide in the code for a given polypeptide is part of only one codon. This does not rule out the possibility that a single mRNA has multiple initiation points for translation, creating several reading frames within the same mRNA and thus specifying more than one polypeptide. This concept of overlapping genes is illustrated in part (a) of the figure. Note that a gene is also referred to as an ORF, or open reading frame, which is defined as a DNA sequence that produces an RNA that has a start and stop codon, between which is a series of triplet codons specifying the amino acids making up a polypeptide. Thus, we sometimes also refer to overlapping ORFs. Translation initiated at two different AUG positions out of frame with one another will give rise to two distinct amino acid sequences.
The sequences specifying the K and B polypeptides are initiated in separate reading frames within the sequence specifying the A polypeptide. The K gene sequence overlaps into the adjacent sequence specifying the C polypeptide. The E sequence is out of frame with, but initiated within, that of the D polypeptide. Finally, the A′ sequence, while in frame with the A sequence, is initiated in the middle of the A sequence. They both terminate at the identical point. In all, seven different polypeptides are created from a DNA sequence that might otherwise have specified only three (A, C, and D).
A single mutation in the middle of the B gene could potentially affect three other proteins (the A, A′, and K proteins). It may be for this reason that, while present, overlapping genes are not the rule in other organisms. More recently, genomic analysis has made it possible to examine extensive DNA sequence information in eukaryotes, revealing that the concept of overlapping genes also applies to these organisms, including mammals.
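The idea that two AUG codons out of frame with one another yield two different polypeptides from the same RNA can be sketched in Python. The mRNA below is an invented toy sequence, not one of the genes discussed here, and translate is an illustrative helper; the codon table is the standard genetic code in single-letter amino acid notation, with * marking a stop codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(mrna, start):
    """Translate from index `start`, triplet by triplet, until a stop codon
    or the end of the message."""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

mrna = "AUGGCAUGCUGAAUUAG"   # hypothetical message with two AUGs

starts = [i for i in range(len(mrna) - 2) if mrna[i:i + 3] == "AUG"]
for s in starts:
    print("start", s, "frame", s % 3, "->", translate(mrna, s))
```

The two AUGs lie in different reading frames, so every nucleotide in the shared stretch belongs to a different triplet in each reading, and the two products differ after the initial methionine.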
Transcription Synthesizes RNA on a DNA Template: Earlier studies made it quite clear that proteins were the end products of many genes. The complex, multistep process begins with the transfer of genetic information stored in DNA to RNA. The central question was how DNA, a nucleic acid, is able to specify a protein composed of amino acids.
The process by which RNA molecules are synthesized on a DNA template is called transcription. It results in an mRNA molecule complementary to the gene sequence of one of the two strands of the double helix. Each triplet codon in the mRNA is, in turn, complementary to the anticodon region of its corresponding tRNA, which inserts the correct amino acid into the polypeptide chain during translation. The significance of transcription is enormous, for it is the initial step in the process of information flow within the cell.
Several observations suggest that RNA serves as an intermediate in the flow of information between DNA and protein:
1. DNA is, for the most part, associated with chromosomes in the nucleus of the eukaryotic cell. However, protein synthesis occurs in association with ribosomes located outside the nucleus, in the cytoplasm. Therefore, DNA does not appear to participate directly in protein synthesis.
2. RNA is synthesized in the nucleus of eukaryotic cells, in which DNA is found, and is chemically similar to DNA.
3. Following its synthesis, most RNA migrates to the cytoplasm, in which protein synthesis (translation) occurs.
4. The amount of RNA is generally proportional to the amount of protein in a cell.
Initiation, Elongation, and Termination of RNA Synthesis:
1. Once RNA polymerase has recognized and bound to the promoter, DNA is converted from its double-stranded form to an open structure, exposing the template strand.
2. The enzyme then proceeds to initiate RNA synthesis, whereby the first 5′-ribonucleoside triphosphate, which is complementary to the first nucleotide of the template, is inserted at the start site.
Subsequent ribonucleotide complements are inserted and linked together by phosphodiester bonds as RNA polymerization proceeds. This process continues in a 5′ to 3′ direction (in terms of the nascent RNA), creating a temporary 8-bp DNA/RNA duplex whose chains run antiparallel to one another.
In E. coli, this process proceeds at the rate of about 50 nucleotides/second at 37°C. Like DNA polymerase, RNA polymerase can perform proofreading as it adds each nucleotide. It is capable of recognizing a mismatch where a noncomplementary base has been inserted. In such a case, the enzyme backs up and removes the mismatch. It then reverses direction and continues elongation.
The enzyme traverses the entire gene until eventually it encounters a specific nucleotide sequence that acts as a termination signal. Such termination sequences, about 40 base pairs in length, are extremely important in prokaryotes because of the close proximity of the end of one gene to the upstream sequences of the adjacent gene. The unique sequence of nucleotides in this termination region causes the newly formed transcript to fold back on itself, forming what is called a hairpin secondary structure, held together by hydrogen bonds. The hairpin is important to termination. In some cases, the termination of synthesis is also dependent on the termination factor, rho (ρ).
Wherever an A, T, C, or G residue was encountered in the template, a corresponding U, A, G, or C residue, respectively, was incorporated into the RNA molecule. These RNA molecules ultimately provide the information leading to the synthesis of all proteins in the cell. In bacteria, groups of genes with related functions are often transcribed together. The result is that during transcription, a large mRNA is produced, encoding more than one protein. Since genes in bacteriophages are sometimes referred to as complementation groups and are called cistrons, the RNA is called a polycistronic mRNA. The products of genes transcribed in this fashion are usually all needed by the cell at the same time, so this is an efficient way to transcribe and subsequently translate the needed genetic information. In eukaryotes, monocistronic mRNAs are the rule, although an increasing number of exceptions are being reported.
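The template-to-RNA pairing rule (A, T, C, G in the template giving U, A, G, C in the RNA) is easy to express in code. This is a minimal sketch; the function name transcribe and the example template are assumptions made for illustration, and the template is taken to be written 3′ to 5′ so that the RNA comes out 5′ to 3′.

```python
# Base incorporated into RNA opposite each DNA template base.
PAIRING = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_3_to_5):
    """Return the RNA (5'->3') complementary to a DNA template strand
    written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5)

template = "TACGGT"            # hypothetical template strand, 3'->5'
print(transcribe(template))    # RNA, 5'->3'

# If the template is written 5'->3', reverse it before transcribing.
print(transcribe("TGGCAT"[::-1]))
```

Because the two chains of the DNA/RNA duplex are antiparallel, a template written in the 5′ to 3′ direction has to be reversed first, as in the last line.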
Transcription in Eukaryotes Differs from Prokaryotic Transcription in Several Ways: 1. Transcription in eukaryotes occurs within the nucleus under the direction of three separate forms of RNA polymerase. Unlike the prokaryotic process, in eukaryotes the RNA transcript is not free to associate with ribosomes prior to the completion of transcription. For the mRNA to be translated, it must move out of the nucleus into the cytoplasm.
2. Initiation of transcription of eukaryotic genes requires the compact chromatin fiber, characterized by nucleosome coiling, to be uncoiled and the DNA to be made accessible to RNA polymerase and other regulatory proteins. This transition, referred to as chromatin remodeling, reflects the dynamics involved in the conformational change that occurs as the DNA helix is opened.
3. Initiation and regulation of transcription entail a more extensive interaction between cis-acting DNA sequences and trans-acting protein factors involved in stimulating and initiating transcription. Eukaryotic RNA polymerases, for example, rely on transcription factors (TFs) to scan and bind to DNA. In addition to promoters, other control units, called enhancers and silencers, may be located in the 5′ regulatory region upstream from the initiation point, but they have also been found within the gene or even in the 3′ downstream region, beyond the coding sequence.
An initial processing step involves the addition of a 5′ cap and a 3′ tail to most transcripts destined to become mRNAs. The initial (or primary) transcripts are most often much larger than those that are eventually translated into protein. Sometimes called pre-mRNAs, these primary transcripts are found only in the nucleus and are referred to collectively as heterogeneous nuclear RNA (hnRNA). Only about 25 percent of hnRNA molecules are converted to mRNA.
Initiation of Transcription in Eukaryotes: Eukaryotic RNA polymerase exists in three distinct forms. While the three forms of the enzyme share certain polypeptide subunits, each nevertheless transcribes different types of genes. Each enzyme is larger and more complex than the single prokaryotic polymerase. For example, in yeast, the holoenzyme consists of two large subunits and 10 smaller subunits.
In regard to the initial template-binding step and promoter regions, most is known about RNA polymerase II (RNAP II), which is responsible for the transcription of a wide range of genes in eukaryotes. The activity of RNAP II is dependent on both cis-acting elements surrounding the gene itself and a number of trans-acting transcription factors that bind to these DNA elements.
At least four cis-acting DNA elements regulate the initiation of transcription by RNAP II. The first of these elements, the core promoter, determines where RNAP II binds to the DNA and where it begins copying the DNA into RNA. The other three types of regulatory DNA sequences, called proximal-promoter elements, enhancers, and silencers, influence the efficiency or the rate of transcription initiation by RNAP II from the core-promoter element. Recall that in prokaryotes, the DNA sequence recognized by RNA polymerase is also called the promoter.
In eukaryotes, however, transcriptional initiation is controlled by regulatory proteins that bind to a larger number of cis-acting DNA elements. Although eukaryotic promoter elements can determine the site and general efficiency of initiation, other elements, known as enhancers and silencers, have more dramatic effects on eukaryotic gene transcription.
Recent Discoveries Concerning RNA Polymerase Function: In eukaryotes, the enzyme is confronted with substrate DNA that, unlike in bacteria, is wrapped around histone proteins organized into nucleosomes. RNAP II in yeast contains two large subunits and ten smaller ones, forming a huge three-dimensional complex with a molecular weight of about 500 kDa. Promoter DNA initially positions itself over a cleft formed between the two large subunits of RNAP II. Complementary RNA synthesis is then initiated on the DNA template strand. In summary, an unstable complex is formed during the initiation of transcription; stability is established once elongation manages to create a duplex of sufficient size; elongation proceeds; and then instability again characterizes termination of transcription.
Processing Eukaryotic RNA: Caps and Tails: In bacteria, the base sequence of DNA is transcribed into an mRNA that is immediately and directly translated into the amino acid sequence dictated by the genetic code. Eukaryotic RNA transcripts, in contrast, require significant alteration before they are transported to the cytoplasm and translated. Experimental evidence showed that eukaryotic mRNA is transcribed initially as a precursor molecule much larger than that which is translated into protein. This initial transcript of a gene must be processed in the nucleus before it appears in the cytoplasm as a mature mRNA molecule.
Modification of Eukaryotic pre-mRNA: A 7-methylguanosine cap is added to the 5′ end of the primary transcript by a 5′-5′ phosphate linkage (stability and protection). A poly-A tail, a tract of 20-200 adenosine nucleotides (As), is added to the 3′ end of the transcript. The 3′ end is generated by cleavage rather than by termination (stability and protection).
The cap stabilizes the mRNA by protecting the 5′ end of the molecule from nuclease attack. It is also thought to facilitate the transport of mature mRNAs across the nuclear membrane into the cytoplasm and the initiation of translation of the mRNA into protein. Both pre-mRNAs and mRNAs contain at their 3′ end a stretch of as many as 250 adenylic acid residues. Poly A has now been found at the 3′ end of almost all mRNAs studied in a variety of eukaryotic organisms. In fact, poly-A tails have also been detected in some prokaryotic mRNAs. What is the function of these additions?
While the AAUAAA sequence, a polyadenylation signal near the 3′ end of the transcript, is not found on all eukaryotic transcripts, it appears to be essential to those that have it. If the sequence is changed as a result of a mutation, those transcripts that would normally have it cannot add the poly-A tail. In the absence of this tail, these RNA transcripts are rapidly degraded. Both the 5′ cap and the 3′ poly-A tail are critical if an mRNA transcript is to be transported to the cytoplasm and translated.
The Coding Regions of Eukaryotic Genes Are Interrupted by Intervening Sequences Called Introns: Researchers presented direct evidence that the genes of animal viruses contain internal nucleotide sequences that are not expressed in the amino acid sequence of the proteins they encode. These internal DNA sequences are represented in initial RNA transcripts, but they are removed before the mature mRNA is translated.
Such nucleotide segments are called intervening sequences. DNA sequences that are not represented in the final mRNA product are also called introns (“int” for intervening), and those retained and expressed are called exons (“ex” for expressed). Splicing involves the removal of the corresponding ribonucleotide sequences representing introns as a result of an excision process and the rejoining of the regions representing exons.
Two approaches have been most fruitful for identifying introns in eukaryotic genes. 1. The first involves the molecular hybridization of purified, functionally mature mRNAs with DNA containing the genes from which the RNA was originally transcribed. Hybridization between nucleic acids that are not perfectly complementary results in heteroduplexes, in which introns present in the DNA but absent in the mRNA loop out and remain unpaired. Figure: Intervening sequences in various eukaryotic genes. The numbers indicate the number of nucleotides present in various intron and exon regions. The chicken ovalbumin complex shown in the figure is a heteroduplex with seven loops (A through G), representing seven introns whose sequences are present in the DNA but not in the final mRNA.
2. The second approach provides more specific information. It involves a direct comparison of nucleotide sequences of DNA with those of mRNA and their correlation with amino acid sequences. Such an approach allows the precise identification of all intervening sequences. Thus far, most eukaryotic genes have been shown to contain introns. Philip and Richard studied the β-globin gene in mice and rabbits. The mouse gene contains an intron 550 nucleotides long, beginning immediately after the codon specifying the 104th amino acid.
In the rabbit, there is an intron of 580 base pairs near the codon for the 110th amino acid. In addition, another intron of about 120 nucleotides exists earlier in both genes. Similar introns have been found in the β-globin gene in all mammals examined.