SEQUENCE ANALYSIS
A presentation by Prashant Tripathi (M.Sc. IM), BBAU, Lucknow
Defining Sequence Analysis
Sequence analysis is the process of subjecting a DNA, RNA, or peptide sequence to any of a wide range of analytical methods to understand its features, function, structure, or evolution. It includes sequencing, sequence assembly, alignment, and searching (in databases).
Applications (Uses)
Comparison of sequences to find similarity, often to infer whether they are related (homologous).
Identification of intrinsic features of a sequence, such as active sites, post-translational modification sites, gene structures, and the distribution of introns and exons.
Identification of sequence differences and variations, such as point mutations and single nucleotide polymorphisms (SNPs), to obtain genetic markers.
Revealing the evolution and genetic diversity of sequences and organisms.
Identification of molecular structure from sequence alone.
Study of genetic diseases.
DNA Sequencing
DNA sequencing is the process of determining the precise order of the nucleotides, that is, of the four bases adenine, guanine, cytosine, and thymine, in a strand of DNA. An Expressed Sequence Tag (EST) is a short sub-sequence of a cDNA sequence. ESTs may be used to identify gene transcripts, and are instrumental in gene discovery and in gene-sequence determination.
Methods: Sanger sequencing (chain-termination method), pyrosequencing, and shotgun sequencing.
Fluorescence-Based Sequencing
[Figure labels: separated nucleotide sequences; individual peaks; chromatograph]
Protein Sequencing
Protein sequencing is a technique to determine the amino acid sequence of a protein, as well as which conformation the protein adopts and the extent to which it is complexed with any non-peptide molecules.
Methods: Edman degradation and mass spectrometry.
Mass spectrometry (MS) is an analytical technique that ionizes chemical species and sorts the ions based on their mass-to-charge ratio.
Sequence Assembly
Sequence assembly refers to the reconstruction of a DNA sequence by aligning and merging small DNA fragments. It is an integral part of modern DNA sequencing. The process involves (1) cutting the DNA into small pieces, (2) reading the small fragments, and (3) reconstituting the original DNA by merging the information from the various fragments.
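The merging step can be illustrated with a toy greedy overlap assembler. This is only a minimal sketch under strong assumptions (error-free reads, all reads from the same strand, no repeats, no read contained inside another); the function names `overlap` and `greedy_assemble` and the `min_len` threshold are hypothetical and chosen for illustration. Real assemblers are far more elaborate.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the largest overlap.

    Returns the remaining contigs; a single contig means every read was merged.
    """
    reads = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left, so stop merging
        # Join the two reads, writing the shared overlap only once.
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


contigs = greedy_assemble(["ATGCGT", "CGTTAC", "TACGGA"])
print(contigs)
```

With the three example reads, each read overlaps the next by three bases, so they collapse into one contig. Greedy merging is a heuristic and can produce wrong assemblies on repetitive sequences.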
Sequence Alignment
Sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. It involves identifying the correct location of deletions and insertions that have occurred in either of the two lineages since their divergence from a common ancestor.
Types
Based on the number of sequences being compared, alignment is of two types: pairwise alignment and multiple sequence alignment.
Pairwise Sequence Alignment
Pairwise sequence alignment compares only two sequences at a time. The optimal alignment is the one with the best score. A pairwise alignment consists of a series of paired bases, one base from each sequence. There are three types of pairs: (1) matches, where the same nucleotide appears in both sequences; (2) mismatches, where different nucleotides are found in the two sequences; and (3) gaps, where a base in one sequence is paired with a null base in the other.
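As an illustration of how the three pair types feed into a score, the sketch below counts matches, mismatches, and gaps in an already aligned pair of rows and sums a simple linear score. The function name `score_alignment` and the scoring values (+1 match, -1 mismatch, -2 gap) are assumptions made for this example, not values from the presentation.

```python
def score_alignment(row1: str, row2: str,
                    match: int = 1, mismatch: int = -1, gap: int = -2):
    """Classify each column of an aligned pair and return (counts, score).

    Both rows must have the same length; '-' stands for the null base.
    """
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have equal length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for x, y in zip(row1, row2):
        if x == "-" and y == "-":
            raise ValueError("a column cannot pair two null bases")
        if x == "-" or y == "-":
            counts["gap"] += 1
        elif x == y:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
    score = (counts["match"] * match
             + counts["mismatch"] * mismatch
             + counts["gap"] * gap)
    return counts, score


counts, score = score_alignment("GATTACA", "GA-TGCA")
print(counts, score)  # 5 matches, 1 mismatch, 1 gap; score 2
```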
The algorithms used are the Needleman-Wunsch algorithm (global alignment) and the Smith-Waterman algorithm (local alignment). BLAST (Basic Local Alignment Search Tool) is a widely used tool for searching sequence databases.
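A minimal sketch of Needleman-Wunsch global alignment by dynamic programming is given below. It assumes a simple linear gap penalty and illustrative scoring values (+1 match, -1 mismatch, -2 gap); the function name `needleman_wunsch` is chosen for this example. When several alignments share the best score, the traceback returns just one of them.

```python
def needleman_wunsch(a: str, b: str,
                     match: int = 1, mismatch: int = -1, gap: int = -2):
    """Return (best score, aligned a, aligned b) for a global alignment."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),  # match/mismatch
                              score[i - 1][j] + gap,             # gap in b
                              score[i][j - 1] + gap)             # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            row_a.append(a[i - 1]); row_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-")
            i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


best, top, bottom = needleman_wunsch("GATTACA", "GATGCA")
print(best)
print(top)
print(bottom)
```

Smith-Waterman follows the same recurrence but clamps each cell at zero and starts the traceback from the highest-scoring cell, which is what makes it a local rather than a global alignment.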
Multiple Sequence Alignment
Multiple sequence alignment (MSA) is a sequence alignment of three or more biological sequences, generally protein, DNA, or RNA. Commonly used tools include ClustalW, ProbCons, and MUSCLE.
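Once three or more sequences have been aligned, each column can be inspected for conservation. The sketch below does not compute an MSA itself (that is the job of the tools named above); it assumes the rows are already aligned to equal length and reports the most common symbol in each column with its frequency. The function name `column_consensus` and the example rows are hypothetical.

```python
from collections import Counter


def column_consensus(rows: list[str]) -> list[tuple[str, float]]:
    """For each alignment column, return (most common symbol, its fraction)."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    result = []
    for column in zip(*rows):
        symbol, count = Counter(column).most_common(1)[0]
        result.append((symbol, count / len(rows)))
    return result


aligned = ["ATG-C",
           "ATGAC",
           "ACGAC"]
summary = column_consensus(aligned)
print("".join(symbol for symbol, _ in summary))  # consensus: ATGAC
```

Columns with a fraction of 1.0 are fully conserved across the aligned sequences; lower fractions mark positions that vary.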
Databases
Biological databases are libraries of life sciences information, collected from scientific experiments, published literature, high-throughput experiment technology, and computational analysis.
Examples: UniGene, an NCBI database of the transcriptome; the DNA Data Bank of Japan (DDBJ); the European Bioinformatics Institute (EMBL-EBI); SWISS-PROT and TrEMBL.