protein sequence analysis

KVA DAV COLLEGE FOR WOMEN, KARNAL
DEPARTMENT OF BIOTECHNOLOGY
Presented by: Ramika, MSc Biotechnology, 1st year (2399620006)
PROTEIN SEQUENCE ANALYSIS

CONTENTS
- Introduction
- History
- Prepare the proteins for sequencing
- Sequencing methods
- N-terminal sequencing
- C-terminal sequencing
- DNA sequencing
- Protein mass spectrometry
- Bioinformatics tools

INTRODUCTION
Protein: a polymer of amino acids. A protein's structure and function depend on its amino acid sequence.
Protein sequencing: a technique for determining the amino acid sequence of a protein.
- Important for understanding cellular functions.
- Important for targeting drugs to specific metabolic pathways.

HISTORY
1951: Fred Sanger and Hans Tuppy reported the amino acid sequence of the insulin B chain; with the rest of the molecule completed by 1955, insulin became the first protein to be fully sequenced. Sanger later developed a separate chain-termination method for DNA (1977), the technique usually called the “SANGER METHOD”, which was a milestone in sequencing long-chain molecules such as DNA and was eventually used in the Human Genome Project.
1969: Analysis of tRNA sequences was used to infer residue interactions from correlated changes in the nucleotide sequences, giving rise to a model of tRNA secondary structure.
1970: Saul B. Needleman and Christian D. Wunsch published the first computer algorithm for aligning two sequences.
1977: Publication of the first complete DNA genome sequence, that of bacteriophage ΦX174.

Prepare the proteins for sequencing
1. If the protein contains more than one polypeptide chain, the chains are separated and purified.
2. Intrachain S-S (disulfide) cross-bridges between cysteine residues in the polypeptide chain are cleaved. (If these disulfides are interchain linkages, step 2 precedes step 1.)
3. The amino acid composition of each polypeptide chain is determined.
4. The N-terminal and C-terminal residues are identified.
5. Each polypeptide chain is cleaved into smaller fragments.
6. The sequence of each peptide fragment is determined.
7. The overall amino acid sequence of the protein is reconstructed from the sequences of overlapping fragments.
8. The positions of S-S cross-bridges formed between cysteine residues are located.
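The reconstruction from overlapping fragments can be sketched in a few lines of Python. This is a minimal illustration, not a laboratory protocol: the function names, the greedy merging strategy, the minimum overlap of two residues and the example fragments are all assumptions made for the sketch. It also ignores real difficulties such as repeated motifs and sequencing errors.

```python
def overlap(a: str, b: str, min_len: int = 2) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments, min_len: int = 2) -> str:
    """Greedily merge the two fragments with the longest overlap until one remains."""
    frags = sorted(set(fragments))
    # Drop fragments that are fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            raise ValueError("fragments do not overlap enough to be joined")
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags[0]


# Hypothetical fragments from two different cleavages of the same 21-residue chain.
digest_1 = ["GIVEQCC", "TSICSLYQ", "LENYCN"]
digest_2 = ["GIVEQ", "CCTSICS", "LYQLENYCN"]
full_sequence = assemble(digest_1 + digest_2)
```

Two cleavages with different specificities are needed because fragments from a single cleavage never overlap one another; the second set supplies the bridges between them.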

Separation of Polypeptide Chains: Subunit associations in multimeric proteins are typically maintained solely by noncovalent forces, and therefore most multimeric proteins can be dissociated by exposure to pH extremes, 8 M urea, 6 M guanidinium hydrochloride, or high salt concentrations.
Cleavage of Disulfide Bridges: Oxidation of a disulfide by performic acid results in the formation of two equivalents of cysteic acid.

SEQUENCING METHODS
- N-terminal sequencing
- C-terminal sequencing
- Prediction from DNA sequence

N-TERMINAL SEQUENCING
N-terminal sequencing is done through:
- Sanger’s method
- Dansyl chloride method
- Edman’s degradation method

Sanger’s method
1. Treat with 1-fluoro-2,4-dinitrobenzene (DNFB) to form a dinitrophenyl (DNP) derivative of the amino-terminal amino acid.
2. Acid hydrolysis.
3. Extraction of the DNP derivative with an organic solvent.
4. Identification of the DNP derivative by chromatography and comparison with standards.

Dansyl chloride method
Reagent: 1-dimethylaminonaphthalene-5-sulfonyl chloride (dansyl chloride)
1. The dansyl polypeptide is prepared by reacting the reagent with the N-terminal amino group.
2. Acid hydrolysis liberates the free amino acids and the N-terminal dansyl amino acid.
3. The amino acids are separated.
4. The fluorescence of the dansyl amino acid is detected.
5. The identity of the N-terminal amino acid is obtained by comparison with standard dansylated amino acids.

Edman’s degradation method
Principle: It sequentially removes one residue at a time from the amino end of a peptide.
Mechanism: Phenyl isothiocyanate (PITC) reacts with the uncharged N-terminal amino group to form a phenylthiocarbamoyl derivative. Under acidic conditions the N-terminal residue is then cleaved off as a thiazolinone derivative, leaving the rest of the peptide intact for the next cycle. The thiazolinone derivative is extracted into an organic solvent and treated with acid to form the more stable phenylthiohydantoin (PTH) amino acid, which can be identified by chromatography.
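The cycle-by-cycle logic of Edman degradation can be mimicked with a small Python generator. This is only a bookkeeping sketch under idealized assumptions (every cycle goes to completion and every residue is identified correctly); the function name and the example peptide are hypothetical.

```python
from typing import Iterator, Optional, Tuple


def edman_cycles(peptide: str, max_cycles: Optional[int] = None) -> Iterator[Tuple[int, str, str]]:
    """Yield (cycle number, released N-terminal residue, remaining peptide)."""
    remaining = peptide
    n_cycles = len(peptide) if max_cycles is None else min(max_cycles, len(peptide))
    for cycle in range(1, n_cycles + 1):
        # One cycle: the N-terminal residue is removed and identified,
        # and the shortened peptide goes into the next cycle.
        released, remaining = remaining[0], remaining[1:]
        yield cycle, released, remaining


# Reading the released residues in cycle order gives the sequence from the N-terminus.
sequence_read = "".join(residue for _, residue, _ in edman_cycles("GIVEQ"))
```

In practice the yield per cycle is below 100 percent, so the signal degrades and only a limited number of residues can be read from one peptide; the sketch does not model that.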

C-TERMINAL SEQUENCING
1. Add carboxypeptidases to a solution of the protein.
2. Take samples at regular intervals.
3. Determine the C-terminal amino acid by analyzing a plot of amino acid concentration against time: the residue released first is the C-terminal one.

DNA Sequencing
A protein sequence can also be determined indirectly from the gene or mRNA that encodes it.
1. Design primers from a known stretch of the amino acid sequence and amplify the gene.
2. Sequence the gene and deduce the amino acid sequence of the protein from it.
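Deducing the protein from a sequenced gene comes down to reading codons with the genetic code. The sketch below uses the standard genetic code and assumes a simple case: a single open reading frame on the given strand, starting at the first ATG, with no introns. The function name and the example sequence are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence: str) -> str:
    """Translate from the first ATG to the first stop codon (one-letter codes)."""
    dna = sequence.upper().replace("U", "T")  # accept mRNA as well as DNA
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unrecognized codon
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


protein = translate("ATGGGCATTGTGGAACAGTAA")
```

A sequence deduced this way describes the primary translation product; it does not show post-translational processing such as signal-peptide removal or disulfide formation.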

MASS SPECTROMETRY
It is an important method for accurate mass determination and characterization of proteins.
Basic principle: The technique studies the effect of ionizing energy on molecules. It depends on chemical reactions in the gas phase in which sample molecules are consumed during the formation of ionic and neutral species.
Components: The instrument consists of three major components:
- Ion source: produces gaseous ions from the substance being studied.
- Analyzer: resolves the ions into their characteristic mass components according to their mass-to-charge ratio (m/z).
- Detector system: detects the ions and records the relative abundance of each resolved ionic species.
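Since the analyzer separates ions by mass-to-charge ratio, it can help to see how the expected m/z of a peptide is calculated. The sketch below sums standard monoisotopic residue masses, adds one water for the free termini, and adds one proton per positive charge. It assumes an unmodified peptide ionized by protonation; the function names are illustrative.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of the charge carrier


def monoisotopic_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER


def mass_to_charge(peptide: str, charge: int) -> float:
    """m/z of the peptide carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (monoisotopic_mass(peptide) + charge * PROTON) / charge


singly_charged = mass_to_charge("GIVEQ", 1)
doubly_charged = mass_to_charge("GIVEQ", 2)
```

Leucine and isoleucine have identical masses, so mass alone cannot tell them apart.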

BIOINFORMATICS TOOLS
Bioinformatics: The collection, classification, storage and analysis of biochemical and biological information using computers, especially as applied to molecular genetics and genomics. It is an interdisciplinary field that develops methods and software tools for understanding biological data. It combines biology, computer science, information engineering, mathematics and statistics to analyze and interpret biological data.

MASTER LAYOUT FOR PROTEIN SEQUENCING

Types
On the basis of the number of sequences being compared, sequence alignment is of two types:
- Pairwise alignment
- Multiple alignment

Pairwise Sequence Alignment
Pairwise sequence alignment compares only two sequences at a time, for example:
a b a c d
a b _ c d
Optimality is based on a score. A pairwise alignment consists of a series of paired positions, one from each sequence (amino acid residues for proteins, nucleotides for DNA or RNA). There are three types of pairs:
- Matches: the same residue appears in both sequences.
- Mismatches: different residues are found in the two sequences.
- Gaps: a residue in one sequence is paired with a null (gap) in the other.
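A score-based optimum of this kind can be found by dynamic programming. The sketch below follows the Needleman-Wunsch approach to global alignment. The scoring values (+1 for a match, -1 for a mismatch, -1 per gap position) are assumptions chosen for illustration; real protein alignments normally use a substitution matrix and separate gap-opening and gap-extension penalties.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # match or mismatch
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("_"); i -= 1
        else:
            out_a.append("_"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best_score, row_1, row_2 = needleman_wunsch("abacd", "abcd")
```

With these scores the two example sequences align with four matches and one gap. When several alignments tie for the best score, the traceback returns only one of them.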

The algorithms used are the Needleman-Wunsch algorithm (global alignment) and the Smith-Waterman algorithm (local alignment).
BLAST (Basic Local Alignment Search Tool)
BLAST encompasses many different implementations and enhancements of a search algorithm that finds “high-scoring segment pairs” (HSPs), local alignments between a query and the sequences in a database.
- It is a fast way to find similar sequences.
- It is not the most sensitive way to search.
- It is by a wide margin the most commonly used tool in bioinformatics.

BLAST Steps
1. Seeding: prepare a list of short, fixed-length segments (words) from the query.
2. Searching: find a highly similar or exact match for each word in the database.
3. Extension: extend each match to a longer alignment.
4. Evaluation: evaluate the results using E-values (the number of hits with at least that score expected by chance).
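The seeding, searching and extension steps can be illustrated with a toy Python sketch. It is deliberately much simpler than BLAST itself: it uses exact word matches only, extends without gaps while residues are identical, and does no statistical evaluation. The function names, the word length of 3 and the example sequences are assumptions for the sketch.

```python
from collections import defaultdict


def index_words(seq: str, k: int):
    """Map every word of length k in `seq` to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def extend(query: str, subject: str, qi: int, si: int, k: int):
    """Extend an exact word hit in both directions while residues stay identical."""
    left = 0
    while qi - left > 0 and si - left > 0 and query[qi - left - 1] == subject[si - left - 1]:
        left += 1
    right = k
    while (qi + right < len(query) and si + right < len(subject)
           and query[qi + right] == subject[si + right]):
        right += 1
    return qi - left, si - left, left + right  # query start, subject start, length


def find_hits(query: str, subject: str, k: int = 3):
    """Seed with query words, look them up in the subject, extend each hit."""
    subject_index = index_words(subject, k)          # searching uses this lookup table
    hits = set()
    for qi in range(len(query) - k + 1):             # seeding: every word of the query
        for si in subject_index.get(query[qi:qi + k], []):
            hits.add(extend(query, subject, qi, si, k))  # extension
    return sorted(hits, key=lambda hit: -hit[2])     # longest hits first


hits = find_hits("ACDEFGHIK", "WWCDEFGHWW")
```

Several seeds that fall inside the same region extend to the same segment, which is why the results are collected in a set.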

Multiple Sequence Alignment
Multiple sequence alignment (MSA) can be seen as a generalization of pairwise sequence alignment: instead of aligning just two sequences, three or more sequences are aligned simultaneously, for example:
a b a c d
a b _ c d
x b a c e
MSA is used for:
- Detection of conserved domains in a group of genes or proteins.
- Construction of a phylogenetic tree.
- Prediction of protein structure.
- Determination of consensus sequences.
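Two of these uses, consensus sequences and conserved positions, can be read directly off the columns of an alignment. The sketch below assumes the sequences are already aligned to equal length with "_" as the gap character and takes a simple majority vote per column; the function names are illustrative, and real tools usually weight sequences and report ambiguity.

```python
from collections import Counter


def consensus(alignment, gap: str = "_") -> str:
    """Most common non-gap character in each column of an alignment."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*alignment):
        counts = Counter(ch for ch in column if ch != gap)
        # A column made only of gaps stays a gap in the consensus.
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


def conserved_columns(alignment, gap: str = "_"):
    """Indices of columns where every sequence has the same non-gap character."""
    return [
        i for i, column in enumerate(zip(*alignment))
        if len(set(column)) == 1 and column[0] != gap
    ]


msa = ["abacd", "ab_cd", "xbace"]
consensus_sequence = consensus(msa)
fully_conserved = conserved_columns(msa)
```

Applied to the three-sequence example, the fully conserved columns are the ones holding b and c.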

CLUSTAL
A popular heuristic algorithm is CLUSTAL, by Des Higgins and Paul Sharp (1988). CLUSTAL makes a global multiple alignment using a “progressive alignment” approach:
1. It first computes all pairwise alignments and calculates the sequence similarity between each pair.
2. These similarities are used to build a rough guide tree.
3. The sequences are then aligned progressively in the order given by the guide tree, most similar first.
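The idea of turning pairwise similarities into a guide tree can be sketched as follows. This is not CLUSTAL's implementation: as a stand-in for alignment-based similarity it uses the ratio from Python's standard `difflib.SequenceMatcher`, and it joins clusters by simple average linkage. The function names and the example sequences are assumptions for the sketch.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score between 0 and 1 (not an alignment score)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def guide_tree(sequences: dict):
    """Repeatedly join the two most similar clusters; return a nested tuple."""
    # Each key is a (sub)tree; each value lists the sequence names under it.
    clusters = {name: [name] for name in sequences}

    def average_similarity(x, y) -> float:
        pairs = [(p, q) for p in clusters[x] for q in clusters[y]]
        return sum(similarity(sequences[p], sequences[q]) for p, q in pairs) / len(pairs)

    while len(clusters) > 1:
        x, y = max(combinations(clusters, 2), key=lambda pair: average_similarity(*pair))
        members = clusters.pop(x) + clusters.pop(y)
        clusters[(x, y)] = members
    return next(iter(clusters))


tree = guide_tree({"s1": "abacd", "s2": "abcd", "s3": "xbace"})
```

The nested tuple gives the joining order: the innermost pair is aligned first, and each enclosing level adds the next sequence or group to the growing alignment.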

Basic Information Comes From Sequence
- One sequence: some information can be obtained, e.g. amino acid properties.
- More than one sequence: more information on conserved residues, fold and function.
- Multiple alignments of related sequences: consensus sequences can be built for known families, domains, motifs or sites.
- Sequence alignments can give information on loops, families and function from conserved regions.
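As a small example of what a single sequence already provides, the sketch below computes amino acid composition and the fraction of hydrophobic residues. The set of residues counted as hydrophobic is an assumption (classifications differ between sources), and the function names are illustrative.

```python
from collections import Counter

# Assumed hydrophobic set; other classifications include or exclude some residues.
HYDROPHOBIC = set("AVILMFW")


def composition(sequence: str) -> dict:
    """Fraction of each amino acid in the sequence, keyed by one-letter code."""
    if not sequence:
        raise ValueError("sequence is empty")
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in sorted(counts.items())}


def hydrophobic_fraction(sequence: str) -> float:
    """Share of residues that belong to the assumed hydrophobic set."""
    if not sequence:
        raise ValueError("sequence is empty")
    seq = sequence.upper()
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq)


example_composition = composition("GGAV")
example_hydrophobic = hydrophobic_fraction("GGAV")
```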

APPLICATIONS OF PROTEIN SEQUENCING
- Recombinant protein synthesis.
- Drug production.
- Antibiotic production.
- Functional genomics.
- Determination of protein folding patterns.
- Bioinformatics.
- It plays a vital role in proteomics.
- Prediction of the final structure, function and location of a protein.
- Finding the location of the gene coding for a protein.
- Study of genetic diseases.
- Identification of sequence differences and variations such as point mutations.
- Revealing the evolution and genetic diversity of sequences and organisms.

THANK YOU
