INTRODUCTION
An important goal of genomics and proteomics is to determine whether a particular sequence resembles another sequence. This is done by comparing the new sequence with sequences that have already been reported and stored in a database. The comparison relies on alignment procedures to uncover the "like" sequences in the database. The alignment reveals the regions that are identical or closely similar and the regions with little (or no) similarity. Two alignment types are used: global and local.
BLAST
BLAST stands for Basic Local Alignment Search Tool. It was developed by Stephen Altschul, Warren Gish, Webb Miller, Eugene Myers, and David J. Lipman at NCBI in 1990. It is a local alignment tool: it finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches. BLAST can be used to infer functional and evolutionary relationships between sequences and to help identify members of gene families.
NCBI HOMEPAGE
NCBI-BLAST HOMEPAGE
TYPES
STEPS
1 - Specifying a sequence of interest
2 - Selecting a BLAST program
3 - Selecting a database
4 - Selecting optional parameters
5 - Selecting formatting parameters
PROCESS
The first step of the BLAST algorithm is to break the query into short words of a specific length. For example, twelve amino acids near the amino terminal of the Arabidopsis thaliana phosphoglucomutase protein sequence are:
NYLENFVQATFN
This sequence is broken down into three-character words by sliding a three-residue window along it one position at a time:
NYL YLE LEN ENF NFV FVQ VQA QAT ATF TFN
These words are then compared against the sequences in a database. For example, a word match with rabbit muscle phosphoglucomutase:
Query ENF
Subject SSTNYAENTIQSIISTVEPAQR
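The word-splitting step above can be sketched in a few lines of Python. This is only an illustration of the sliding window, not BLAST's own code; the function name `split_into_words` is hypothetical.

```python
def split_into_words(sequence: str, word_size: int = 3) -> list[str]:
    """Return every overlapping word of length word_size, left to right."""
    last_start = len(sequence) - word_size
    return [sequence[i:i + word_size] for i in range(last_start + 1)]


# The twelve-residue query fragment from the example above.
words = split_into_words("NYLENFVQATFN")
print(" ".join(words))
```

A sequence of length L yields L - w + 1 words of size w, so the twelve residues give ten three-letter words.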
This search is performed for all words. Words whose score exceeds the threshold T (18 in this example) are used as seeds to extend the alignment. For every pair of sequences (query and target) that have a word or words in common, BLAST extends the alignment in both directions for as long as the alignment score keeps increasing (the aligned regions become more similar); extension stops once the score begins to decrease. For example, consider the following alignment between the A. thaliana and rabbit muscle phosphoglucomutase:
Query NYLENFVQATFNALTAEKV
NY ENF+Q + + + +
Subject NYAENTIQSIISTVEPAQR
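The extension step can be illustrated with a minimal ungapped sketch. Several things here are assumptions made for illustration and are not taken from BLAST itself: the toy scoring (+2 for an identical residue, -1 otherwise) stands in for a real substitution matrix, the `x_drop` stopping rule is a simple stand-in for "stop when the score decreases", and the seed positions are supplied by hand. The function names are hypothetical.

```python
def score_pair(a: str, b: str) -> int:
    # Toy scoring (an assumption): +2 for identity, -1 for any mismatch.
    return 2 if a == b else -1


def extend_seed(query, subject, q_pos, s_pos, word_size=3, x_drop=3):
    """Extend an ungapped seed in both directions.

    Returns (score, q_start, q_end, s_start); q_end is exclusive.
    """
    seed_score = sum(
        score_pair(query[q_pos + i], subject[s_pos + i]) for i in range(word_size)
    )

    def extend(q_i, s_i, step):
        # Walk outward, remembering the best gain seen so far and how many
        # residues were needed to reach it.
        best = running = 0
        best_len = length = 0
        while 0 <= q_i < len(query) and 0 <= s_i < len(subject):
            running += score_pair(query[q_i], subject[s_i])
            length += 1
            if running > best:
                best, best_len = running, length
            elif best - running > x_drop:
                break  # score has fallen too far below the best: stop
            q_i += step
            s_i += step
        return best, best_len

    right_gain, right_len = extend(q_pos + word_size, s_pos + word_size, +1)
    left_gain, left_len = extend(q_pos - 1, s_pos - 1, -1)

    q_start = q_pos - left_len
    q_end = q_pos + word_size + right_len
    s_start = s_pos - left_len
    return seed_score + left_gain + right_gain, q_start, q_end, s_start


query = "NYLENFVQATFN"
subject = "NYAENTIQSIISTVEPAQR"
# Seed the query word ENF (position 3) against ENT in the subject (position 3).
score, q_start, q_end, s_start = extend_seed(query, subject, 3, 3)
print(score, query[q_start:q_end], subject[s_start:s_start + (q_end - q_start)])
```

With this toy scoring the seed grows leftward to the start of both sequences and rightward through the matching Q, then stops because the following mismatches pull the running score too far below its best value.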
Once this alignment process has been completed for the query and each subject sequence in the database, a report is generated. The report lists the alignments (50 by default) whose score is greater than the cutoff value S. An alignment whose score is above the cutoff is called a High-scoring Segment Pair (HSP). For each alignment reported, an Expect (E) value is also given.
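The reporting rule can be sketched as a filter followed by a sort. The `Alignment` record, the function name `report_hsps`, and the sample scores below are all hypothetical and exist only to show the cutoff and the default list size of 50.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    subject_id: str   # hypothetical identifier of the database sequence
    score: float      # alignment score


def report_hsps(alignments, s_cutoff, max_reported=50):
    """Keep alignments scoring above the cutoff S, best first, capped in number."""
    hsps = [a for a in alignments if a.score > s_cutoff]
    hsps.sort(key=lambda a: a.score, reverse=True)
    return hsps[:max_reported]


# Made-up scores, purely for illustration.
candidates = [Alignment("a", 12), Alignment("b", 45), Alignment("c", 30)]
for hsp in report_hsps(candidates, s_cutoff=20):
    print(hsp.subject_id, hsp.score)
```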
BLAST OUTPUT
The BLAST output is displayed in three formats:
Graphical display: shows where the query is similar to other sequences.
Hit list: the sequences similar to the query, ranked by similarity.
Alignment: every alignment between the query and the reported hits.
BLAST OUTPUT A. GRAPHICAL DISPLAY
The query sequence is at the top, with a colour key for alignment scores. Each bar represents the portion of another sequence that is similar to the query sequence:
Red bars: the most similar sequences (score of 200 or more).
Pink bars: good matches, but less similar (score 80 to 200).
Green bars: unimpressive matches (score 50 to 80).
Blue bars: weak matches (score 40 to 50).
Black bars: the worst-scoring hits (score below 40).
BLAST OUTPUT B. HIT LIST
1 - This portion of each description links to the sequence record for a particular hit.
2 - The score, or bit score, is a normalized value derived from the matches, substitutions, and gaps in each aligned sequence. The higher the score, the more significant the alignment.
3 - The E value (Expect value) is the number of alignments with an equal or better score that would be expected to occur in the database search by chance. The smaller the E value, the more significant the alignment.
4 - These links give the user direct access from the BLAST results to related entries in other databases. 'L' links to LocusLink records and 'S' links to structure records in NCBI's Molecular Modeling Database.
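The inverse relationship between bit score and E value can be made concrete with the standard approximation E = m x n x 2^(-S'), where S' is the bit score, m the query length, and n the total database length. The sketch below uses raw lengths; treat that as a simplifying assumption, since actual BLAST reports use adjusted (effective) lengths. The function name is hypothetical.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E value: search space (m * n) scaled by 2 ** -bit_score."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Each extra bit of score halves the number of chance hits expected.
for bits in (20, 30, 40):
    print(bits, expect_value(bits, query_len=300, db_len=10_000_000))
```

Because the search space multiplies the result, the same bit score gives a larger (less significant) E value against a larger database.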
BLAST OUTPUT C. ALIGNMENT
APPLICATIONS
BLAST can be used for several purposes. These include:
Identifying species
Establishing phylogeny
DNA mapping
Locating domains