Microsatellites
Powerful DNA markers for quantifying genetic variation within and between populations of a species
Also called STRs (short tandem repeats), SSRs (simple sequence repeats) or VNTRs (variable number tandem repeats)
Tandemly repeated DNA sequences with a repeat unit of 1 – 6 bases, repeated several times
Highly polymorphic; can be analysed with the help of PCR
Individual alleles at a locus differ in the number of tandem repeats of the unit sequence, owing to gain or loss of one or more repeats, and can be differentiated by electrophoresis according to their size
Short Tandem Repeats (STRs)
The repeat region is variable between samples, while the flanking regions where PCR primers bind are constant
[Figure: two alleles of an AATG repeat locus, one with 7 repeats and one with 8 repeats]
Homozygote = both alleles are the same length
Heterozygote = alleles differ and can be resolved
from one another
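The AATG example can be sketched in code. This is a minimal illustration, assuming hypothetical flanking sequences (LEFT_FLANK and RIGHT_FLANK are invented here) and calling zygosity purely from the repeat count:

```python
import re

def count_repeats(allele: str, motif: str) -> int:
    """Return the longest uninterrupted run of `motif`, counted in repeat units."""
    runs = re.findall(f"(?:{re.escape(motif)})+", allele)
    return max((len(run) // len(motif) for run in runs), default=0)

def zygosity(allele_a: str, allele_b: str, motif: str) -> str:
    """Same repeat count = homozygote; different counts = heterozygote."""
    same = count_repeats(allele_a, motif) == count_repeats(allele_b, motif)
    return "homozygote" if same else "heterozygote"

# Hypothetical constant flanking regions; only the repeat region varies.
LEFT_FLANK, RIGHT_FLANK = "GCTAGTCC", "TTGCACGA"
allele_7 = LEFT_FLANK + "AATG" * 7 + RIGHT_FLANK
allele_8 = LEFT_FLANK + "AATG" * 8 + RIGHT_FLANK

print(count_repeats(allele_7, "AATG"), count_repeats(allele_8, "AATG"))
print(zygosity(allele_7, allele_8, "AATG"))
```

In practice alleles are sized by electrophoresis rather than read from sequence, so this only mirrors the logic of the definition.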
Classification based on the number of base pairs in the repeat unit
1) Mono (e.g. CCCCCCCC or AAAAAA)
2) Di (e.g. CACACACACA)
3) Tri (e.g. CCA CCA CCA CCA)
4) Tetra (e.g. GATA GATA GATA GATA GATA GATA GATA)
Minisatellites: repeat units of 9 – 65 base pairs, repeated from 2 to several hundred times, e.g. a 32-bp unit repeated 5 times:
CGCCATTGTAGCCAATCCGGGTGCGATTGCAT
CGCCATTGTAGCCAATCCGGGTGCGATTGCAT
CGCCATTGTAGCCAATCCGGGTGCGATTGCAT
CGCCATTGTAGCCAATCCGGGTGCGATTGCAT
CGCCATTGTAGCCAATCCGGGTGCGATTGCAT
Microsatellites – Properties
Co-dominant
Inherited in Mendelian fashion
Polymorphic loci, with as many as 14 – 15 alleles per locus
Mostly reported from non-coding regions, hence can be independent of selection
Flanking regions are highly conserved in related species
Can be obtained from small amounts of tissue [STR analysis can be done on less than one billionth of a gram (a nanogram) of DNA (as in a single flake of dandruff)]
PAGE separation; silver staining/automated genotyping
Abundant in the eukaryote genome (~10^3 to 10^5 loci dispersed at 7 to 10-100 kilobase pair (kb) intervals)
Microsatellites and Human Diseases
Allele size variations at microsatellite loci in close proximity to (and showing linkage disequilibrium with) genes within the Human Major Histocompatibility Complex (MHC) region have been reported in association with the following genetic disorders:
•IDDM (Insulin Dependent Diabetes Mellitus)
•Multiple Sclerosis (MS)
•Narcolepsy
•Uveitis
Hence these genetic disorders can be detected by screening the allele sizes of the microsatellite loci (Type I markers) that are in close proximity to these genes (Goldstein & Schlotterer, 2001)
Some uses of DNA Profiling
•Forensic work on crime scenes
•Parentage testing
•Victim identification in mass disasters
•Animal identification, e.g. racehorses
•Conservation biology and population genetic studies
•Bone marrow transplant monitoring - to check that the transplanted marrow is still present
•Determination of maternal cell contamination in chorionic villus sampling (used to investigate the possibility that a fetus has a severe inherited disease) - is the tissue sample really fetal?
Parentage Testing – Why?
•Parentage - e.g. disputes over
who is the father of a child & is
thus responsible for child
support
•Determining whether twins are
identical or fraternal
•Immigration - establishing that
individuals are the true
children/parents/siblings in
cases of family reunification
Problems:
•Prior information on the genome is essential, as the primers are highly specific
•Stutter bands
•Large sample size required
•Mutations in primer binding sites and in the region between primers and repeats can create problems (null alleles and wrong genotyping)
•Homoplasy
Protocol:
•Extraction of DNA
•Primer designing – DNA/genomic libraries
•Cross-priming
•PCR amplification
•Manual visualization of microsatellites (PAGE and silver staining, then genotyping) or automated genotyping using fluorescent dyes
•Analysis of data and interpretation of results
GUIDELINES FOR PCR PRIMER DESIGN
Sequence Specificity:
Complementary to the flanking regions of the DNA to be amplified
Primer Length:
Complex enough that there is no likelihood of finding the sequence anywhere other than the target
The probability of a given 16-base sequence occurring by chance is 1 in 4^16 (about once in 4 billion bases)
Minimum length: 17
Ideal length range: 17-30 nucleotides
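The length argument can be checked with a few lines of arithmetic. This is a rough sketch that assumes a random genome with equal base frequencies; the human genome size of about 3.2 billion bp is an approximate figure added here for illustration:

```python
def expected_random_hits(primer_length: int, genome_size_bp: int) -> float:
    """Approximate number of chance matches of one specific primer
    on one strand of a random genome with equal base frequencies."""
    return genome_size_bp / 4 ** primer_length

HUMAN_GENOME_BP = 3_200_000_000  # assumed approximate haploid genome size

print(4 ** 16)  # number of distinct 16-base sequences
for length in (12, 16, 17, 20):
    print(length, expected_random_hits(length, HUMAN_GENOME_BP))
```

Under these assumptions a 12-mer is expected to match by chance many times in a genome of this size, while primers of 17 bases or more are expected to match less than once.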
Primer orientation
•If both the forward and reverse primer sequences are designed from the sequence data of the coding strand of the DNA (but not for microsatellites), keep in mind that the reverse primer has to match the sequence of the non-coding strand, i.e. it is the reverse complement of the coding-strand sequence.
5’---ATGGCTTAAGCGGGTATTGCTTAGAA----3’
                   3’--CGAATCTT----5’   (reverse primer)
3’---TACCGAATTCGCCCATAACGAATCTT----5’
5’---ATGGCTTAAG----3’                   (forward primer)
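The orientation rule amounts to taking the reverse complement of the 3' end of the coding strand. This is a minimal sketch using the coding-strand sequence from the diagram; the primer lengths (10 and 8 bases) simply follow the diagram and are shorter than a real primer would be:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse, so the result reads 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

coding_strand = "ATGGCTTAAGCGGGTATTGCTTAGAA"  # written 5'->3'

# Forward primer: identical to the 5' end of the coding strand.
forward_primer = coding_strand[:10]
# Reverse primer: reverse complement of the 3' end of the coding strand.
reverse_primer = reverse_complement(coding_strand[-8:])

print(forward_primer)  # ATGGCTTAAG
print(reverse_primer)  # TTCTAAGC (shown 3'->5' in the diagram as CGAATCTT)
```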
Base Composition:
•A G≡C pair has 3 hydrogen bonds - stronger pairing - and needs more heat than an A=T pair for separation
•Ideal GC content: 35-60%
•Difference in GC content of the 2 primers should be not more than 5%
•Higher GC content leads to poor denaturation even at 95 °C
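The base-composition rules are easy to check programmatically. This is a minimal sketch, assuming the thresholds above (35-60% GC, at most 5 percentage points difference between primers); the second primer is a hypothetical example:

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)

def gc_ok(fwd: str, rev: str, low: float = 35.0, high: float = 60.0,
          max_diff: float = 5.0) -> bool:
    """True if both primers fall in the GC window and differ by <= max_diff."""
    gc_f, gc_r = gc_percent(fwd), gc_percent(rev)
    in_window = low <= gc_f <= high and low <= gc_r <= high
    return in_window and abs(gc_f - gc_r) <= max_diff

primer_a = "TGGCTTACGAATCGC"       # 8 of 15 bases are G/C
primer_b = "ATGGCTTAAGCGGGTATTGC"  # hypothetical partner, 10 of 20 are G/C

print(round(gc_percent(primer_a), 1), gc_percent(primer_b), gc_ok(primer_a, primer_b))
```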
Melting Temperature (Tm values)
Both primers should have similar Tm values.
The Tm values of both primers should be within 1-4 °C of each other, and the Tm values should be in the range of 40-60 °C
GUIDELINES FOR PCR PRIMER DESIGN AND USE
Calculation of Annealing temperature:
* When using a primer for the first time, use an annealing temperature 5 °C below the melting temperature (Tm).
•A simple formula for estimating the Tm:
Tm = 2 °C x (number of A & T residues) + 4 °C x (number of G & C residues)
Note: Applicable for primers of 15-25 bases
Example:
TGGCTTACGAATCGC (8 G/C, 7 A/T): Tm = 4 x (8) + 2 x (7) = 46 °C; annealing temperature = 46 - 5 = 41 °C
Note: Optimal annealing temperature may differ from calculated values.
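The 2 °C / 4 °C formula (often called the Wallace rule) and the "5 °C below Tm" starting point can be sketched as follows. The function names are illustrative only; counting the bases of the example primer directly gives 7 A/T and 8 G/C residues:

```python
def wallace_tm(primer: str) -> int:
    """Estimate Tm in degrees C as 2 x (A+T) + 4 x (G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc

def first_annealing_temp(primer: str, offset: int = 5) -> int:
    """Starting annealing temperature: Tm minus 5 degrees C by default."""
    return wallace_tm(primer) - offset

primer = "TGGCTTACGAATCGC"
print(wallace_tm(primer))            # 2 x 7 + 4 x 8
print(first_annealing_temp(primer))
```

This is only a first estimate; as noted, the optimal annealing temperature is found empirically.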
Specifications on complementarity between sequences of primer pairs
Avoid complementarity between the primer pairs
Avoid complementarity of two or three bases at the 3’ ends of primer pairs to reduce primer-dimer formation.
Primer dimer formation
5’ ATGCCTAGCTTCCGGAT 3’
         3’ AAGGCCTACATTTAGCCTAGT 5’
Avoid mismatches between the 3’ end of
the primer and the target-template
sequence.
Avoid complementary sequences within a primer.
Avoid runs of 3 or more Gs & Cs at the 3’ end.
Avoid a T at the 3’ end (primers with a T at the 3’ end have a greater tolerance for mismatches).
Have GC-rich 3’ ends (“GC clamp”): G≡C pairs have 3 hydrogen bonds and bind more strongly than A=T pairs. One or two G or C bases at the 3’ end (elongation end) of the primer make that end more stable and can increase PCR yield, provided the run stays below the 3-base limit above.
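The 3' end rules can be screened with a short script. This is a sketch under stated assumptions: the two primers are read from the primer-dimer diagram (de-duplicated from the slide text, so the exact sequences are an assumption), and the check only finds perfect end-to-end pairing of the two 3' ends, not every possible dimer:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]

def three_prime_overlap(primer_a: str, primer_b: str) -> int:
    """Longest n for which the last n bases of both primers pair perfectly."""
    a, b = primer_a.upper(), primer_b.upper()
    for n in range(min(len(a), len(b)), 0, -1):
        if a[-n:] == reverse_complement(b[-n:]):
            return n
    return 0

def three_prime_warnings(primer: str) -> list:
    """Flag a 3' terminal T and a run of three G/C bases at the 3' end."""
    p = primer.upper()
    warnings = []
    if p.endswith("T"):
        warnings.append("3' end is T")
    if len(p) >= 3 and all(base in "GC" for base in p[-3:]):
        warnings.append("run of 3 or more G/C at the 3' end")
    return warnings

primer_1 = "ATGCCTAGCTTCCGGAT"      # 5'->3'
primer_2 = "TGATCCGATTTACATCCGGAA"  # 5'->3' (drawn 3'->5' in the diagram)

print(three_prime_overlap(primer_1, primer_2))
print(three_prime_warnings(primer_1))
```

An overlap far above the two or three bases mentioned earlier, as in this pair, signals a strong risk of primer-dimer formation.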
Ambiguity/degenerate Codes
Code (forward strand)   Bases               Complementary code (reverse strand)
Y                       C or T              R
R                       A or G              Y
H                       A or C or T         D (A or G or T)
N                       A or C or G or T    N
W                       A or T              W
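A degenerate primer stands for a pool of plain sequences, and the fold degeneracy is the size of that pool. This is a minimal sketch covering only the ambiguity codes listed on this slide (the full IUPAC table has more); the example primer ATGRYW is hypothetical:

```python
from itertools import product

# Only the codes listed above, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "H": "ACT", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Total fold degeneracy: product of the options at each position."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total

def expand(primer: str) -> list:
    """List every plain sequence represented by the degenerate primer."""
    options = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*options)]

primer = "ATGRYW"  # hypothetical degenerate primer
print(degeneracy(primer))
print(expand(primer))
```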
Guidelines for degenerate-primer design
Primer sequence:
Avoid degeneracy in the last 3 nucleotides at the 3’ end.
To increase primer-template binding efficiency, reduce degeneracy by allowing some mismatches between the primer and template, especially towards the 5’ end (but not the 3’ end).
Try to design primers with less than 4-fold degeneracy at any given position.
Begin PCR with a primer concentration of 0.2 µM.
In case of poor PCR efficiency, increase the primer concentration in increments of 0.25 µM until a satisfactory result is obtained.
Concentration:
*Spectrophotometric conversion of primers: 1 A260 unit ~ 20-30 µg/ml. Use 0.1-0.5 µM of each primer in PCR. For most applications, 0.2 µM is sufficient.
Storage:
Lyophilized primers should be dissolved in a small volume of distilled water or TE to make a concentrated stock solution.
Prepare small aliquots of working solutions containing 10 pmol/µl to avoid repeated thawing and freezing. Store all primer solutions at –20 °C.
Primer quality can be checked on a denaturing
polyacrylamide gel: a single band should be seen.
Guidelines for determining number of PCR cycles.
Amount of starting material                        Single copy        Number of
1-kb DNA fragment   E. coli DNA     Human DNA      targets            PCR cycles
0.01-0.11 fg        0.05-0.56 pg    36-360 pg      10-100             40-45
0.11-1.1 fg         0.56-5.56 pg    0.36-3.6 ng    100-1000           35-40
1.1-55 fg           5.56-278 pg     3.6-179 ng     1 x 10^3-5 x 10^4  30-35
>55 fg              >278 pg         >179 ng        >5 x 10^4          25-35
Software available on the Internet for designing primers
•http://www.genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi
•http://www.hgmp.mrc.ac.uk/
•http://biobase.dk/index.html
•http://bcf.drl.arizona.edu/gcg.html
•http://www.blocks.fhcrc.org/
•http://bioinformatics.weizman.ac.il/blocks/index.html
•http://alces.med.umn.edu/webprimers.html
•http://www.willamstone.com/
•http://doprimer.interactiva.de/
•http://dot.imgen.bcm.tmc.edu:9331/seq_util/seq_util.html
Commercially available computer software:
Primer 3; Primer Designer 1.0 (Scientific software, 1990); Oligo (Rychlik and Rhodes, 1989) can be used for designing primers.
Microsatellites: PCR reaction mixture & concentration (volume per reaction)
Double distilled water : 18.0 µL
Assay buffer (10X; Genei, Bangalore, India) (100mM Tris, 500mM KCl, 0.1% gelatin, pH9) (final conc. 1X) : 2.5 µL
dNTPs (Genei, Bangalore, India) (200 mM) : 2.0 µL
Primer (forward & reverse working solution) (total conc. ~ 10.0 pmoles in 25 µl of master mix) : 0.5 µL
MgCl2 (1.5mM) : 0.5 µL
Taq polymerase (Genei, Bangalore, India) (3 Units/µl) : 0.5 µL
Template DNA (25ng) : 1.0 µL
Total volume : 25.0 µL
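When many samples are run, the per-reaction volumes are usually scaled into a master mix. This is a minimal sketch, assuming the volumes above are in microlitres and that the template is added to each tube separately; the 10% pipetting overage is an assumption, not part of the protocol above:

```python
# Per-reaction volumes in microlitres, taken from the reaction mixture above.
PER_REACTION_UL = {
    "double distilled water": 18.0,
    "10X assay buffer": 2.5,
    "dNTPs": 2.0,
    "primers (forward + reverse)": 0.5,
    "MgCl2": 0.5,
    "Taq polymerase": 0.5,
    "template DNA": 1.0,
}

def master_mix(n_reactions: int, extra_fraction: float = 0.1) -> dict:
    """Scale every component except the template, with optional overage."""
    scale = n_reactions * (1 + extra_fraction)
    return {
        name: round(volume * scale, 2)
        for name, volume in PER_REACTION_UL.items()
        if name != "template DNA"
    }

print(sum(PER_REACTION_UL.values()))  # total volume per reaction
for name, volume in master_mix(12).items():
    print(f"{name}: {volume} uL")
```

Each tube would then receive 24.0 µL of master mix plus 1.0 µL of template DNA.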
Microsatellite PCR
10% non-denaturing polyacrylamide gel electrophoresis (PAGE) to separate the PCR products
Acrylamide (19:1 acrylamide and bisacrylamide) : 5 mL
Double distilled water : 2 mL
5 x TBE : 2 mL
10% Ammonium persulphate : 70 µL
TEMED : 3.5 µL
[Figure: single-locus microsatellite gel (PAGE & silver staining); there are five alleles at this locus. Lane order: M, 1-11, -ve, M]
M: Standard molecular weight marker (pBR322 DNA/MspI digest).
1-11: Different individuals; -ve: Negative control
Microsatellites - multiplexing & use of fluorescent dyes
Microsatellite multiplexing
[Figure: a sample print-out for one person, showing all 16 loci tested. Different colours help with interpretation]
Automated Genotyping
FBI’s CODIS DNA Database
Combined DNA Index System
•Launched October 1998
•Used for linking serial crimes and
unsolved cases with repeat offenders
•Links all 50 states
•Requires >4 RFLP markers and/or 13
core STR markers
As of June, 2004
•Total profiles = 1,857,093
•Total forensic profiles = 85,477
•Total convicted offender = 1,771,616
Population database
•Look up how often each allele occurs at the locus
in a population (the “allele” frequency)
STRs - Points to be taken care of:
During setting up of PCR
•All the components should be always on ice.
•To avoid contamination, area for PCR set up and gel
running area should be separate.
•Pipetting should be done precisely.
Visualisation of PCR products
•Avoid mixing of different PCR products while loading to the
gel.
•Always load standard DNA molecular weight marker in one
lane of the gel for determination of molecular weight.
•Standardise the running conditions for every new fragment so that all alleles are resolved.
Genotyping
Individuals with a single band are homozygotes, while heterozygotes have two bands. On this basis, each individual is genotyped as homozygote/heterozygote with allele designation. The alleles are designated on the basis of their sizes in base pairs.
Example: An individual showing (i) one band of 100 bp is genotyped as 100100, and (ii) two bands of 100 and 120 bp is genotyped as 100120.
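The coding rule in the example can be written as a small helper. This is a sketch only: padding each allele to three digits and putting the smaller allele first are assumptions made to match the 100100 / 100120 example:

```python
def genotype_code(band_sizes_bp: list) -> str:
    """Turn one or two band sizes (bp) into a six-digit genotype string."""
    sizes = sorted(set(band_sizes_bp))
    if len(sizes) == 1:
        first = second = sizes[0]  # single band: homozygote
    elif len(sizes) == 2:
        first, second = sizes      # two bands: heterozygote
    else:
        raise ValueError("expected one or two bands for a diploid single-locus genotype")
    return f"{first:03d}{second:03d}"

print(genotype_code([100]))       # homozygote
print(genotype_code([120, 100]))  # heterozygote
```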
Analysis of Data
The allelic frequencies of multiple collections from the same river/population in different years are tested for significant homogeneity (G-test). The genotype data for the same river/population exhibiting homogeneity are pooled, and the combined data sets are used for further analysis.
Genetic Variability:
•Number & Percentage of polymorphic loci
•Observed & effective number of alleles
•Linkage disequilibrium
•Allele frequency
•Heterozygosity – observed & expected
•Private alleles
•Hardy-Weinberg expectation, with Bonferroni correction to control Type I error across multiple tests (in an indefinitely large, randomly mating population, the allelic/gene frequencies (i.e. proportions of the different alleles of a gene) remain unchanged generation after generation in the absence of mutation, migration, selection & genetic drift); (p + q)^2 = p^2 + 2pq + q^2 = 1.
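Several of the variability measures listed above follow directly from the genotype table. This is a minimal single-locus sketch with a made-up sample of four individuals; the expected heterozygosity here is the simple 1 - sum(p^2) form without small-sample correction, which packages such as GENEPOP or POPGENE may handle differently:

```python
from collections import Counter

def allele_frequencies(genotypes: list) -> dict:
    """Frequency of each allele across all diploid genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def observed_heterozygosity(genotypes: list) -> float:
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def expected_heterozygosity(genotypes: list) -> float:
    """Hardy-Weinberg expectation: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in allele_frequencies(genotypes).values())

def effective_number_of_alleles(genotypes: list) -> float:
    """Reciprocal of the sum of squared allele frequencies."""
    return 1.0 / sum(p * p for p in allele_frequencies(genotypes).values())

# Hypothetical sample: allele sizes in bp for four individuals at one locus.
genotypes = [(100, 100), (100, 120), (120, 120), (100, 120)]

print(allele_frequencies(genotypes))
print(observed_heterozygosity(genotypes), expected_heterozygosity(genotypes))
print(effective_number_of_alleles(genotypes))
```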
Genetic differentiation/Heterogeneity
•Co-efficient of genetic differentiation (FST)
(FST is the index of genetic differentiation that describes how much variation in allele frequencies is present between the local populations. It is a measure of population differentiation and ranges from 0, where all the populations have the same allele frequencies at all the loci, to 1.0, where all the populations are fixed for different alleles at all loci)
•Genetic Distance & Similarity Index
•Dendrogram (UPGMA)
•Software used: GENEPOP, GENETIX, POPGENE, WINBOOT
•Interpretation of Results
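The 0-to-1 behaviour of FST described above can be illustrated with a simple heterozygosity-based calculation, FST = (HT - HS) / HT, for one locus. This is a sketch with equal population weights and made-up allele frequencies; it is not the Weir & Cockerham estimator reported by GENEPOP:

```python
def expected_het(freqs: dict) -> float:
    """Expected heterozygosity from a dict of allele -> frequency."""
    return 1.0 - sum(p * p for p in freqs.values())

def fst(pop_freqs: list) -> float:
    """FST = (HT - HS) / HT for one locus, populations weighted equally."""
    n_pops = len(pop_freqs)
    alleles = set().union(*pop_freqs)
    # HS: mean expected heterozygosity within populations.
    h_s = sum(expected_het(freqs) for freqs in pop_freqs) / n_pops
    # HT: expected heterozygosity from the mean allele frequencies.
    mean_freqs = {
        allele: sum(freqs.get(allele, 0.0) for freqs in pop_freqs) / n_pops
        for allele in alleles
    }
    h_t = expected_het(mean_freqs)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t

same = [{100: 0.5, 120: 0.5}, {100: 0.5, 120: 0.5}]  # identical frequencies
fixed = [{100: 1.0}, {120: 1.0}]                      # fixed for different alleles

print(fst(same), fst(fixed))
```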
Allele frequencies of multiple collections of the same river are tested for significant genetic homogeneity using the exact test or G-test; if found homogeneous (P>0.001), all the samples of different collections of the same river are pooled into one population
Test for linkage disequilibrium (pair-wise loci); select all the loci that are in linkage equilibrium (LE; P>0.05) and discard those that are in linkage disequilibrium (LD)
Null alleles (only for microsatellites, using MICRO-CHECKER); if not significant (P>0.05), select those loci for further analysis
Genetic variability: % polymorphic loci; observed & effective number of alleles; private alleles; observed and expected heterozygosity (for microsatellites and allozymes); average gene diversity (H) (only for RAPDs)
HWE (only for microsatellites and allozymes): FIS (W&C) in Genepop. If FIS is +ve, deficiency of heterozygotes; if FIS is -ve, excess of heterozygotes; in both cases, deviation from HWE (ideal 0.0000)
Genetic differentiation: Wright’s FST (pair-wise & overall) and RST (only for microsatellites - allele size); overall GST (only for RAPDs)
Genetic distance & Similarity
Index (pair-wise populations)
UPGMA Dendrogram
AMOVA (only for allozymes &
microsatellites – hierarchical analysis
of genetic variation)
BOTTLENECK (only for allozymes -
IAM & microsatellites - TPM)
DATA ANALYSIS – STEPS INVOLVED
Sampling; Detection of polymorphic markers;
Multiple collections from the same river; Data analysis