gwas-intro-lombard_11111111111111111.pdf

bhagwatbiotech 5 views 29 slides May 28, 2024


GENOME-WIDE
ASSOCIATION STUDIES ~
GWAS
Zané Lombard; Wits Bioinformatics

Identifying disease genes
 Identifying genes that contribute to disease risk is one
of the main objectives of molecular research
 Such findings have contributed to improvements in
diagnosis, prognosis and therapy
 With the successful identification of disease genes for
many single-gene disorders, the focus has shifted to
diseases with a complex, multifactorial aetiology
 What are the approaches available?

How our genome can influence disease
 Chromosomal abnormalities
 Mutation (direct cause and effect)
[Figure: normal red blood cells vs sickle red blood cells]

How our genome can influence disease
 Multifactorial inheritance

Identifying disease genes – association
 Association refers to the co-occurrence of a
genetic variant with a disease trait, more
frequently than can be readily explained by
chance.
 Two approaches to association studies:
   Candidate gene approach
   Genome-wide approach
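To make "more frequently than can be readily explained by chance" concrete, here is a minimal sketch of an allelic association test on a 2x2 table of allele counts. The counts and the function name `allelic_association` are hypothetical and chosen only for illustration; the slides do not prescribe this particular test.

```python
import math

def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 df) and odds ratio for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in each cell if allele and disease status were independent
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio

# Hypothetical data: 1,000 cases and 1,000 controls, so 2,000 alleles per group.
chi2, p_value, odds_ratio = allelic_association(600, 1400, 500, 1500)
print(f"chi2={chi2:.2f}  p={p_value:.2g}  OR={odds_ratio:.2f}")
```

A small p-value here says only that the variant and the trait co-occur more often than chance would predict in this sample; it does not by itself show that the variant is causal.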

© Francis Collins, 2008

How is GWAS possible?
 Decide to genotype 12 million common SNPs
 Collect 1,000 cases and 1,000 controls
 Genotype all DNAs for all SNPs
 That adds up to 24 billion genotypes
 Imagine this approach cost 50 cents per genotype.
 That’s R12 billion for each disease/trait – completely out of the question!
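The arithmetic behind these figures can be checked in a few lines. All numbers come from the slide; the variable names are illustrative, and "R" is assumed to be the slide's currency symbol.

```python
n_snps = 12_000_000        # common SNPs to genotype (from the slide)
n_cases = 1_000
n_controls = 1_000
cost_per_genotype = 0.50   # 50 cents per genotype (the slide's hypothetical price)

# Every individual is genotyped at every SNP.
n_genotypes = n_snps * (n_cases + n_controls)
total_cost = n_genotypes * cost_per_genotype

print(f"{n_genotypes:,} genotypes, R{total_cost:,.0f}")
# prints "24,000,000,000 genotypes, R12,000,000,000"
```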

IMPORTANT CONCEPTS
FACILITATING GWAS

SNP discovery
 Most common type of genetic variation
 In dbSNP build 138 (the latest at the time of these slides) there are 62,676,337 SNPs
 SNPs are scattered throughout the genome
 The abundance of SNPs and the ease with which they
can be measured make them useful genetic markers
[Figure: a single locus with alleles A and C; genotype AC]

Common Disease – Common Variant
Hypothesis
 Common disorders are likely influenced by genetic
variation that is also common in the population.
 Common variants have small effects (low penetrance)
 Multiple variants contribute to disease susceptibility

HapMap and Linkage Disequilibrium

tagSNPs & Indirect Association
 Based on analysis of data from the HapMap project,
>80% of commonly occurring SNPs in populations of
European descent can be captured using a subset of 500,000
to 1,000,000 SNPs scattered across the genome.
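The idea of "capturing" many SNPs with a smaller subset can be sketched with the pairwise linkage disequilibrium measure r² and a simple greedy selection. This is an illustrative toy, not the procedure used by HapMap or by any array vendor: the haplotypes, the SNP names, the `greedy_tag_snps` helper and the r² ≥ 0.8 threshold are all assumptions made for the example.

```python
def r_squared(hap_a, hap_b):
    """LD r^2 between two biallelic SNPs, each a list of 0/1 alleles per haplotype."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0

def greedy_tag_snps(haplotypes, threshold=0.8):
    """Pick tags until every SNP is a tag or has r^2 >= threshold with a tag."""
    untagged = set(haplotypes)
    tags = []
    while untagged:
        # Choose the SNP that covers the most still-untagged SNPs (ties: name order).
        best = max(
            sorted(untagged),
            key=lambda s: sum(
                r_squared(haplotypes[s], haplotypes[t]) >= threshold for t in untagged
            ),
        )
        tags.append(best)
        covered = {
            t for t in untagged
            if r_squared(haplotypes[best], haplotypes[t]) >= threshold
        }
        untagged -= covered
        untagged.discard(best)  # guarantees progress even for a monomorphic SNP
    return tags

# Hypothetical phased data: 8 haplotypes at 4 SNPs.
haplotypes = {
    "snp1": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp2": [0, 0, 0, 0, 1, 1, 1, 1],  # perfectly correlated with snp1
    "snp3": [0, 0, 0, 1, 1, 1, 1, 1],
    "snp4": [0, 1, 0, 1, 0, 1, 0, 1],
}
tags = greedy_tag_snps(haplotypes)
print(tags)  # snp2 is captured by snp1, so it does not need to be genotyped
```

Genotyping only the tags and relying on LD to pick up signal at the untyped SNPs is what the slide title calls indirect association.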

Technology advances
Affymetrix
Illumina

Designing a GWAS
 Phenotype measures
 Case-control vs Quantitative design
 Sample size & Power
 Population considerations

Population stratification
[Figure sequence (four slides): cases vs controls, including separate panels for Population group 1 and Population group 2]

ANALYZING GWAS
DATA

What does the data look like?
 Image files (approx. 400 MB per individual)
 Data files (approx. 600 MB)

Analysis of a GWAS
 Quality control
 Statistical testing
 Covariates
 Multiple testing

Quality Control
 Batch effects
 SNP quality control
   Missingness
   Hardy-Weinberg equilibrium (HWE)
   Minor allele frequency (MAF)
 Sample quality control
   Missingness
   Relatedness!
   Population structure
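The three SNP-level checks listed above can be sketched for a single SNP as follows. The genotype coding (0/1/2 copies of one allele, `None` for a missing call), the function names and the cut-offs in `passes_qc` are assumptions for illustration; the slides do not give thresholds, and real studies choose their own.

```python
import math

def snp_qc(genotypes):
    """Return (missingness, MAF, HWE p-value) for one SNP.

    genotypes: list of 0/1/2 allele counts per sample, None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 1.0, 0.0, 1.0  # nothing was called, so nothing can be tested
    missingness = 1 - n / len(genotypes)

    n_hom_ref, n_het, n_hom_alt = called.count(0), called.count(1), called.count(2)
    p = (2 * n_hom_alt + n_het) / (2 * n)  # frequency of the counted allele
    q = 1 - p
    maf = min(p, q)

    # Genotype counts expected under Hardy-Weinberg equilibrium
    expected = [q * q * n, 2 * p * q * n, p * p * n]
    observed = [n_hom_ref, n_het, n_hom_alt]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    hwe_p = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return missingness, maf, hwe_p

def passes_qc(genotypes, max_missing=0.05, min_maf=0.01, min_hwe_p=1e-6):
    """Apply illustrative (assumed) thresholds to one SNP."""
    missingness, maf, hwe_p = snp_qc(genotypes)
    return missingness <= max_missing and maf >= min_maf and hwe_p >= min_hwe_p

# Hypothetical SNP in HWE proportions with 5 missing calls out of 105 samples.
good_snp = [0] * 36 + [1] * 48 + [2] * 16 + [None] * 5
# Hypothetical SNP with no heterozygotes at all, a typical genotyping artefact.
bad_snp = [0] * 50 + [2] * 50

print(snp_qc(good_snp), passes_qc(good_snp))
print(snp_qc(bad_snp), passes_qc(bad_snp))
```

The sample-level checks (relatedness, population structure) need genome-wide data across many SNPs and are not covered by this per-SNP sketch.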

Statistical testing - Association
 Quantitative traits
   Generalized linear model (GLM) approaches
 Case-control
   Logistic regression
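For the case-control branch, a single-SNP logistic regression can be sketched in plain Python. This is a minimal illustration under assumptions not stated in the slides: an additive genotype coding (0/1/2 copies of the tested allele), no covariates, a Newton-Raphson fit and a Wald test. In practice an established statistics package would be used, and `logistic_fit` and `wald_p_value` are hypothetical helper names.

```python
import math

def logistic_fit(genotypes, phenotypes, n_iter=25):
    """Fit logit P(case) = b0 + b1 * genotype by Newton-Raphson.

    Returns (b0, b1, se_b1). Assumes the genotype varies and the data are
    not perfectly separated, otherwise the fit does not converge.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix X'WX
        for x, y in zip(genotypes, phenotypes):
            mu = 1 / (1 + math.exp(-(b0 + b1 * x)))
            w = mu * (1 - mu)
            g0 += y - mu
            g1 += (y - mu) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) @ gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Variance of b1 is the matching diagonal entry of the inverse information,
    # taken from the last iteration (at convergence the step is negligible).
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

def wald_p_value(beta, se):
    """Two-sided p-value for H0: beta = 0 using a normal approximation."""
    return math.erfc(abs(beta / se) / math.sqrt(2))

# Hypothetical data: 100 cases and 100 controls with genotype counts 0/1/2.
genotypes = [0] * 30 + [1] * 50 + [2] * 20 + [0] * 50 + [1] * 40 + [2] * 10
phenotypes = [1] * 100 + [0] * 100   # 1 = case, 0 = control

b0, b1, se_b1 = logistic_fit(genotypes, phenotypes)
print(f"per-allele odds ratio = {math.exp(b1):.2f}, p = {wald_p_value(b1, se_b1):.3g}")
```

Covariates such as age, sex or ancestry components would enter as extra predictors in the same model, which is one reason regression is preferred over a simple contingency-table test.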

Manhattan plots & Regional plots

Multi-locus Analysis: Allele Scoring

Multiple testing
 The cumulative likelihood of finding one or more
false positives over the entire GWAS analysis is
high.
 Bonferroni correction:
   Adjust α = 0.05 to α = (0.05/k), where k is the
number of statistical tests conducted.
 False-discovery rate
 Permutation testing
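The first two corrections listed above can be sketched in a few lines. The family-wise error formula assumes independent tests, the choice of k = 1,000,000 is an assumption matching the upper end of the tag-SNP range mentioned earlier, and the p-values are made up for illustration. Permutation testing is not shown.

```python
def family_wise_error_rate(alpha, k):
    """P(at least one false positive) over k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

def bonferroni_threshold(alpha, k):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / k

def benjamini_hochberg(p_values, fdr=0.05):
    """Indices of tests declared significant by the Benjamini-Hochberg procedure."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its step-up threshold
        if p_values[i] <= rank / k * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# With only 20 uncorrected tests, a false positive is already more likely than not.
print(round(family_wise_error_rate(0.05, 20), 3))

# Assumed k = 1,000,000 tests gives a per-test threshold of about 5e-8.
print(bonferroni_threshold(0.05, 1_000_000))

# Hypothetical p-values from 8 tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_values))
```

On these eight p-values Bonferroni (threshold 0.05/8 = 0.00625) keeps only the first test, while the false-discovery-rate procedure keeps the first two, which illustrates why FDR control is the less conservative option.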

Finishing your GWAS work
 Replication
 Meta Analysis
 Functional Studies