Sample size determination

13,044 views 32 slides Feb 26, 2021

About This Presentation

A sample design is a definite plan for obtaining a sample from a given population. The researcher must select or prepare a sample design that is reliable and appropriate for the research study.


Slide Content

SAMPLE SIZE
DETERMINATION
AUGUSTINE GATIMU NJUGUNA
Augustine Gatimu Njuguna - PhD (Epidemiology) Candidate - JKUAT, FUoN
(Epidemiology) Candidate - UoN, FUoN (Health Informatics) - UoN. MSc.
Medical Statistics - UoN. BScN - KeMU.

Session Outline
•What is sample size?
•Basic information needed for sample size calculation.
•Why to determine sample size?
•How large a sample do we need?
•What are the methods of determining it?
•What are the factors that affect it?
•Types of measurement in research.
•How do we determine sample size?
•Conclusion

What is a Sample?
•This is the sub-population to be studied in order to draw an inference about a reference population (the population to which the findings of the study are to be generalized).
•In a census, the sample size is equal to the population size.
•In research, however, time and budget constraints mean that a representative sample is normally used.
•The larger the sample, the more accurate the findings from a study will be.

Cont’d……………
•Availability of resources sets the upper limit of the sample size.
•Required accuracy sets the lower limit of the sample size.
•Thus, an optimum sample size is an essential component of any research.

Basic Information Needed for
Sample Size Calculation
The approach to sample size calculation can be arrived at by thinking through the
following set of questions:
•What type of study is this?
Single sample (prevalence survey)
Comparison of two groups (cross-sectional, case-control, cohort study)
•What is the main (primary) outcome?
Mean of a measurement (mean blood pressure)
Proportion
Ordered scale (pain scores)
•What is the expected variability between the subjects?
•How large a difference would be considered clinically important and reasonable?

What is sample size determination?
•Sample size determination is the mathematical estimation of the number of subjects or units to be included in a study.
•When a representative sample is taken from a population, the findings are generalized to the population.
•Optimum sample size determination is required for the following reasons:
To allow appropriate analysis
To provide the desired level of accuracy
To lend validity to the significance test

How large a sample do we need?
If the sample is too small:
1.Even a well-conducted study may fail to answer its research question.
2.It may fail to detect important effects or associations.
3.It may estimate this effect or association imprecisely.

Cont’d……………
If the sample size is too large:
1.The Study will be difficult and costly.
2.Time constraint.
3.Loss of accuracy.
Hence, optimum sample size must be determined before
commencement of a Study.

Types of Measurement in Research
•Random error
•Systematic error (bias)
•Precision (reliability)
•Accuracy (Validity)
•Effect size
•Design effect
•Type I (α) error
•Type II (β) error
•Power (1-β)
•Null hypothesis
•Alternative hypothesis

Definition of terms
•Random error: Errors that occur by chance. Sources are sample variability, subject-to-subject differences & measurement errors. These can be reduced by averaging, increasing the sample size or repeating the experiment.
•Systematic error: Deviations not due to chance alone. Several factors, e.g. patient selection criteria, may contribute. It can be reduced by good study design and conduct of the experiment.
•Precision: The degree to which a variable has the same value when measured several times. It is a function of random error.
•Accuracy: The degree to which a variable actually represents the true value. It is a function of systematic error.

Cont’d……………
•Power: This is the probability that the test will correctly identify a significant difference, effect or association in the sample should one exist in the population. Power increases with sample size: the larger the sample, the greater the power of the study to detect a significant difference, effect or association.
•Effect size: A measure of the strength of the relationship between two variables in a population. The bigger the effect in the population, the easier it will be to detect.
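One way to see how power grows with sample size is a small Python sketch based on the normal approximation for comparing two proportions. The formula, the function name `approx_power` and the input values (p = 0.4, difference = 0.1) are assumptions chosen for illustration and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, p, d, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Assumes equal group sizes and a common proportion p in the variance
    term; the tiny contribution of the opposite tail is ignored.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    se = sqrt(2 * p * (1 - p) / n_per_group)       # SE of the difference
    return NormalDist().cdf(abs(d) / se - z_crit)


sizes = [100, 200, 400, 800]
powers = [approx_power(n, p=0.4, d=0.1) for n in sizes]
for n, pw in zip(sizes, powers):
    print(f"n per group = {n:4d}  approximate power = {pw:.2f}")
```

With the effect size held fixed, each increase in the sample size gives a higher approximate power.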

Cont’d……………
•Design effect: Geographic clustering is generally used to make the study easier & cheaper to perform. The effect on the sample size depends on the number of clusters & the variance between & within the clusters.
In practice, this is determined from previous studies and is expressed as a constant called the ‘design effect’, often between 1.0 & 2.0. The sample size for a simple random sample is multiplied by the design effect to obtain the sample size for the cluster sample.
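The multiplication described above can be sketched in Python. The function name `cluster_sample_size` and the example inputs (384 subjects, design effects of 1.5 and 2.0) are illustrative assumptions, not values prescribed by the slides.

```python
from math import ceil


def cluster_sample_size(srs_sample_size, design_effect):
    """Scale a simple-random-sample size by the design effect, rounding up."""
    if srs_sample_size <= 0 or design_effect <= 0:
        raise ValueError("sample size and design effect must be positive")
    return ceil(srs_sample_size * design_effect)


print(cluster_sample_size(384, 1.5))
print(cluster_sample_size(384, 2.0))
```

Rounding up is a conservative choice made here so that the cluster sample is never smaller than the product.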

Cont’d……………
•Null hypothesis: It states that there is no difference among groups or no association between the predictor & the outcome variable. This hypothesis needs to be tested.
•Alternative hypothesis: It contradicts the null hypothesis. If the alternative hypothesis cannot be tested directly, it is accepted by exclusion if the test of significance rejects the null hypothesis. There are two types: one-tailed (one-sided) or two-tailed (two-sided).

Cont’d……………
•A type I error occurs if you reject the null hypothesis when it is true.
•A type II error occurs if you do not reject the null hypothesis when it
is false.

At what stage can sample size be addressed?
•It can be addressed at two stages:
1.Calculation of the optimum sample size is required during the planning stage, while designing the study, and needs information on some parameters.
2.At the stage of interpretation of the results.

Approaches for estimating sample size
•Approaches for estimating sample size depend primarily on:
1.The study design &
2.The main outcome measure of the study
There are distinct approaches for calculating sample size for different
study designs & different outcome measures.

Procedure for calculating sample size
•There are 3 procedures that could be used for calculating sample size:
1.Use of formulae
2.Ready-made tables
3.Computer software

Sample Size Formula
•The formula requires that we (i) specify the amount of confidence we wish to have, (ii) estimate the variance in the population, and (iii) specify the level of desired accuracy we want.
•When we specify the above, the formula tells us what sample size we need to use, n.

Use of formulae for sample size calculation &
power analysis
•There are many formulae for calculating sample size & power in different situations for different study designs.
•The appropriate sample size for a population-based study is determined largely by 3 factors:
1.The estimated prevalence of the variable of interest.
2.The desired level of confidence.
3.The acceptable margin of error.

Cont’d……………
To calculate the minimum sample size required for accuracy, in estimating
proportions, the following decisions must be taken:
•Decide on a reasonable estimate of the key proportion (p) to be measured in the study.
•Decide on the degree of accuracy (d) that is desired in the study, typically 1% to 5% (0.01 to 0.05).
•Decide on the confidence level you want to use and its Z value. Usually 95%, for which Z = 1.96.
•Determine the size (N) of the population that the sample is supposed to represent.
•Decide on the minimum difference you expect to find statistically significant.

For population >10,000.
•$n = \frac{Z^2 pq}{d^2}$
n = desired sample size (when the population > 10,000)
Z = standard normal deviate; usually set at 1.96 (or approximately 2), which corresponds to the 95% confidence level.
p = proportion in the target population estimated to have a particular characteristic. If there is no reasonable estimate, use 50% (i.e. 0.5).
q = 1 - p (proportion in the target population not having the particular characteristic)
d = degree of accuracy required, usually set at 0.05 (occasionally at 0.02)

Example 1
•If the proportion of a target population with a certain characteristic is 0.50, the Z statistic is 1.96 & we desire accuracy at the 0.05 level, then the sample size is:
$n = (1.96)^2 (0.5)(0.5) / (0.05)^2$
$n = 384.16 \approx 384$
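The calculation in Example 1 can be sketched in Python as follows. This is a minimal illustration; the function name `sample_size_proportion` and its parameter names are chosen here and do not come from the slides.

```python
def sample_size_proportion(p, d, z=1.96):
    """Return Z^2 * p * q / d^2, the sample size for a large population."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if d <= 0:
        raise ValueError("d must be positive")
    q = 1 - p  # proportion without the characteristic
    return (z ** 2) * p * q / (d ** 2)


n = sample_size_proportion(p=0.5, d=0.05)
print(round(n, 2))  # about 384.16, which the example reports as 384
```

The function returns the unrounded value so that the caller can decide whether to round to the nearest whole subject, as the example does, or round up.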

If study population is < 10,000
$n_f = \frac{n}{1 + n/N}$
•$n_f$ = desired sample size, when study population <10,000
•n = desired sample size, when the study population > 10,000
•N = estimate of the population size
Example: if n were found to be 400 and the population size were estimated at 1000, then $n_f$ is calculated as follows:
$n_f = \frac{400}{1 + 400/1000}$
$n_f = \frac{400}{1.4}$
$n_f \approx 286$
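The small-population adjustment can be written as a short Python function. The name `adjust_for_small_population` is hypothetical; the inputs reproduce the worked example (n = 400, N = 1000).

```python
def adjust_for_small_population(n, population):
    """Return n / (1 + n / N), the adjusted size for populations under 10,000."""
    if n <= 0 or population <= 0:
        raise ValueError("n and population must be positive")
    return n / (1 + n / population)


nf = adjust_for_small_population(n=400, population=1000)
print(round(nf))  # 400 / 1.4, rounded to the nearest whole subject
```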

Sample size formula for comparison of groups
•If we wish to test a difference (d) between two sub-samples regarding a proportion & can assume an equal number of cases (n1 = n2 = n’) in the two sub-samples, the formula for n’ is
$n' = \frac{2 Z^2 pq}{d^2}$
•E.g. suppose we want to compare an experimental group against a control group with regard to women using contraception. If we expect p to be 0.40 & wish to conclude that an observed difference of 0.10 or more is significant at the 0.05 level, the sample size will be:
$n' = 2(1.96)^2 (0.4)(0.6) / (0.1)^2$
$n' = 184.4 \approx 184$
Thus, 184 experimental subjects & another 184 control subjects are required.
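The two-group formula can be checked with a short Python sketch. The function name `sample_size_two_groups` is an assumption made for illustration; like the formula on the slide, it uses only Z, p and d and contains no separate power term.

```python
def sample_size_two_groups(p, d, z=1.96):
    """Return 2 * Z^2 * p * q / d^2, the size of each of two equal groups."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if d <= 0:
        raise ValueError("d must be positive")
    q = 1 - p
    return 2 * (z ** 2) * p * q / (d ** 2)


n_per_group = sample_size_two_groups(p=0.4, d=0.1)
print(round(n_per_group))      # subjects needed in each group
print(2 * round(n_per_group))  # total across both groups
```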

Use of ready made table for sample size calculation
•How large a sample of patients should be followed up if an investigator wishes to estimate the incidence rate of a disease to within 10% of its true value with 95% confidence?
•The table shows that for e = 0.10 & a confidence level of 95%, a sample size of 385 would be needed.
•The table can also be used to find the sample size for other values of relative precision & confidence level, e.g. if the level of confidence is reduced to 90%, then the sample size would be 271.
•Such tables that give ready-made sample sizes are available for different designs & situations.
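The two table values quoted above can be reproduced with a short Python sketch. It assumes the table is built from the usual formula for estimating an incidence rate with relative precision e, n = (z / e)^2 rounded up. The slides do not state the formula behind the table, so treat this as an assumption; the function name `incidence_sample_size` is also hypothetical.

```python
from math import ceil
from statistics import NormalDist


def incidence_sample_size(relative_precision, confidence):
    """Return (z / e)^2 rounded up, with z the two-sided normal quantile."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    if relative_precision <= 0:
        raise ValueError("relative_precision must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z / relative_precision) ** 2)


print(incidence_sample_size(0.10, 0.95))
print(incidence_sample_size(0.10, 0.90))
```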

Use of computer software for sample size
calculation & power analysis
•The following software can be used for calculating sample size & power;
Epi-info
nQuery
Power & precision
Sample
STATA
SPSS

Epi-info for sample size determination
•In STATCALC:
1 Select SAMPLE SIZE & POWER.
2 Select POPULATION SURVEY.
3 Enter the size of population (e.g. 15 000).
4 Enter the expected frequency (an estimate of the true prevalence, e.g. 80% ± your minimum standard).
5 Enter the worst acceptable result (e.g. 75%), i.e. the margin of error is 5%.
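As a rough cross-check of the inputs above, the deck's own proportion formula can be applied to them in Python. This sketch is not Epi Info's algorithm, which the slides do not describe and which may report a slightly different figure. The function name `survey_sample_size` is hypothetical.

```python
def survey_sample_size(population, expected, worst_acceptable, z=1.96):
    """Apply n = Z^2 * p * q / d^2, adjusting only if the population < 10,000."""
    d = abs(expected - worst_acceptable)  # margin of error
    if d == 0:
        raise ValueError("expected and worst acceptable values must differ")
    n = (z ** 2) * expected * (1 - expected) / (d ** 2)
    if population < 10_000:
        # small-population adjustment from the deck: n / (1 + n / N)
        n = n / (1 + n / population)
    return n


n = survey_sample_size(population=15_000, expected=0.80, worst_acceptable=0.75)
print(round(n))
```

Because the example population of 15,000 exceeds 10,000, the small-population adjustment is skipped under the deck's rule.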

CONCLUSIONS
1.Sample size determination is one of the most essential components of
every research Study.
2.The larger the sample size, the higher will be the degree of accuracy, but
this is limited by the availability of resources.
3.It can be determined using formulae, ready-made tables and computer software.
Steps:
1.First, formulate a research question.
2.Second, select an appropriate study design, primary outcome measure and level of statistical significance.
3.Third, use the appropriate formula to calculate the sample size.

Thank you