Teresa Kisi (MPH in Epidemiology and Biostatistics, Assist. Prof.)
Basic Biostatistics for MPH students
11/23/2018
Arsi University, College of Health Sciences,
Department of Public Health
Sampling techniques and sample size determination
Sampling
Sample: a group of people, objects, or items that are taken from a larger population for measurement.
The sample should be representative of the population to ensure that we can generalize the findings from the research sample to the population as a whole.
Sampling cont’d...
Sample design/sampling: a definite plan for obtaining a sample from a given population. It refers to the technique or procedure the researcher would adopt in selecting items for the sample.
The term refers to strategies that enable us to pick a subgroup from a larger group and then use this subgroup as a basis for making inferences about the larger group.
The researcher's goal is always to generalize about the population based on observations of the sample.
Sampling cont’d...
The manner in which a sample is drawn is an important factor in determining how useful the sample will be for making inferences about the population from which it is drawn.
It is quite possible to have a very large sample upon which no sound decision can be based.
This occurs when the respondents in the sample are not really similar to the population about which we want to make generalizations.
Basic Terms
Reference population (source population or target population): the group of individuals/persons, objects, or items from which samples are taken for measurement.
It refers to the entire group of individuals or objects to which researchers are interested in generalizing the conclusions.
Basic terms cont’d…
[Figure: Target population, study population and sample]
Basic terms cont’d…
Sampling frame: the list of people from which the sample is taken.
It should be comprehensive, complete and up to date.
Examples of sampling frames: electoral register, postcode address file, telephone book, and so on.
Basic terms cont’d...
Sampling unit: the unit of selection in the sampling process.
Study unit: the unit on which information is collected.
Basic terms cont’d…
The sampling unit is not necessarily the same as the study unit.
– If the objective is to determine the availability of latrines, then the study unit would be the household;
– If the objective is to determine the prevalence of trachoma, then the study unit would be the individual.
Sampling fraction: the ratio of the number of units in the sample to the number of units in the reference population (n/N).
Sampling interval: the inverse of the sampling fraction, that is, the ratio of the number of units in the reference population to the number of units in the sample (N/n).
Sampling errors
A sample is expected to mirror the population from which it comes; however, there is no guarantee that any sample will be precisely representative of the population.
The uncertainty associated with an estimate that is based on data gathered from a sample of the population rather than the full population is known as sampling error.
Sampling errors are the random variations in the sample estimates around the true population parameters.
Sampling error cont’d…
• No sample is the exact mirror image of the population.
• Sampling error (chance) cannot be avoided or totally eliminated.
• Sampling error decreases as the sample size increases, and it tends to be smaller in a homogeneous population.
• When n = N ⇒ sampling error = 0
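The relationship between sample size and sampling error can be illustrated with a small simulation. This is only a sketch: the population values, the sample sizes and the function name `sampling_error_demo` are made up for illustration and do not come from the lecture.

```python
import random
import statistics

def sampling_error_demo(population, n, draws=500, seed=1):
    """Spread (SD) of the sample mean over repeated simple random samples of size n."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.pstdev(means)

# Hypothetical population of N = 2000 measurements (assumed values, not real data)
pop_rng = random.Random(0)
population = [pop_rng.gauss(50, 10) for _ in range(2000)]

spread_small = sampling_error_demo(population, 20)    # small sample: large spread
spread_large = sampling_error_demo(population, 500)   # larger sample: smaller spread
spread_census = sampling_error_demo(population, len(population), draws=5)  # n = N

print(spread_small, spread_large, spread_census)
```

Under these assumptions the spread of the sample means shrinks as n grows, and it vanishes when every unit is included (n = N), because every "sample" is then the whole population.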
• The question is, why do sample estimates have uncertainty associated with them?
Sampling error cont’d…
There are two reasons why sample estimates are uncertain. These are:
• Estimates of characteristics from the sample data can differ from those that would be obtained if the entire population were surveyed.
• Estimates from one subset or sample of the population can differ from those based on a different sample from the same population (sample-to-sample variation).
The cause of sampling error
Chance: the main cause of sampling error; it is the error that occurs just because of bad luck.
Non Sampling Error (Measurement Error)
It is a type of systematic error in the design or conduct of a sampling procedure which results in distortion of the sample, so that it is no longer representative of the reference population.
We can eliminate or reduce non-sampling error by careful design of the sampling procedure, not by increasing the sample size.
It can occur whether the total study population or a sample is being used.
Non Sampling Error (Measurement Error) cont’d…
It may either be produced by participants in the study or simply be a byproduct of the sampling plans and procedures.
These errors can be very damaging to the findings of the study.
Example: If you take only male students from a student dormitory in Ethiopia in order to determine the proportion of smokers among all students in Ethiopia, you would get an overestimate, since females are less likely to smoke.
Increasing the number of male students would not remove the bias.
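The point that a larger sample does not remove bias can be sketched with a simulation. Everything numeric here is an assumption made for illustration: a population of 10,000 students, half of them male, with smoking rates of 20% among males and 5% among females.

```python
import random

rng = random.Random(42)

# Hypothetical student population: (sex, smokes). The rates are assumed, not real data.
students = [("M", rng.random() < 0.20) for _ in range(5000)] + \
           [("F", rng.random() < 0.05) for _ in range(5000)]
true_prevalence = sum(smokes for _, smokes in students) / len(students)

males = [s for s in students if s[0] == "M"]

def male_only_estimate(n):
    """Estimate smoking prevalence from a random sample of n male students only."""
    sample = rng.sample(males, n)
    return sum(smokes for _, smokes in sample) / n

small = male_only_estimate(100)    # biased and noisy
large = male_only_estimate(4000)   # still biased, only less noisy

print(true_prevalence, small, large)
```

With these assumed rates, the male-only estimate stays near 20% however many male students are sampled, while the prevalence among all students is near 12.5%.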
The Cause of Non-Sampling Error
• The interviewer's effect
• The respondent's effect
• Non-response: the failure to obtain information on some of the subjects included in the sample to be studied. Non-response results in significant bias when the following two conditions are both fulfilled:
– When non-respondents constitute a significant proportion of the sample (about 15% or more)
– When non-respondents differ significantly from respondents.
Sampling Error cont’d…
Thus, the total survey error is the sum of both sampling error and non-sampling error.
Advantage of sampling
We obtain a sample rather than a complete enumeration (a census) of the population for many reasons:
• Feasibility: it may be the only feasible method of collecting data
• Reduced cost: sampling reduces demands on resources such as finance, personnel and materials
• Greater accuracy: sampling may lead to better accuracy of collecting data
• Greater speed: data can be collected and summarized more quickly
Disadvantage of Sampling
• If the sample is biased, not representative, or too small, the conclusion may not be valid and reliable.
• If the population is very large and there are many sections and subsections, the sampling procedure becomes very complicated.
• If the researcher does not possess the necessary skill and technical know-how in sampling procedure, then the outcome can be seriously compromised.
Characteristics of a good sample design
From what has been stated above, we can list the characteristics of a good sample design as follows:
• The sample design must result in a truly representative sample.
• The sample design must result in a small sampling error.
• The sample design must be viable in the context of the funds available for the research study.
• The sample design must allow systematic bias to be controlled.
• The sample should be such that the results of the sample study can be applied, in general, to the universe with a reasonable level of confidence.
Steps in Sampling Design
There are steps that we need to follow to reach the respondents.
• What is the target population?
– Define the target population and study population.
• What are the parameters of interest?
– Define the parameters of interest of the study.
• What is the sampling frame?
– Select the sampling frame.
• What is the appropriate sampling method?
– Determine which sampling method to use depending on the setting of the population and the purpose of the study.
Steps in Sampling Design cont’d..
• Plan procedures to select the sampling unit
• Determine the size of the sample which will be selected from the population
• Select the actual sampling units
• Conduct fieldwork
Types of Sampling
There are many methods of sampling when doing research.
One of the most important decisions that any researcher makes is how to obtain the type of participants needed for the study.
The sample that we draw for our study determines the generalizability of our findings.
Types of Sampling cont’d…
On the basis of representation, sampling may be:
1. Probability sampling
2. Non-probability sampling
Probability sampling is based on the concept of random selection, whereas non-probability sampling is ‘non-random’ sampling.
In probability sampling methods, each population element has a known (non-zero) chance of being chosen for the sample.
Types of Sampling cont’d…
• Non-probability sampling: With non-probability sampling methods, we do not know the probability that each population element will be chosen, and/or we cannot be sure that each population element has a non-zero chance of being chosen.
Types of Sampling Methods
Sampling methods
• Probability sampling
– Simple random
– Systematic random
– Stratified random
– Cluster random
– Multistage random sampling
• Non-probability sampling
– Purposive
– Convenience
– Judgmental
– Quota
– Snowball
Probability Sampling Method …
Probability sampling strategies typically use a random or chance process in sampling.
The "equal chance" and "independent" components of random sampling are what make us confident that the sample has a reasonable chance of representing the population.
Probability Sampling Method …
What does it mean to be independent? The researchers select each person for the study separately.
Let us say you were asked to participate in an experiment, enjoyed it, and told your friends to contact the researcher to volunteer for the study. This would be an example of non-independent sampling.
1. Simple Random Sampling
Simple random sampling (SRS) is the most straightforward of the random sampling strategies.
We use this strategy when we believe that the population is relatively homogeneous for the characteristic of interest.
To use SRS there should be a sampling frame for the population.
Simple Random Sampling cont’d …
Procedures to select the sample
Depending on the complexity of the population, we can use different tools to select n units from the frame. These are:
• the lottery method,
• a table of random numbers (available in the appendix of many research methods and statistics textbooks), or
• computer-generated random numbers (OpenEpi).
Simple Random Sampling cont’d …
The lottery method is appropriate if the total population is not too large; if the population is too large, the lottery method becomes very difficult to use.
In that case, a table of random numbers or computer-generated random numbers is the feasible method.
Example
Assume that the total number of patients who visited Gondar University Hospital in the last six months is N. We want to see the prevalence of TB among those patients who visited the hospital.
If we think that the patients who visited the hospital within the specified time period are homogeneous with respect to the variable of interest, and a list of the patients is available, then we can use simple random sampling to select the sample.
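Drawing a simple random sample from a patient list with computer-generated random numbers could look like the sketch below. The frame of 1,200 patient identifiers, the sample size of 50 and the function name `simple_random_sample` are hypothetical; the lecture does not give N or n for this example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each with equal chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)  # sampling without replacement

# Hypothetical sampling frame: a list of N = 1200 patient identifiers
frame = [f"patient-{i:04d}" for i in range(1, 1201)]

sample = simple_random_sample(frame, 50, seed=2018)
print(sample[:5])
```

Fixing the seed only makes the draw reproducible for documentation purposes; it does not change the fact that every patient on the list had the same chance of selection.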
2. Systematic Random Sampling
Individuals are chosen at regular intervals (for example, every k-th) from the sampling frame.
Systematic sampling is regarded as random as long as the periodic interval is determined beforehand and the starting point is random.
It is a method of selecting sample members from a larger population according to a random starting point and a fixed, periodic interval.
Systematic Random Sampling cont’d…
It is frequently chosen by researchers for its simplicity and its periodic quality.
To use systematic sampling to select the study subjects, the population needs to be homogeneous; however, the method does not require a frame.
Hence, in the absence of a frame, this method will be the best choice.
Steps in systematic sampling:
• Define the population
• Determine the desired sample size (n)
• List the population from 1 to N
• Determine k, where k = N/n
• Select a random number between 1 and k; let us denote this number by a
• Starting at a, take every k-th number on the list until the desired sample is obtained
Then the selected list will be a, a+k, a+2k, a+3k, …
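The steps above can be sketched in a few lines of code. The function name `systematic_sample` and the example frame of 100 numbered units are illustrative assumptions. When N/n is not a whole number, this sketch rounds k down, which is one common convention and not something the lecture specifies.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units: random start a in 1..k, then every k-th unit (a, a+k, a+2k, ...)."""
    N = len(frame)
    k = N // n                      # sampling interval k = N/n, rounded down
    rng = random.Random(seed)
    a = rng.randint(1, k)           # random start between 1 and k, inclusive
    positions = [a + j * k for j in range(n)]   # 1-based positions on the list
    return [frame[p - 1] for p in positions]

# Hypothetical frame: units numbered 1 to N = 100, desired sample size n = 10
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```

Because the largest selected position is a + (n − 1)k ≤ nk ≤ N, the selection never runs past the end of the list.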
Demerits of systematic random sampling
If there is any sort of cyclic pattern in the ordering of the subjects which coincides with the sampling interval, the sample will not be representative of the population.
Examples
• A list of married couples arranged with the men's names alternating with the women's names: taking every 2nd, 4th, etc. name will result in a sample of all men or all women.
• If we want to select a sample of days on which to count clinic attendance, a fixed sampling interval may always fall on the same day of the week, which might, for example, be a market day.
3. Stratified Random Sampling
Stratified random sampling is used when we have subgroups in our population that are likely to differ substantially in their responses or behavior (i.e. if the population is heterogeneous).
In stratified random sampling, the population is first divided into a number of parts or 'strata' according to some characteristic, chosen to be related to the major variables being studied.
Stratified Random Sampling cont’d…
Often we use simple random sampling to select a sample from each stratum after stratification.
Steps involved in stratified sampling method:
• Define the population
• Determine the desired sample size
• Identify the variable and subgroups (strata) for which you want to guarantee appropriate representation (either proportional or equal)
• Classify all members of the population as a member of one of the identified subgroups
• Randomly select (using simple random sampling or others) an appropriate number of individuals from each subgroup.
Then the total sample size will be the sum of all samples from each subgroup.
There are two methods to get the study subjects from each subgroup: proportional allocation or equal allocation.
We use the proportional allocation technique when our subgroups vary dramatically in size in our population.
Let N be the total population and N_1, N_2, …, N_k be the subtotal populations for strata 1, 2, …, k respectively.
Moreover, let n be the total sample size and n_1, n_2, …, n_k be the subsamples for strata 1, 2, …, k respectively, in which:
– N = N_1 + N_2 + … + N_k and
– n = n_1 + n_2 + … + n_k
Stratified Random Sampling cont’d…
Then the subsample n_i which will be selected from subgroup N_i can be computed by:
$$ n_i = \frac{n \times N_i}{N} $$
– Where i = 1, 2, 3, …, k
Stratified Random Sampling cont’d…
[Figure: population N divided into Stratum 1 (N1), Stratum 2 (N2), Stratum 3 (N3), …, with samples n1, n2, n3, … drawn from the strata; total sample = n1 + n2 + n3 + …]
• If N1 = N2 = N3 = …, then n1 = n2 = n3 = … (equal allocation)
• If N1 ≠ N2 ≠ N3 ≠ …, then n1 ≠ n2 ≠ n3 ≠ … (proportional allocation is required)
Proportional allocation: the larger the population in the subgroup, the larger the sample size will be.
However, equal allocation will be used if the total population of each subgroup is approximately equal.
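Proportional allocation, n_i = n × N_i / N, can be worked through with a short sketch. The three strata, their sizes and the total sample of 400 are invented for illustration, and the function name `proportional_allocation` is hypothetical.

```python
def proportional_allocation(strata_sizes, n):
    """Allocate total sample n across strata in proportion to stratum size: n_i = n * N_i / N."""
    N = sum(strata_sizes.values())
    return {name: n * N_i / N for name, N_i in strata_sizes.items()}

# Hypothetical strata sizes (N = 10,000 in total) and total sample size n = 400
strata = {"urban": 6000, "rural": 3000, "pastoral": 1000}
allocation = proportional_allocation(strata, 400)

for name, n_i in allocation.items():
    print(name, n_i)
```

In this made-up case the results are whole numbers (240, 120 and 40). In practice n_i is usually fractional and has to be rounded, after which the rounded subsamples should be checked against the intended total n.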
Advantage of stratified sampling over simple random sampling
The representativeness of the sample is improved. That is, adequate representation of minority subgroups of interest can be ensured by stratification and by varying the sampling fraction between strata as required.
Demerit
A sampling frame has to be prepared separately for each stratum of the population.
4. Cluster Random Sampling
In this sampling scheme, selection of the required sample is done on groups of study units (clusters) instead of each study unit individually.
The sampling unit is a cluster, and the sampling frame is a list of these clusters.
If the study covers a wide geographical area, using the other methods will be too costly.
The idea is to divide the total population into different clusters; the unit of selection is then the cluster.
Therefore, the total population in the selected clusters will be taken as the sample.
Steps in cluster sampling are:
• Define the population
• Determine the desired sample size
• Identify and define a logical cluster (can be kebele, Got, residence, and so on)
• Make a list of all clusters in the population
• Estimate the average number of population members per cluster
• Determine the number of clusters needed by dividing the sample size by the estimated size of the cluster
• Randomly select the required number of clusters (using a table of random numbers, as the total number of clusters is manageable)
• Include in the sample all population members in the selected clusters.
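The cluster-selection steps can be sketched as follows. The list of 30 kebeles, the desired sample size of 600 and the average cluster size of 150 are assumptions made for the illustration. Rounding the number of clusters up is a choice made here so that the sample does not fall short; the lecture does not state a rounding rule.

```python
import math
import random

def select_clusters(cluster_names, sample_size, avg_cluster_size, seed=None):
    """Pick the number of clusters needed (sample size / average cluster size) at random."""
    clusters_needed = math.ceil(sample_size / avg_cluster_size)
    rng = random.Random(seed)
    return rng.sample(cluster_names, clusters_needed)

# Hypothetical list of clusters (kebeles) and planning figures
kebeles = [f"kebele-{i:02d}" for i in range(1, 31)]
chosen = select_clusters(kebeles, sample_size=600, avg_cluster_size=150, seed=7)
print(chosen)   # everyone living in these clusters would be included in the sample
```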
5. Multistage Random Sampling
This is the most complex sampling strategy.
The researcher combines simpler sampling methods to address sampling needs in the most effective way possible.
Example
Suppose we want to investigate the working efficiency of nationalized health institutions in India and we want to take a sample of a few health institutions for this purpose.
Example
The first stage is to select large primary sampling units such as states in a country.
Then we may select certain districts and interview all health institutions in the chosen districts.
This would represent a two-stage sampling design with the ultimate sampling units being clusters of districts.
Example cont’d …
If, instead of taking a census of all health institutions within the selected districts, we select certain towns and interview all health institutions in the chosen towns, this would represent a three-stage sampling design.
If, instead of taking a census of health institutions within the selected towns, we randomly sample health institutions from each selected town, then it is a case of using a four-stage sampling plan.
If we select randomly at all stages, we will have what is known as a ‘multi-stage random sampling design’.
Non-Probability Sampling Method
When constraints prevent the use of probability sampling strategies, the alternative is a non-probability sampling method.
Non-probability sampling strategies are used when it is practically impossible to use probability sampling strategies.
Non-probability sampling is a sampling procedure which does not afford any basis for estimating the probability that each item in the population has of being included in the sample.
1. Purposive Sampling
In purposive sampling, we sample with a purpose in mind.
When the desired population for the study is rare or very difficult to locate and recruit for a study, purposive sampling may be the only option.
For example, suppose you are interested in studying the cognitive processing speed of young adults who have suffered closed-head brain injuries in automobile accidents. This would be a difficult population to find.
2. Convenience Sampling
Convenience sampling selects a particular group of people, but it does not come close to sampling all of a population.
The sample would generalize only to similar programs in similar cities.
Convenience sampling looks just like cluster sampling. The major difference is that the clusters of research participants are selected by convenience rather than by a random process.
3. Judgment Sampling
The researcher selects the sample based on his/her judgment.
4. Quota sampling
It is a method that ensures that a certain number of sample units from different categories with specific characteristics are represented.
In this method the investigator interviews as many people in each category of study unit as he can find until he has filled his quota.
It is the non-probability equivalent of stratified sampling. It differs from stratified sampling, where the strata are filled by random sampling.
5. Snowball sampling
It is a special non-probability method used when the desired sample characteristic is rare.
Snowball sampling relies on referrals from initial subjects to generate additional subjects.
In snowball sampling, we first identify someone who meets the criteria and then let him/her bring in others he/she knows.
While this technique can dramatically lower search costs, it comes at the expense of introducing bias, because the technique itself reduces the likelihood that the sample will represent a good cross-section of the population.
Sample Size
Determining the sample size for a study is a crucial component of study design.
The goal is to include sufficient numbers of subjects so that statistically significant results can be detected.
Among the questions that a researcher should ask when planning a survey or study is "How large a sample do I need?"
The answer will depend on the aims, nature and scope of the study and on the expected result, all of which should be carefully considered at the planning stage.
Sample Size Determination
In general, sample size determination depends on:
• The type of data analysis to be performed
• The desired precision of the estimates one wishes to achieve
• The kind and number of comparisons that will be made
• The number of variables that have to be examined simultaneously
• The type of study design used.
Sample size determination depending on outcome variables
There are two possible categories of variables.
– The first is where the variable of interest is categorical with:
• Only two alternative responses: yes/no, dead/alive, vaccinated/not vaccinated and so on, or
• Multiple, mutually exclusive alternative responses, such as marital status, religion, blood group and so on.
• For categorical variables, the data are generally expressed as percentages or rates.
• So we can use a percentage to compute the sample size.
Sample Size Determination cont’d…
– The second category covers continuous response variables such as birth weight, age at first marriage, blood pressure and serum uric acid level, for which numerical measurements are usually made.
– In this case the data are summarized in the form of means and standard deviations.
Sample Size Determination cont’d…
There are several approaches to determine the sample size.
Depending on the type of response variable, whether it is categorical or continuous, we will have two sets of formulas.
The sample size determination formulas come from the formulas for the maximum error of the estimates and are derived by solving for n.
Sample size for continuous variables (for single population)
This is the condition in which the research question is about a mean.
Standard deviation of the population: it is rare that a researcher knows the exact standard deviation of the population.
Typically, the standard deviation of the population is estimated:
• From the results of a previous survey,
• From a pilot study,
• From secondary data,
• From the judgment of the researcher???
Maximum acceptable difference: this is the maximum amount of error that you are willing to accept.
Desired confidence level: your level of certainty that the sample mean does not differ from the true population mean by more than the maximum acceptable difference. Commonly we use a 95% confidence level.
Then the sample size determination formula for a single population mean is defined by:
$$ n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{w^{2}} $$
Sample size for single population mean cont’d…
Where:
– α = the level of significance, which can be obtained as 1 − confidence level
– σ = standard deviation of the population
– w = maximum acceptable difference
– z_{α/2} = the value from the standard normal table for the given confidence level
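The formula n = z²_{α/2} σ² / w² can be evaluated as in the sketch below. The inputs (σ = 15, w = 3) are hypothetical and chosen only to show the arithmetic. The z value is computed with Python's standard library instead of being read from a table, and the result is rounded up to the next whole subject.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, w, confidence=0.95):
    """Sample size for estimating a single mean: n = z^2 * sigma^2 / w^2, rounded up."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for 95% confidence
    return math.ceil(z ** 2 * sigma ** 2 / w ** 2)

# Hypothetical inputs: assumed SD of 15 units, maximum acceptable difference of 3 units
n = sample_size_mean(sigma=15, w=3)
print(n)
```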
Sample Size for Single Population Proportion
Sample size for categorical variables (for single population)
This is the situation in which the variable of interest is categorical.
The best estimate of the population proportion of the variable of interest should be known.
Sample size for categorical variables (for single population) cont’d…
The possible sources of the proportion are:
• From the results of a previous study,
• From a pilot study,
• From the judgment of the researcher,
• Simply by taking 50%.
Then the formula for the sample size of a single population proportion is defined as:
$$ n = \frac{z_{\alpha/2}^{2}\, p\,(1 - p)}{w^{2}} $$
Where:
• α = the level of significance, which can be obtained as 1 − confidence level
• p = best estimate of the population proportion
• w = maximum acceptable difference
• z_{α/2} = the value from the standard normal table for the given confidence level
Example 1
An MPH student wants to conduct research on the prevalence of ANC utilization among mothers in Dabat district. Given that the prevalence found in a previous study was 45.7%, what sample size should he take to address his objective?
Solution:
Margin of error w = 5%
A confidence level of 95% gives z_{α/2} = 1.96.
Then using the formula:
$$ n = \frac{z_{\alpha/2}^{2}\, p\,(1 - p)}{w^{2}} = \frac{1.96^{2} \times 0.457 \times (1 - 0.457)}{0.05^{2}} = \frac{1.96^{2} \times 0.457 \times 0.543}{0.05^{2}} \approx 381.3 $$
Rounding up, n = 382.
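The arithmetic of Example 1 can be checked with a short script. The function name `sample_size_proportion` is illustrative. Rounding up to the next whole subject is assumed here, consistent with the answer of 382 given in the example.

```python
import math

def sample_size_proportion(p, w, z=1.96):
    """Sample size for a single proportion: n = z^2 * p * (1 - p) / w^2, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) / w ** 2)

n_example = sample_size_proportion(p=0.457, w=0.05)    # Example 1 inputs
n_half = sample_size_proportion(p=0.50, w=0.05)        # when simply taking 50%
print(n_example, n_half)
```

Taking p = 50% gives the largest value of p(1 − p), so it yields the most conservative (largest) sample size for a given margin of error.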
Some Considerations after calculation
• The final sample size would be corrected for:
– Objectives
– Non-response, loss to follow-up, lack of compliance, and so on
– The total size of the population (N): if N < 10,000 then we need a correction formula, which is defined as:
$$ n_f = \frac{n_o}{1 + \frac{n_o}{N}} $$
Where: n_f = final sample size, n_o = sample size from the above formula, and N = total population
– The design effect, if needed (for cluster and multistage sampling)
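The population correction n_f = n_o / (1 + n_o / N) can be sketched as below. The population size N = 5,000 is a hypothetical value, n_o = 382 is carried over from Example 1, and rounding up is an assumption of this sketch.

```python
import math

def finite_population_correction(n_o, N):
    """Corrected sample size n_f = n_o / (1 + n_o / N), rounded up."""
    return math.ceil(n_o / (1 + n_o / N))

# n_o = 382 from the single-proportion example; N = 5000 is an assumed population size
n_f = finite_population_correction(382, 5000)
print(n_f)
```

The correction always reduces the sample size, and the reduction becomes negligible as N grows large relative to n_o.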
Assignment 1
Explain when and how to consider loss to follow-up and lack of compliance during sample size calculation!
Exercise 1
A midwifery graduate student wants to do her thesis work on the title “Assessment of the outcome of pregnancy among women who visited Asella hospital gynecology and obstetrics ward for the year 2013”.
What sample size should she take for this study?
Sample size for two population proportion
Here the objective of the study is to check whether there is a significant difference between two proportions coming from two different populations.
Let us denote:
– p_1 = current estimate of population proportion P_1 (for non-exposed or control group)
– q_1 = 1 − p_1
– p_2 = current estimate of population proportion P_2 (for exposed or treated group)
– q_2 = 1 − p_2
– p_3 = estimated average of p_1 and p_2
Sample size for two population cont’d…
– q_3 = estimated average of q_1 and q_2
– z_α = the z value corresponding to alpha. This is the two-tailed value.
– z_β = the z value corresponding to the beta error. The z value for beta is always based on a one-tailed test. So if beta is 0.05, 0.1, 0.2, or 0.3, then the corresponding z values are 1.65, 1.28, 0.84 and 0.52 respectively.
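The one-tailed z values for beta can be reproduced without a printed table. This sketch uses `statistics.NormalDist` from the Python standard library; the function name `z_one_tailed` is illustrative.

```python
from statistics import NormalDist

def z_one_tailed(beta):
    """Standard normal value that leaves an upper-tail area equal to beta."""
    return NormalDist().inv_cdf(1 - beta)

# z values for the beta levels mentioned in the text, rounded to two decimals
z_values = {beta: round(z_one_tailed(beta), 2) for beta in (0.05, 0.1, 0.2, 0.3)}
print(z_values)
```

The unrounded values are approximately 1.645, 1.282, 0.842 and 0.524. Printed tables commonly quote the first of these as 1.65, which is why a computed value rounded to two decimals (1.64) can differ from the table value in the last digit.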
Sample size for two population cont’d…
Then the formula for sample size is defined as:
$$ n = \frac{\left( z_{\alpha}\sqrt{2\,p_3 q_3} + z_{\beta}\sqrt{p_1 q_1 + p_2 q_2} \right)^{2}}{(p_1 - p_2)^{2}} $$
Example 1
An investigator wants to determine if the mortality rate in calves raised by farmers' wives differs from the mortality rate in calves raised by hired managers. He hypothesizes a calf mortality rate of 0.25 for calves raised by farmers' wives and 0.40 for calves raised by hired managers.
The level of significance, alpha, is stated to be 0.01, and the desired power of the test is 0.95. How many calves should be included in the study?
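Solution: From the given information, the required sample size can be computed as follows:
$$ n = \frac{\left( z_{\alpha}\sqrt{2\,p_3 q_3} + z_{\beta}\sqrt{p_1 q_1 + p_2 q_2} \right)^{2}}{(p_1 - p_2)^{2}} $$
With z_α = 2.58 (two-tailed, α = 0.01), z_β = 1.65 (β = 1 − 0.95 = 0.05), p_1 = 0.25, p_2 = 0.40, p_3 = 0.325 and q_3 = 0.675:
$$ n = \frac{\left( 2.58\sqrt{2 \times 0.325 \times 0.675} + 1.65\sqrt{0.25 \times 0.75 + 0.40 \times 0.60} \right)^{2}}{(0.25 - 0.40)^{2}} \approx 345.4 $$
Rounding up, n ≈ 346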
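The calf mortality example can be checked numerically. The sketch below assumes the standard form of the two-proportion formula, with the factor 2 under the first square root, which reproduces the stated answer of 346. The function name `sample_size_two_proportions` is illustrative, and the z values are the rounded table values 2.58 and 1.65.

```python
import math

def sample_size_two_proportions(p1, p2, z_alpha, z_beta):
    """n = (z_a*sqrt(2*p3*q3) + z_b*sqrt(p1*q1 + p2*q2))^2 / (p1 - p2)^2, rounded up."""
    q1, q2 = 1 - p1, 1 - p2
    p3 = (p1 + p2) / 2          # average of the two proportions
    q3 = 1 - p3                 # equals the average of q1 and q2
    numerator = (z_alpha * math.sqrt(2 * p3 * q3)
                 + z_beta * math.sqrt(p1 * q1 + p2 * q2)) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Calf mortality example: alpha = 0.01 (two-tailed z = 2.58), power = 0.95 (z = 1.65)
n = sample_size_two_proportions(0.25, 0.40, z_alpha=2.58, z_beta=1.65)
print(n)
```

In its usual textbook form this formula gives the number of subjects needed in each of the two groups, so the total would be twice n; the lecture does not say which reading is intended.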
Sample size for two population mean
Let two normal populations have means µ_1 and µ_2 and standard deviations σ_1 and σ_2. The usual sample-size formula for two independent samples needed to detect a difference (µ_1 − µ_2) in means with Type I error (α) and power 1 − β is given by:
$$ n = \frac{(z_{\alpha/2} + z_{\beta})^{2}\,(\sigma_1^{2} + \sigma_2^{2})}{(\mu_1 - \mu_2)^{2}} $$
Example 2
An investigator wants to determine if the weight of babies delivered by mothers on ANC follow-up is significantly different from the weight of babies delivered by mothers not on ANC follow-up. He found an average weight of 3 kg with a standard deviation of 0.05 kg for babies of mothers on ANC, and a mean of 2.5 kg with a standard deviation of 0.9 kg for babies of mothers not on ANC.
The level of significance (α) is stated to be 0.05, and the desired power of the test is 0.90. How many babies should be included in the study?
Solution: From the given information, the required sample size can be computed as follows:
$$ n = \frac{(z_{\alpha/2} + z_{\beta})^{2}\,(\sigma_1^{2} + \sigma_2^{2})}{(\mu_1 - \mu_2)^{2}} = \frac{(1.96 + 1.28)^{2} \times (0.05^{2} + 0.9^{2})}{(2.5 - 3)^{2}} $$
$$ n = \frac{10.5 \times 0.812}{(0.5)^{2}} = \frac{8.53}{0.25} \approx 34.1 $$
Rounding up, n ≈ 35
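The birth weight example can be verified in the same way. The function name `sample_size_two_means` is illustrative, the z values are the rounded table values used in the solution (1.96 and 1.28), and the result is rounded up to the next whole subject.

```python
import math

def sample_size_two_means(mu1, mu2, sd1, sd2, z_alpha_2=1.96, z_beta=1.28):
    """n = (z_{a/2} + z_b)^2 * (sd1^2 + sd2^2) / (mu1 - mu2)^2, rounded up."""
    numerator = (z_alpha_2 + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2)
    return math.ceil(numerator / (mu1 - mu2) ** 2)

# Example 2: 3 kg (SD 0.05) for mothers on ANC vs 2.5 kg (SD 0.9) for mothers not on ANC
n = sample_size_two_means(mu1=3.0, mu2=2.5, sd1=0.05, sd2=0.9)
print(n)
```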
Example 3, Sample size calculation for objectives
An MPH student wants to conduct research on the “Assessment of ANC utilization of mothers and associated factors in Asella town.” From a previous study she obtained an ANC utilization rate of 44.3%, and the distance of mothers’ homes from the nearest health institution was significantly associated with the utilization rate at the 5% significance level, as indicated in the table below:
Example 3, Sample size calculation cont’d…
Distance of mothers’ home      ANC utilization
from health institution        Yes     No      AOR     P-value
Less than 2 km                 230     212     2.10    0.000
2 km and more                  213     345     1
95% CI of AOR: 1.35, 2.27
What sample size should she take for this study?