Sampling and sampling techniques
Dr. Moumita Pal, MBBS, DPH, MD
Dept. of Community Medicine, College of Medicine and Sagore Dutta Hospital
Sampling
Sampling is the process of choosing a representative sample from a target population and collecting data from that sample in order to understand something about the population as a whole.
Universe (whole population): the entire group of the study population; the complete set of individuals, objects or scores in which we are interested.
Sampling unit: each member of the whole population.
Sampling frame: a list in which all individuals of the whole population are drawn up.
Sample: a small representative part of the whole population.
Why sampling?
To cut down the financial cost of data collection, processing and reporting.
To cut short the time and resources needed.
Information collected from a properly drawn sample can still be accurate, i.e. valid and reliable.
Target population: the entire population, e.g. the adult population.
Study population: a subset of the target population, e.g. in the field practice area.
Study subjects: the sample drawn from the study population, e.g. the selected adults.
Sample size calculation
For descriptive study
For analytical study
Sampling technique
1. Probability
2. Non-probability
Non-probability sampling
The sample is selected deliberately by the researcher, by his or her own choice.
1. Purposive sampling (judgmental sampling): participants are deliberately selected because the researcher judges that the information can be obtained easily from them.
2. Convenience sampling: participants are selected on the basis of easy accessibility.
3. Self-selection sampling: participants take part in the research on their own, as volunteers.
4. Snowball sampling (network sampling): one study subject is asked to identify other persons with the same exposure in question, in order to find the next subjects. Used when the target population is hidden, e.g. HIV/AIDS, drug addicts, sex workers etc.
5. Quota sampling: researchers are given quotas to fill from different strata of the population, keeping the proportions of the quotas the same as observed in the population. E.g. if the population is 60% Hindu and 40% Muslim, the researcher can choose participants by his or her own choice, but in the same ratio.
Probability sampling
Superior to non-probability sampling.
Obeys the laws of probability and is based on the concept of random selection.
Also known as random sampling or chance sampling.
Simple random sampling
Each member of the population has an equal chance of being chosen (this avoids selection bias and makes the sample likely to be representative of the population, although it cannot guarantee it).
Applicable when the population is small, homogeneous and readily available.
A complete list of the population must be available as the sampling frame.
First, all sampling units are assigned numbers.
Then the sample is selected by random number table or lottery method.
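The lottery method can be sketched in a few lines of Python. This is only an illustration: the frame of 50 numbered units, the sample size of 10, the seed and the function name `simple_random_sample` are assumptions, not part of the lecture.

```python
import random


def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size units from the sampling frame without replacement.

    Every unit in the frame has the same chance of being picked.
    """
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(frame, sample_size)


# Hypothetical sampling frame: units numbered 1 to 50
frame = list(range(1, 51))
sample = simple_random_sample(frame, 10, seed=1)
print(sorted(sample))
```

Drawing without replacement means no unit can appear twice in the sample, which matches picking slips out of a box in the lottery method.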
Systematic random sampling
For a large, scattered, heterogeneous population.
All sampling units are assigned numbers.
A random starting point is chosen first, then every nth unit is chosen.
n is the sampling interval = total population / sample size.
The 1st unit is selected at random (from the first n units) and the others systematically, as every nth unit.
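A minimal sketch of the procedure, assuming a hypothetical frame of 100 numbered units and a sample of 10 (so the sampling interval is 10). The function name and the use of integer division for the interval are illustrative choices.

```python
import random


def systematic_sample(frame, sample_size, seed=None):
    """Pick a random start within the first interval, then every nth unit."""
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and len(frame)")
    interval = len(frame) // sample_size   # sampling interval n
    rng = random.Random(seed)
    start = rng.randrange(interval)        # 0-based position of the 1st unit
    return [frame[start + i * interval] for i in range(sample_size)]


# Hypothetical frame: units numbered 1 to 100, sample of 10 -> interval of 10
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=7)
print(sample)
```

Only the first unit involves a random draw; every later unit is fixed by the interval.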
Stratified random sampling
For a heterogeneous population (when we want to know the distribution according to a particular variable).
First the heterogeneous population is divided into small homogeneous groups, called strata.
From each stratum the required number of sample units is taken by simple or systematic random sampling, in proportion to its original size.
Strata should be mutually exclusive and exhaustive.
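Proportional allocation can be illustrated as follows. The strata names and sizes (600, 300, 100) and the sample size of 50 are made up for the example, and the rounding rule (largest remainder) is one reasonable choice among several.

```python
import random


def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to each stratum's size."""
    total = sum(strata_sizes.values())
    exact = {name: size * sample_size / total for name, size in strata_sizes.items()}
    alloc = {name: int(share) for name, share in exact.items()}  # round down
    leftover = sample_size - sum(alloc.values())
    # Hand any remaining units to the strata with the largest fractional parts
    by_remainder = sorted(exact, key=lambda name: exact[name] - alloc[name], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


def stratified_sample(strata, sample_size, seed=None):
    """Simple random sample inside each stratum, sized proportionally."""
    rng = random.Random(seed)
    alloc = proportional_allocation({k: len(v) for k, v in strata.items()}, sample_size)
    return {name: rng.sample(units, alloc[name]) for name, units in strata.items()}


# Hypothetical population of 1000 split into three strata
strata = {
    "urban": [f"urban-{i}" for i in range(600)],
    "rural": [f"rural-{i}" for i in range(300)],
    "tribal": [f"tribal-{i}" for i in range(100)],
}
alloc = proportional_allocation({k: len(v) for k, v in strata.items()}, 50)
sample = stratified_sample(strata, 50, seed=3)
print(alloc)  # {'urban': 30, 'rural': 15, 'tribal': 5}
```

Because each unit belongs to exactly one stratum, the strata here are mutually exclusive and exhaustive, as the slide requires.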
Cluster sampling
The population of interest is divided into geographically distinct groups (clusters).
Used when the units of the population form natural groups or clusters such as blocks, wards, villages, slums etc.
If related to a geographical area, it is called area sampling.
The 30-cluster sampling technique (30 × 7 sample) was developed by WHO.
1st step: select 30 clusters from the list of all clusters.
2nd step: select 7 interview sites in each selected cluster.
This is two-stage sampling, with primary and secondary sampling units.
Used for evaluation of immunization coverage of districts, attitudes of people towards immunization, contraception, intervention programs etc.
ADVANTAGES: suitable for a large geographical area where no list of households exists; time saving; less costly; a small sample is used (30 × 7 = 210).
DISADVANTAGES: gives a higher standard error than other sampling designs.
Selection of clusters from the primary sampling units:
1. Simple/systematic random sampling
2. Probability proportionate to population size (PPS)
Probability proportionate to population size (PPS)
A list of villages, towns or wards with their respective population/household numbers is prepared.
Say 10 clusters have to be taken out of 30.
The cumulative population of the 30 clusters is calculated and divided by 10 = sampling interval (SI).
One random number equal to or less than the SI is selected from a random number table = random start (RS).
The village/town whose cumulative population equals or exceeds the RS is the 1st cluster.
The remaining clusters are found by adding the SI repeatedly (RS + SI, RS + 2 × SI, ...) and locating each value in the cumulative list in the same way.
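The PPS steps can be sketched as below. The six village names and populations are invented for the example, and the later clusters are located at RS + SI, RS + 2 × SI and so on, which is the usual continuation of the method; the function name and the optional fixed `random_start` argument are assumptions made for illustration.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def pps_select(populations, n_clusters, random_start=None, seed=None):
    """Systematic PPS selection from a dict of {unit name: population}."""
    names = list(populations)
    cumulative = list(accumulate(populations[name] for name in names))
    interval = cumulative[-1] // n_clusters           # sampling interval (SI)
    if random_start is None:
        random_start = random.Random(seed).randint(1, interval)  # RS <= SI
    chosen = []
    for k in range(n_clusters):
        target = random_start + k * interval
        # first unit whose cumulative population equals or exceeds the target
        chosen.append(names[bisect_left(cumulative, target)])
    return interval, random_start, chosen


# Hypothetical villages; cumulative totals: 500, 2000, 3000, 5000, 5700, 6000
villages = {"A": 500, "B": 1500, "C": 1000, "D": 2000, "E": 700, "F": 300}
interval, rs, clusters = pps_select(villages, 3, random_start=1200)
print(interval, rs, clusters)  # 2000 1200 ['B', 'D', 'E']
```

Larger villages cover a wider stretch of the cumulative list, so they are more likely to contain a target value. A village larger than the SI can be hit more than once, in which case it contributes more than one cluster.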
Selection of individual/household
1. Simple one-stage cluster sample: 1st stage: clusters selected; 2nd stage: all units are selected.
2. Simple two-stage cluster sample: 1st stage: clusters selected; 2nd stage: simple or systematic random sampling.
3. Multistage sample: more than 2 stages involved. 1st stage: clusters selected; 2nd stage: stratified clusters; 3rd stage: simple or systematic random sampling.
Immunization coverage survey
Children between 12 and 23 months of age are covered in each cluster.
The survey in each cluster continues until 7 children are found.
Total number of children surveyed: 7 × 30 = 210.
If all 210 children are found fully immunized, immunization coverage = 210/210 × 100 = 100%.
If, say, 150 are found fully immunized, coverage = 150/210 × 100 = 71.4%.
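The coverage arithmetic can be checked with a short sketch. The function name and its default arguments are illustrative assumptions; the numbers are those used on the slide.

```python
def coverage_percent(fully_immunized, clusters=30, children_per_cluster=7):
    """Immunization coverage (%) in a clusters x children_per_cluster survey."""
    surveyed = clusters * children_per_cluster  # 30 x 7 = 210 children
    return 100 * fully_immunized / surveyed


print(round(coverage_percent(210), 1))  # 100.0
print(round(coverage_percent(150), 1))  # 71.4
```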
Multistage sampling
Carried out in several stages, in large country-wide surveys (e.g. anemia or hookworm surveys).
Any type of probability sampling technique can be applied at each stage.
Example: India, then 5 states, then 3 districts, then 2 blocks.
Reduces the workload.
Multiphase sampling
Part of the information is collected from the whole sample and part from a subsample.
Example: 20 fever cases all undergo clinical examination and basic blood tests; those with a high ESR then undergo the Widal / MP test.
Less costly and less laborious.
Lot quality assurance sampling (LQAS)
The technique was developed in the 1920s to control the quality of output in industrial production processes.
In the health sector it is used to identify communities with unacceptably low immunization, worrying levels of disease prevalence etc.
It does not give the exact prevalence, but the probability that a particular area has an inadequate level of immunization or a high disease prevalence.
Whole district = supervision unit.
Each community = supervision area.
A minimum of 19 items is chosen from each supervision area (for an acceptable level of error).
Sample size over all supervision areas = 95 or more.
5-6 supervision areas is ideal.
Can be used to assess binary outcomes only.
Expressed as % of clients who received a service in a defined period of time.
Good = maintain the program at its current level, identify best practices to help other programs.
Below average = identify reasons, develop solutions.
Advantage/disadvantage
SAMPLING BIAS: unless the sampling method ensures that all members of the universe have a chance of selection into the sample, bias is possible. The best way to avoid it is to use probability sampling.
DESIGN EFFECT: a coefficient which reflects how the sampling design affects the computation of significance levels compared to simple random sampling.
A design effect of 1.0 means the sampling design is equivalent to simple random sampling.
A design effect greater than 1.0 means the sampling design reduces the precision of the estimate compared to simple random sampling (e.g. cluster sampling).
A design effect less than 1.0 means the sampling design increases precision compared to simple random sampling (e.g. stratified sampling).
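The design effect is conventionally computed as the variance of an estimate under the actual design divided by its variance under simple random sampling of the same size. The sketch below uses that standard definition with made-up variances, and also shows the related effective sample size; the function names and all numbers are illustrative assumptions.

```python
def design_effect(variance_design, variance_srs):
    """Ratio of the variance under the actual design to that under SRS."""
    return variance_design / variance_srs


def effective_sample_size(sample_size, deff):
    """Size of the simple random sample that would give the same precision."""
    return sample_size / deff


# Hypothetical variances of a coverage estimate
deff = design_effect(variance_design=0.002, variance_srs=0.001)
print(deff)                              # about 2: precision is reduced
print(effective_sample_size(210, deff))  # about 105
```

Read this way, a cluster sample of 210 with a design effect of 2 carries roughly the information of a simple random sample of 105.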
True/false
1. In 30 × 7 cluster sampling, 210 children are surveyed.
2. A sample is a part of the universe.
3. Stratified random sampling is applicable in a heterogeneous population.
4. Sample size in cluster sampling is less than in simple random sampling.
5. Simple random sampling is used for a scattered heterogeneous population.
True about simple random sampling:
1. Every person has an equal chance of selection
2. A smaller sample is obtained
3. Also known as systematic random sampling
4. Groups are not equally distributed
For a survey, a village is divided into 5 lanes, then each lane is sampled randomly. This is an example of:
1. Simple random sampling
2. Systematic random sampling
3. Stratified random sampling
4. All of the above
Which is true of cluster sampling:
1. Every nth case is chosen for the study
2. A natural group is taken as the sampling unit
3. Stratification of the population has been done
4. Involves use of random numbers
Immunization status in an area is checked by:
1. Simple random sampling
2. Systematic random sampling
3. Stratified random sampling
4. Cluster sampling
In a community of 3000 people, 80% are Hindu, 10% Muslim, 5% Sikh, 4% Christian and 1% Jain. To select a sample of 300 people to analyze food habits, the ideal sampling would be:
1. Simple random
2. Stratified random
3. Systematic random
4. Cluster
True about simple random sampling is:
1. Each person has a known and equal chance of being selected
2. Number 2 consecutive members are selected
3. Error most frequent
4. Adjacent samples should not be chosen
The cluster sampling technique used in evaluation of UIP coverage:
1. 20 clusters, 5 children
2. 30 clusters, 5 children
3. 30 clusters, 7 children
4. 30 clusters, 10 children
In a village every 5th house was selected for study. Which type of sampling is this?
1. Simple random
2. Systematic random
3. Stratified random
4. Any of the above
When part of the information is collected from the whole sample and part from a subsample, it is called:
1. Simple random
2. Cluster
3. Multiphase sampling
4. Multistage sampling
All are examples of probability sampling except:
1. Cluster
2. Convenience
3. Sequential
4. Stratified random