SAMPLING TECHNIQUES Dr. Yinka Adeniran, FMCPH Lecturer/Consultant
LEARNING OBJECTIVES
At the end of this lecture, you should be able to:
- Identify and describe the common methods of sampling
- Discuss problems of bias that should be avoided when selecting a sample
- Select the sampling method most appropriate for the research design being developed
SAMPLING This is the selection of one or more study units from a defined study population.
Questions that need to be answered:
- What is the group of people (study population) from which a sample is to be taken?
- How many people need to be included in the sample?
- How will these people be selected?
An ideal sample should be representative of the population from which it is drawn, i.e. it should have all the major characteristics of that population.
STUDY POPULATION
- The study population should be clearly defined, e.g. according to age, sex, and residence.
- Each study population consists of study units.
- A study population could consist of persons, villages, institutions, records, equipment, etc.

Problem to be studied | Study population | Study unit
Immunisation coverage of children 12-24 months of age in Abakaliki | All children 12-24 months of age in Abakaliki | One child 12-24 months of age in Abakaliki
Environmental sanitation in primary schools of Mushin, Lagos | All primary schools in Mushin | One primary school in Mushin
Participation in the NHIS by private health facilities in Yaba LCDA | All private health facilities in Yaba LCDA | One private health facility in Yaba LCDA
SAMPLING METHODS
- Probability sampling methods
- Non-probability sampling methods
Probability sampling methods
- Simple random sampling
- Systematic sampling
- Stratified sampling
- Cluster sampling
- Multistage sampling
NON-PROBABILITY SAMPLING METHODS
- The sampling frame is a listing of all the study units contained within the study population.
- If a sampling frame is not available, the study units cannot be sampled in such a way that the probability of each unit being selected is known.
- In such cases, non-probability sampling techniques are used to take the sample.
Convenience sampling
- For the sake of convenience, the study units that happen to be available at the time of data collection are selected into the sample.
- E.g. interviewing all youths gathered at a street viewing centre within Ilepa village, to determine the attitude of teenagers in the village towards VCT.
- More convenient than taking a random sample of the teenagers in the village
- Gives a useful idea of their views
- However, the sample may not be representative of the village teenagers
Quota Sampling
- This method ensures that a set number of study units from each category with specific characteristics is included in the sample, so that the various characteristics are represented.
- The investigator includes as many people in each category of study unit as can be found until the quota for that category is filled.
- E.g. inclusion of 20 patients each from different religious groups in a study on attitudes towards family planning
- Useful when a convenience sample may not provide the desired balance of study units
- However, the sample may still not be representative of the study population
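The quota-filling rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `quota_sample`, the attendee list, and the category labels "A" and "B" are hypothetical, and the quotas are set to 2 rather than 20 to keep the example short.

```python
def quota_sample(arrivals, quotas):
    """Accept people in the order they are encountered until every quota is filled.

    arrivals: iterable of (person_id, category) pairs, in order of encounter.
    quotas:   dict mapping category -> number of people required.
    """
    counts = {category: 0 for category in quotas}
    selected = []
    for person_id, category in arrivals:
        # Accept the person only if their category still has room.
        if category in quotas and counts[category] < quotas[category]:
            selected.append((person_id, category))
            counts[category] += 1
        if counts == quotas:  # every quota filled, stop recruiting
            break
    return selected


# Hypothetical attendees; "A" and "B" stand in for two religious groups.
arrivals = [(1, "A"), (2, "A"), (3, "B"), (4, "A"), (5, "B"), (6, "B"), (7, "A")]
selected = quota_sample(arrivals, {"A": 2, "B": 2})
print(selected)
```

Note that selection depends entirely on who happens to be encountered first, which is why a quota sample may still be unrepresentative.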
PROBABILITY SAMPLING METHODS
- These are used to select a sample when the aim of the research is to measure variables and generalise the findings obtained to the total study population.
- They involve random selection procedures that ensure that each unit of the sample is selected on the basis of chance.
- All units of the population should have an equal or a known chance of being included in the sample.
- Probability sampling methods require a listing of all the study units within the population to be studied. This list is referred to as the sampling frame.
Simple Random Sampling
The simplest form of probability sampling
Steps:
1. Make a numbered list of all the units in the population from which the sample is to be drawn
2. Decide on the size of the sample
3. Select the required number of sampling units through one of the following methods:
   - Balloting
   - Use of a table of random numbers
E.g. a simple random sample of 50 primary school students from a school population of 250
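The steps above can be sketched in Python, with a pseudo-random number generator standing in for balloting or a table of random numbers (that substitution is an assumption, not part of the lecture). The function name `simple_random_sample` and the `seed` parameter are illustrative.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw unit numbers from a numbered list 1..population_size, without replacement."""
    rng = random.Random(seed)  # seed only makes the draw repeatable
    numbered_list = range(1, population_size + 1)
    return sorted(rng.sample(numbered_list, sample_size))


# The lecture's example: 50 students from a school population of 250.
selected = simple_random_sample(250, 50, seed=1)
print(len(selected), selected[:5])
```

Every student has the same chance (50/250) of being chosen, and no student can be chosen twice.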
Systematic sampling
- Study units are chosen at regular intervals from the sampling frame.
- The interval that is chosen for selection is called the sampling interval.
- The number of the first study unit to be chosen is selected through simple random sampling, and then the sampling interval is applied.
- Sampling fraction = sample size / size of study population
- In the simple random sampling example (50 students from a school population of 250), that would be 50/250 = 1/5
- The sampling interval is the inverse of the sampling fraction, so it would be 5, i.e. every 5th unit on the list is selected
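A minimal sketch of this procedure in Python follows. It assumes the population size is an exact multiple of the sample size (as in the 250/50 example) and that the random start is drawn from the first interval; the function name `systematic_sample` is illustrative.

```python
import random


def systematic_sample(population_size, sample_size, seed=None):
    """Pick a random start within the first interval, then every k-th unit after it."""
    rng = random.Random(seed)
    interval = population_size // sample_size  # sampling interval k, e.g. 250 // 50 = 5
    start = rng.randint(1, interval)           # first unit, chosen at random from 1..k
    return [start + i * interval for i in range(sample_size)]


# 50 students from a numbered list of 250: sampling interval of 5.
selected = systematic_sample(250, 50, seed=1)
print(selected[:5])
```

Only the first unit is random; once it is drawn, the remaining 49 units are fixed by the interval.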
Systematic sampling (II)
Advantages:
- Less time-consuming and easier to carry out than simple random sampling
Disadvantages:
- Risk of bias: the sampling interval may coincide with a systematic variation in the study population
STRATIFIED SAMPLING
- If it is important that the sample includes representation from various groups of study units with specific characteristics, e.g. residents from rural and urban areas, or different classes in a school, then the sampling frame must be divided into groups, or strata, according to these characteristics.
- Samples of a predetermined size are obtained from each stratum within the study population using another probability sampling method.
- Stratified sampling is only possible when the proportion or size of each stratum that makes up the study population is known.
- The sampling fraction could be the same for every stratum, i.e. proportionate, or could differ between strata, i.e. non-proportionate.
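As an illustration of the proportionate case, the sketch below applies the same sampling fraction to each stratum and draws a simple random sample within it. The strata sizes (150 rural, 100 urban), the unit labels, and the function name `proportionate_stratified_sample` are hypothetical.

```python
import random


def proportionate_stratified_sample(strata, sampling_fraction, seed=None):
    """Apply one sampling fraction to every stratum, sampling at random within each."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n = round(len(units) * sampling_fraction)  # sample size for this stratum
        sample[name] = rng.sample(units, n)        # simple random sample within stratum
    return sample


# Hypothetical sampling frame already divided into two strata.
strata = {
    "rural": [f"R{i}" for i in range(1, 151)],  # 150 rural residents
    "urban": [f"U{i}" for i in range(1, 101)],  # 100 urban residents
}
sample = proportionate_stratified_sample(strata, 0.2, seed=1)
print({name: len(units) for name, units in sample.items()})
```

For a non-proportionate design, a different fraction would be supplied for each stratum instead of a single shared one.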
Stratified sampling (II)
Advantages:
- Representation of the various sub-groups or strata of interest within the study population
Disadvantages:
- Unequal sampling fractions may give a distorted picture of the situation when the research findings are generalised to the study population.
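The disadvantage above can be made concrete with a small worked example. All figures are hypothetical: a population of 800 rural and 200 urban residents, with 100 people sampled from each stratum (so the sampling fractions are unequal), and observed coverage of 50% and 90% respectively. Weighting by stratum size is a standard correction, shown here as an assumption rather than something the lecture prescribes.

```python
def pooled_proportion(sampled):
    """Unweighted proportion: pool all sampled units regardless of stratum."""
    positives = sum(n * p for n, p in sampled.values())
    total = sum(n for n, _ in sampled.values())
    return positives / total


def weighted_proportion(sampled, stratum_sizes):
    """Weight each stratum's proportion by its share of the study population."""
    population = sum(stratum_sizes.values())
    return sum(stratum_sizes[s] / population * p for s, (_, p) in sampled.items())


stratum_sizes = {"rural": 800, "urban": 200}            # hypothetical population
sampled = {"rural": (100, 0.50), "urban": (100, 0.90)}  # (sample size, observed proportion)

unweighted = pooled_proportion(sampled)
weighted = weighted_proportion(sampled, stratum_sizes)
print(round(unweighted, 2), round(weighted, 2))
```

Pooling the two samples over-represents the smaller urban stratum and overstates coverage, while the weighted figure restores each stratum to its true share of the population.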
Cluster Sampling
This is the selection of groups of study units (clusters) instead of individual study units.
It is used when:
- A complete sampling frame does not exist
- Sampling units are scattered in groups across a very large area
- The list of groupings of study units can be easily compiled, e.g. villages, communities, schools
Clusters are often geographic units, e.g. villages or communities, or organizational units, e.g. schools or clinics.
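A minimal sketch of cluster selection follows, assuming a one-stage design in which every study unit in a selected cluster is included. The village names, unit labels, and the function name `cluster_sample` are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every unit within them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample from the list of clusters
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Hypothetical list of clusters (villages); only this list is needed up front.
clusters = {
    "Village A": ["A1", "A2", "A3"],
    "Village B": ["B1", "B2"],
    "Village C": ["C1", "C2", "C3", "C4"],
    "Village D": ["D1"],
}
chosen, units = cluster_sample(clusters, 2, seed=1)
print(chosen, len(units))
```

The random draw is made over clusters, not individuals, so a list of individuals is only needed for the clusters that are actually selected.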
Multistage sampling A multistage sampling procedure is carried out in stages or phases, and usually involves more than one sampling method. It is used for community-based studies, usually involving large and diverse populations.
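One possible two-stage design is sketched below: villages are selected by simple random sampling in the first stage, then households are selected by simple random sampling within each chosen village. The choice of methods at each stage, the village and household labels, and the function name `two_stage_sample` are assumptions made for illustration.

```python
import random


def two_stage_sample(villages, n_villages, n_per_village, seed=None):
    """Stage 1: random sample of villages. Stage 2: random sample of households in each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(villages), n_villages)
    sample = {}
    for village in chosen:
        households = villages[village]
        k = min(n_per_village, len(households))  # cannot take more than exist
        sample[village] = rng.sample(households, k)
    return sample


# Hypothetical community: 10 villages with 20 households each.
villages = {
    f"Village {i}": [f"V{i}-H{j}" for j in range(1, 21)] for i in range(1, 11)
}
sample = two_stage_sample(villages, n_villages=3, n_per_village=5, seed=1)
print({village: len(households) for village, households in sample.items()})
```

A household list is needed only for the three selected villages, which is what makes the approach practical for large populations.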
Advantages & disadvantages of cluster and multistage sampling methods
Advantages:
- Less time-consuming and easier to carry out than simple random sampling
- A complete sampling frame for each study unit may not be required
Disadvantages:
- A larger probability than in simple random sampling that the sample will not be representative of the total study population
BIAS IN SAMPLING
- Bias in sampling is a systematic error in the sampling procedure that leads to a distortion in the results of the study.
- It results from improper sampling procedures that leave the sample unrepresentative of the study population.
- Even if probability sampling methods are properly employed, an important source of bias is non-response.
- Non-response may be due to the absence of subjects, or to refusal to respond or cooperate with the interviewer.
- To reduce the effect of non-response, additional people may be included in the sample during selection.
- It is important in any study to mention the non-response rate, and to discuss how it might have affected the results.
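The arithmetic behind including additional people can be sketched as follows. The adjustment used here, dividing the required sample size by the expected response proportion, is a common convention and an assumption on our part; the lecture does not specify a formula. The 20% expected non-response and both function names are hypothetical.

```python
import math


def inflate_for_non_response(required_sample_size, expected_non_response_pct):
    """Number of people to select so that about required_sample_size actually respond."""
    if not 0 <= expected_non_response_pct < 100:
        raise ValueError("expected_non_response_pct must be in the range 0 to <100")
    return math.ceil(required_sample_size * 100 / (100 - expected_non_response_pct))


def non_response_rate(number_selected, number_responding):
    """Proportion of selected people who did not respond, for reporting."""
    return (number_selected - number_responding) / number_selected


# Need 50 respondents and expect 20% non-response (hypothetical figure).
to_select = inflate_for_non_response(50, 20)
rate = non_response_rate(number_selected=63, number_responding=50)
print(to_select, round(rate, 3))
```

Selecting more people protects the achieved sample size, but non-respondents may still differ from respondents, so the non-response rate should be reported and discussed.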
Sources of bias in sampling
- Non-response
- Use of volunteers and other non-probability sampling techniques
- Seasonal bias
- Selection of easily accessible areas as opposed to relatively inaccessible ones