4. Sampling Method.pptx: good for biostatistics and sampling
TeshaleTekle1
68 slides
Sep 14, 2025
Slide Content
Sampling Methods And Sample Size Determination
Sampling Methods
Sampling: the process of selecting a portion of the population to represent the entire population.
▪ Sample surveys are almost never conducted to describe the particular sample under study.
▪ Rather, they are conducted to understand the larger population from which the sample was selected.
Why is sampling needed?
Studying the whole population raises problems of:
▪ Cost in terms of money, time, and manpower
▪ Accessibility
▪ Utility, e.g. to do a diagnostic laboratory test you don't draw all of the patient's blood.
Advantages of sampling
▪ Feasibility: sampling may be the only feasible method of collecting information.
▪ Reduced cost: sampling reduces demands on resources such as finance, personnel, and material.
▪ Greater accuracy: sampling may lead to more accurate data collection.
▪ Greater speed: data can be collected and summarized more quickly.
Disadvantages of sampling
▪ There is always a sampling error.
▪ Sampling may create a feeling of discrimination within the population.
▪ It is not advisable where every unit in the population is legally required to have a record.
▪ Small minority sub-groups often cause the study to be viewed with suspicion (mistrusted).
▪ Sampling bias
Errors in sampling
1) Sampling error: errors introduced by the selection of a sample. They cannot be avoided or totally eliminated.
2) Non-sampling error:
▪ Observational error
▪ Respondent error
▪ Lack of precise definitions
▪ Errors in editing and tabulation of data
Sampling error
▪ Sampling error is the chance, random variation in variables that occurs when any sample is selected from the population.
▪ Sampling error is to be expected.
▪ To avoid sampling error, a census of the entire population must be taken.
▪ To control sampling error, researchers use various sampling methods.
Sampling bias
▪ Non-random differences between the sample and the population, which lead to invalid findings.
▪ Sources of sampling bias include the use of volunteers and available groups.
Classification of Sampling Techniques
There are two types of sampling techniques:
▪ Probability sampling
▪ Non-probability sampling
Probability sampling methods:
▪ Simple random sampling
▪ Systematic sampling
▪ Stratified sampling
▪ Cluster sampling
Probability Sampling
▪ Any method of sampling that uses some form of random selection.
▪ Every sampling unit has a known, non-zero probability of selection into the sample.
▪ The sample is selected from the population based on chance.
Probability sampling is:
▪ More complex, more time-consuming, and usually more costly than non-probability sampling.
▪ However, because study samples are randomly selected and their probability of inclusion can be calculated, reliable estimates can be produced and accurate inferences can be made about the population.
Probability sampling
▪ There are several different ways in which a probability sample can be selected.
▪ The method chosen depends on factors such as the available sampling frame, how spread out the population is, how costly it is to survey members of the population, and the homogeneity of the target population.
Most common probability sampling methods
Simple random sampling (SRS)
To use the SRS method:
▪ Make a numbered list of all the units in the population.
▪ Number each unit from 1 to N (where N is the size of the population).
▪ Randomly select the required number of units (the sample size).
The randomness of the sample is ensured by:
▪ "Lottery" methods
▪ A table of random numbers
▪ Computer programs
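As a rough illustration of the "computer programs" option, the sketch below draws a simple random sample without replacement from a frame numbered 1 to N. The function name `simple_random_sample`, the values N = 400 and n = 100, and the fixed seed are assumptions chosen only for the example; they do not come from the slides.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw an SRS (without replacement) from units numbered 1..N."""
    rng = random.Random(seed)               # seeded only to make the example repeatable
    frame = range(1, population_size + 1)   # the numbered sampling frame
    return sorted(rng.sample(frame, sample_size))

# Hypothetical frame of 400 units, sample of 100
sample = simple_random_sample(population_size=400, sample_size=100, seed=42)
print(len(sample), sample[:5])
```

Every unit has the same chance of selection, and any combination of 100 units can occur.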
Limitations of SRS
▪ Requires a sampling frame.
▪ Difficult if the reference population is dispersed.
▪ Minority subgroups of interest may not be selected.
Systematic random sampling
▪ Sometimes called interval sampling.
▪ Individuals are selected from the sampling frame systematically rather than randomly.
▪ Individual units are taken at regular intervals down the list.
▪ The starting point is chosen at random.
Systematic Random Sampling…
▪ Important if the reference population is arranged in some order, such as the order of registration of patients in a hospital, the numerical order of house numbers, or students' registration books.
▪ Individuals are taken at fixed intervals (every kth) based on the sampling fraction, e.g. if the sample includes 20% of the population, then every fifth individual is taken.
Steps in systematic random sampling
1. Number the units in the population from 1 to N.
2. Decide on the n (sample size) that you want or need.
3. Compute k = N/n, the interval size.
4. Randomly select an integer between 1 and k.
5. Then take every kth unit.
Note: Systematic sampling should not be used when a cyclic repetition is inherent in the sampling frame.
Example
▪ To select a sample of 100 from a population of 400, you would need a sampling interval of 400 ÷ 100 = 4. Therefore, k = 4.
▪ You will need to select one unit out of every four units to end up with a total of 100 units in your sample.
▪ Select a number between 1 and 4 from a table of random numbers.
12/22/2023
▪ If you choose 3, the third unit on your frame would be the first unit included in your sample.
▪ The sample would then consist of the following 100 units: 3 (the random start), 7, 11, 15, 19, ..., 395, 399 (up to N, which is 400 in this case).
Example 1
Using the above example, you can see that with a systematic sample approach there are only four possible samples that can be selected, corresponding to the four possible random starts:
A. 1, 5, 9, 13, ..., 393, 397
B. 2, 6, 10, 14, ..., 394, 398
C. 3, 7, 11, 15, ..., 395, 399
D. 4, 8, 12, 16, ..., 396, 400
Each member of the population belongs to only one of the four samples, and each sample has the same chance of being selected.
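The systematic procedure for N = 400 and n = 100 can be sketched in a few lines. This is a minimal illustration that assumes N is an exact multiple of n, so that k = N/n is an integer; the function name `systematic_sample` and its `start` and `seed` parameters are hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return (start, units) for a systematic sample from units 1..N.

    Assumes N is an exact multiple of n, so the interval k = N / n is an integer.
    """
    k = population_size // sample_size               # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)    # random start between 1 and k
    units = list(range(start, population_size + 1, k))  # every kth unit from the start
    return start, units

# Reproduce the random start of 3 used in the example
start, units = systematic_sample(400, 100, start=3)
print(start, units[:5], units[-2:], len(units))
```

Calling the function with `start` set to 1, 2, 3, or 4 yields the four possible samples A to D.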
▪ The main difference from SRS: with SRS, any combination of 100 units has a chance of making up the sample, while with systematic sampling there are only four possible samples.
▪ Systematic sampling is more appropriate than SRS when a project's budget is tight and simplicity is required in executing the study and understanding its results.
Stratified random sampling
▪ Used when the population is known to be heterogeneous with regard to some factors, and those factors are used for stratification.
▪ A method of probability sampling in which the population is divided into different subgroups and samples are selected from each subgroup.
▪ These subgroups, called strata, are homogeneous and mutually exclusive.
▪ A population can be stratified by any variable that is available for all units prior to sampling (e.g., age, sex, province of residence, income).
Stratified random sampling
▪ Divide the population into non-overlapping groups (i.e., strata) of sizes N1, N2, N3, ..., Ni, such that N1 + N2 + N3 + ... + Ni = N.
▪ A separate sample is taken independently from each stratum, depending on the type of allocation.
▪ Elements within each stratum are homogeneous, but heterogeneous across strata.
▪ A simple random or systematic sample is taken from each stratum.
Why is stratification needed?
▪ It can make the sampling strategy more efficient.
▪ A larger sample is required to get an accurate estimate if a characteristic varies greatly from one unit to another; grouping similar units into strata reduces this variability within each stratum.
Stratified random sampling
There are different sample allocation methods for selecting the sample from each stratum:
1. Proportional allocation: allocating the sample in proportion to the total population of each stratum, using the formula $n_i = \dfrac{N_i}{N} \times n$, where
▪ n = total sample size to be selected
▪ N = total population
▪ Ni = total population of each stratum
▪ ni = sample size from each stratum
2. Equal allocation: allocating an equal sample to each stratum
Exercise: Proportional Allocation
Village    A     B     C     D     Total
HHs        100   150   120   130   500
S. size    ?     ?     ?     ?     60
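One way to work the exercise is to apply proportional allocation, ni = n × Ni / N, to each village and then round to whole households. The sketch below uses the household counts from the table. The rounding rule (largest remainder, so that the allocations still add up to n) and the function name `proportional_allocation` are assumptions for illustration, not something the slides specify.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Allocate total_sample across strata in proportion to stratum size."""
    N = sum(stratum_sizes.values())
    exact = {s: total_sample * Ni / N for s, Ni in stratum_sizes.items()}
    alloc = {s: int(x) for s, x in exact.items()}    # round down first
    leftover = total_sample - sum(alloc.values())    # units still to hand out
    # Assumed rule: give leftover units to the strata with the largest remainders
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

households = {"A": 100, "B": 150, "C": 120, "D": 130}
allocation = proportional_allocation(households, 60)
print(allocation)  # {'A': 12, 'B': 18, 'C': 14, 'D': 16}
```

The exact shares are 12, 18, 14.4, and 15.6, so only villages C and D need rounding.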
Cluster sampling
Cluster sampling is usually chosen because SRS is too expensive to carry out. It is conducted when:
▪ The population is large and scattered.
▪ A complete list of the study population is unavailable.
▪ Travel costs would become expensive if interviewers had to survey people from one end of the country to the other (cluster sampling is most widely used to reduce cost).
Clusters should be similar to one another (homogeneous across clusters), unlike stratified sampling, where the strata differ from one another (heterogeneous across strata).
▪ A cluster sample is a simple random sample of groups or clusters of elements.
▪ It is a useful method when it is difficult or costly to develop a complete list of the population members, or when the population elements are widely dispersed geographically.
▪ Cluster sampling may increase sampling error due to similarities among cluster members.
Example
In a school-based study, we assume students of the same school are homogeneous. We can randomly select sections and include all students of the selected sections only.
Advantages of cluster sampling
▪ Simple, as a complete list of sampling units within the population is not required
▪ Cost reduction
Disadvantages of cluster sampling
▪ Members of a cluster are more likely to be alike than members of different clusters (homogeneous)
▪ The researcher does not have total control over the final sample size.
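The school example can be sketched as a one-stage cluster sample: sections are drawn at random and every student in a chosen section is included. The section names, student IDs, and section sizes below are invented purely for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters clusters and include every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)    # SRS of cluster labels
    members = [m for c in chosen for m in clusters[c]]   # all units in chosen clusters
    return chosen, members

# Hypothetical school: 8 sections with 31 to 38 students each
sections = {
    f"Section {i}": [f"S{i}-{j}" for j in range(1, 31 + i)]
    for i in range(1, 9)
}
chosen, students = cluster_sample(sections, n_clusters=3, seed=1)
print(chosen, len(students))
```

Because the sections differ in size, the number of students ends up depending on which sections are drawn, which is the lack of control over final sample size noted in the list of disadvantages.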
Non-probability sampling
▪ Does not involve random selection.
▪ Does not rely on the rationale of probability theory.
▪ Most of these sampling methods are purposive in nature, because we usually approach the sampling problem with a specific plan in mind.
▪ Every item has an unknown chance of being selected.
▪ Used when it is practically impossible to use probability sampling strategies.
Most common non-probability sampling
▪ Convenience or haphazard sampling
▪ Volunteer sampling
▪ Judgment sampling
▪ Quota sampling
▪ Snowball sampling
Convenience Sampling
▪ Convenience sampling is sometimes referred to as opportunity, haphazard, or accidental sampling.
▪ It is non-probability sampling in which subjects are selected because of their convenient accessibility and proximity to the researcher.
▪ It is not normally representative of the target population, because sample units are only selected if they can be accessed easily and conveniently.
▪ The method is easy to use, but that advantage is greatly offset by the presence of bias.
Volunteer sampling
▪ As the term implies, this type of sampling occurs when people volunteer to be involved in the study.
▪ In trials such as drug testing, for example, it would be difficult and unethical to enlist random participants from the general public.
Judgment (purposive) Sampling
▪ In this approach a sample is taken on the basis of the researcher's knowledge and judgment.
▪ The underlying assumption is that the investigator will select units that are characteristic of the population.
▪ The critical issue here is objectivity: how much can judgment be relied upon to arrive at a typical sample?
Quota Sampling
▪ Sampling is done until a specific number of units (quotas) for various sub-populations has been selected.
▪ An effective sampling method when data are urgently required; it can be conducted without sampling frames.
▪ In many cases where the population has no suitable frame, quota sampling may be the only appropriate sampling method.
Snowball Sampling
▪ Participants are selected by finding one or two participants and then asking them to refer you to others.
▪ To start with, the researcher compiles a short list of sample units from various sources.
▪ Each of these respondents is contacted to provide names of other probable respondents.
▪ Used in studies of respondents who are hard to find.
Question?
8. Sample size determination
Sample size determination
▪ Sample size is the number of study subjects selected to represent a given study population.
▪ It should be sufficient to represent the characteristics of interest of the study population.
▪ In estimating a certain characteristic of a population, sample size calculations are important to ensure that estimates are obtained with the required precision or confidence.
Common questions
▪ "How many subjects should I include in my study?"
▪ Which variables should be included in the sample size calculation?
▪ The calculation should be related to the study's primary outcome variable.
▪ If the study has secondary outcome variables that are considered important, the sample size should also be sufficient for the analysis of these variables.
The eventual sample size is usually a compromise between what is desirable and what is feasible.
Sample size determination depends on:
▪ Objective of the study
▪ Design of the study
▪ Plan for statistical analysis
▪ Accuracy of the measurement to be made
▪ Degree of precision required for generalization
There are three approaches to determine the sample size:
▪ Rules of thumb for determining the sample size
▪ Statistical formula
▪ Confidence interval approach
▪ Hypothesis testing approach
1. Rules of thumb
1. For small populations (N < 100), there is no need for sampling; survey the entire population.
2. The rule of thumb approach: e.g. 5% of the population.
4. Cost basis approach: the number that can be studied depends on the availability of funds.
5. If the population size is around 500, 50% should be sampled.
6. If the population size is around 1500, 20% should be sampled.
7. Beyond a certain point (N = 5000), the population size is almost irrelevant and a sample size of 400 may be adequate.
2. Statistical formula
There are three possible categories of outcome variables:
1. The variable of interest has only two alternative responses: yes/no, dead/alive, vaccinated/not vaccinated, and so on.
2. The outcome variable has multiple, mutually exclusive alternative responses, such as marital status, religion, blood group, and so on.
For these two categories of outcome variables, the data are generally expressed as percentages or rates, so we can use percentages to compute the sample size.
3. Continuous response variables, such as birth weight, age at first marriage, blood pressure, and serum uric acid level, for which numerical measurements are usually made. In this case the data are summarized in the form of means and standard deviations or their derivatives.
Steps in Estimating Sample Size
1. Identify the major study variable.
2. Determine the type of estimate (%, mean, ratio, ...).
3. Indicate the expected frequency of the factor of interest.
4. Decide on the desired precision of the estimate.
5. Decide on the acceptable risk that the estimate will fall outside its real population value.
6. Adjust for population size.
7. Adjust for the expected response rate.
2. Confidence Interval Approach
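Given a confidence interval of the form $\text{estimate} \pm z_{\alpha/2} \times \text{s.e.}$, the absolute precision, denoted by $d$, is given as $d = z_{\alpha/2} \times \text{s.e.}$, where s.e. is the standard error of the estimator of the parameter of interest.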
Estimating a single population mean
The formula requires knowledge of $\sigma$, the population standard deviation of the variable of interest.
Formula: $n = \dfrac{z_{\alpha/2}^{2}\,\sigma^{2}}{d^{2}}$
Example
Suppose that for a certain group of cancer patients, we are interested in estimating the mean age at diagnosis. We would like a 95% CI that is 2.5 years wide. If the population SD is 12 years, how large should our sample be if the population size is large?
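A minimal sketch of this calculation, assuming the standard large-population formula n = (z × σ / d)² and reading "2.5 years wide" as the full width of the interval, so that the margin of error is d = 2.5 / 2 = 1.25 years. The function name `sample_size_mean` is hypothetical, and z is taken from the standard normal distribution (about 1.96 for 95% confidence).

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, d, confidence=0.95):
    """Sample size to estimate a mean within +/- d, for a large population."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    return math.ceil((z * sigma / d) ** 2)               # always round up

# Assumption: a CI 2.5 years wide means d = 1.25 years on either side
n_mean = sample_size_mean(sigma=12, d=2.5 / 2)
print(n_mean)  # 355
```

If 2.5 years were instead read as the margin of error itself (d = 2.5), the same function would give 89, so the interpretation of "wide" matters.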
Sample size required for a single proportion
The formula requires knowledge of p, the proportion in the population possessing the characteristic of interest.
Formula: $n = \dfrac{z_{\alpha/2}^{2}\,p(1-p)}{d^{2}}$
Margin of Error
▪ Labeled ME, e, or d.
▪ The margin of error (d) measures the precision of the estimate.
▪ A small value of d indicates high precision.
▪ It lies in the interval (0%, 5%].
▪ For p close to 50%, d is assumed to be close to 5%.
▪ For smaller values of p, d is assumed to be lower than 5%.
Example
A survey is being planned to determine what proportion of families in a certain area are medically indigent. It is believed that the proportion cannot be greater than 0.35. A 95% confidence interval is desired, with margin of error d = 0.05. What sample size should be selected?
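The calculation can be sketched with the standard large-population formula n = z² × p × (1 − p) / d², taking p = 0.35 as the planning value because it is the largest proportion considered plausible. The function name `sample_size_proportion` is hypothetical, and no adjustment for finite population size or non-response is applied here.

```python
import math
from statistics import NormalDist

def sample_size_proportion(p, d, confidence=0.95):
    """Sample size to estimate a proportion within +/- d, for a large population."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)      # always round up

n_prop = sample_size_proportion(p=0.35, d=0.05)
print(n_prop)  # 350
```

With no prior information about p, the conservative choice p = 0.5 maximizes p(1 − p) and gives 385 for the same d and confidence level.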