4. Sampling Method.pptx good for biostatistics and sampling

TeshaleTekle1 8 views 68 slides Sep 14, 2025


Sampling Methods And Sample Size Determination

Sampling Methods
Sampling: the process of selecting a portion of the population to represent the entire population.
▪ Sample surveys are almost never conducted to describe the particular sample under study.
▪ Rather, they are conducted to understand the larger population from which the sample was selected.

Why is sampling needed?
Studying the whole population is limited by:
▪ Cost in terms of money, time and manpower
▪ Accessibility challenges
▪ Utility: e.g. to run a diagnostic laboratory test, you do not draw all of the patient's blood.

Advantages of sampling
▪ Feasibility: sampling may be the only feasible method of collecting information.
▪ Reduced cost: sampling reduces demands on resources such as finance, personnel, and material.
▪ Greater accuracy: sampling may improve the accuracy of data collection.
▪ Greater speed: data can be collected and summarized more quickly.

Disadvantages of sampling
▪ There is always a sampling error.
▪ Sampling may create a feeling of discrimination within the population.
▪ It is not advisable where every unit in the population is legally required to have a record.
▪ Small numbers in minority sub-groups often cause the study to be mistrusted.
▪ Sampling bias.

Errors in sampling
1) Sampling error: error introduced by the selection of a sample. It cannot be avoided or totally eliminated.
2) Non-sampling error:
▪ Observational error
▪ Respondent error
▪ Lack of precision in definitions
▪ Errors in editing and tabulation of data

Sampling error
▪ Sampling error is the chance, random variation in variables that occurs when any sample is selected from the population.
▪ Sampling error is to be expected.
▪ To avoid sampling error, a census of the entire population must be taken.
▪ To control for sampling error, researchers use various sampling methods.

Sampling bias
▪ Non-random differences which lead to invalid findings.
▪ Sources of sampling bias include the use of volunteers and available groups.

Classification of Sampling Techniques
There are two types of sampling techniques: probability sampling and non-probability sampling.
Probability sampling:
▪ Simple random sampling
▪ Systematic sampling
▪ Stratified sampling
▪ Cluster sampling

Non-probability sampling:
▪ Quota sampling
▪ Judgment sampling
▪ Snowball sampling
▪ Convenience sampling

Probability Sampling
Any method of sampling that utilizes some form of random selection.
▪ Involves random selection of a sample
▪ Every sampling unit has a known and non-zero probability of selection into the sample
▪ Involves the selection of a sample from a population based on chance

Probability sampling is:
▪ More complex, more time-consuming and usually more costly than non-probability sampling.
▪ However, because study samples are randomly selected and their probability of inclusion can be calculated, reliable estimates can be produced and accurate inferences can be made about the population.

Probability sampling
There are several different ways in which a probability sample can be selected. The method chosen depends on a number of factors, such as:
▪ The available sampling frame
▪ How spread out the population is
▪ How costly it is to survey members of the population
▪ Homogeneity of the target population

Most common probability sampling methods
Simple random sampling (SRS)
To use the SRS method:
▪ Make a numbered list of all the units in the population.
▪ Each unit should be numbered from 1 to N (where N is the size of the population).
▪ Select the required number of units (the sample size) at random.

The randomness of the sample is ensured by:
▪ Use of "lottery" methods
▪ Table of random numbers
▪ Computer programs
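
The "computer program" option above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the function name `simple_random_sample`, the `seed` parameter, and the population and sample sizes are assumptions chosen for the example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw unit numbers from a frame numbered 1..N, without replacement."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    frame = range(1, population_size + 1)  # the numbered list of all units
    # random.sample gives every subset of the requested size the same chance
    return sorted(rng.sample(frame, sample_size))

# Hypothetical sizes: a frame of 400 units and a sample of 100
sample = simple_random_sample(population_size=400, sample_size=100, seed=42)
print(len(sample), sample[:5])
```

Passing a fixed seed is useful when the selection has to be documented or reproduced later.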

Limitations of SRS
▪ Requires a sampling frame.
▪ Difficult if the reference population is dispersed.
▪ Minority subgroups of interest may not be selected.

Systematic random sampling
▪ Sometimes called interval sampling
▪ Individuals are selected from the sampling frame systematically rather than randomly
▪ Individual samples are taken at regular intervals down the list
▪ The starting point is chosen at random

Systematic Random Sampling…
Important if the reference population is arranged in some order, like:
▪ Order of registration of patients in hospital
▪ Numerical order of house numbers
▪ Students' registration books
Individuals are taken at fixed intervals (every k-th unit) based on the sampling fraction, e.g. if the sample includes 20%, then every fifth unit.

Steps in systematic random sampling
▪ Number the units in the population from 1 to N
▪ Decide on the n (sample size) that you want or need
▪ k = N/n = the interval size
▪ Randomly select an integer between 1 and k
▪ Then, take every k-th unit
Note: Systematic sampling should not be used when a cyclic repetition is inherent in the sampling frame.
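
The steps above can be sketched as a short Python function. This is an illustrative sketch under stated assumptions: the name `systematic_sample` and its `start` and `seed` parameters are hypothetical, and the interval is taken as the integer part of N/n, which is exact only when N is a multiple of n.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th unit from a frame numbered 1..N, from a random start."""
    k = population_size // sample_size  # interval size k = N/n (integer part)
    if start is None:
        # Random start: an integer between 1 and k inclusive
        start = random.Random(seed).randint(1, k)
    # Take the start, then every k-th unit after it, up to n units
    return list(range(start, population_size + 1, k))[:sample_size]

# Hypothetical sizes: N = 400 and n = 100 give k = 4
units = systematic_sample(population_size=400, sample_size=100, start=3)
print(units[:5], units[-1], len(units))
```

Fixing `start` makes the draw reproducible; leaving it as `None` picks the random start the steps call for.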

Example
To select a sample of 100 from a population of 400, you would need a sampling interval of 400 ÷ 100 = 4. Therefore, k = 4. You will need to select one unit out of every four units to end up with a total of 100 units in your sample. Select a number between 1 and 4 from a table of random numbers.

12/22/2023
If you choose 3, the third unit on your frame would be the first unit included in your sample. The sample might consist of the following units to make up a sample of 100: 3 (the random start), 7, 11, 15, 19, ..., 395, 399 (up to N, which is 400 in this case).

Example 1
Using the above example, you can see that with a systematic sampling approach there are only four possible samples that can be selected, corresponding to the four possible random starts:
A. 1, 5, 9, 13, ..., 393, 397
B. 2, 6, 10, 14, ..., 394, 398
C. 3, 7, 11, 15, ..., 395, 399
D. 4, 8, 12, 16, ..., 396, 400
Each member of the population belongs to only one of the four samples, and each sample has the same chance of being selected.

The main difference is that with SRS any combination of 100 units would have a chance of making up the sample, while with systematic sampling there are only four possible samples. Systematic sampling is more appropriate than SRS when a project's budget is tight and the study needs to be simple to execute and its results simple to understand.
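
The contrast between the two designs can be checked numerically. The sketch below uses N = 400 and n = 100 from the example; the variable names are hypothetical and chosen only for illustration.

```python
import math

N, n = 400, 100
k = N // n  # sampling interval, 4 here

# One systematic sample per possible random start (1..k)
systematic_samples = {start: list(range(start, N + 1, k)) for start in range(1, k + 1)}

# Every unit should belong to exactly one of the k systematic samples
all_units = sorted(u for units in systematic_samples.values() for u in units)
covers_population_once = all_units == list(range(1, N + 1))

# Under SRS, any combination of n units out of N is a possible sample
srs_sample_count = math.comb(N, n)

print(len(systematic_samples), covers_population_once)
print(len(str(srs_sample_count)), "digits in the number of possible SRS samples")
```

`math.comb` requires Python 3.8 or later.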

Stratified random sampling
▪ It is done when the population is known to be heterogeneous with regard to some factors, and those factors are used for stratification.
▪ A method of probability sampling in which the population is divided into different subgroups and samples are selected from each subgroup.
▪ These subgroups are homogeneous, mutually exclusive groups called strata.
▪ A population can be stratified by any variable that is available for all units prior to sampling (e.g., age, sex, province of residence, income, etc.).

Stratified random sampling
▪ Divide the population into non-overlapping groups (i.e., strata) of sizes N1, N2, N3, ..., Ni, such that N1 + N2 + N3 + ... + Ni = N.
▪ A separate sample is taken independently from each stratum, depending on the type of allocation.
▪ Elements within each stratum are homogeneous, but are heterogeneous across strata.
▪ A simple random or a systematic sample is taken from each stratum.

Why is stratification needed?
▪ It can make the sampling strategy more efficient.
▪ A larger sample is required to get a more accurate estimate if a characteristic varies greatly from one unit to another; grouping similar units into strata reduces the variation within each group.

Stratified random sampling
There are different sample allocation methods for selecting the sample from each stratum:
1. Proportional allocation: allocate the sample in proportion to the total population of each stratum, using the formula $n_i = \frac{N_i}{N} \times n$, where
▪ n = total sample size to be selected
▪ N = total population
▪ N_i = total population of stratum i
▪ n_i = sample size from stratum i

2. Equal allocation: allocate an equal sample size to each stratum.
Exercise: proportional allocation
Village    A     B     C     D     Total
HHs        100   150   120   130   500
S. size    ?     ?     ?     ?     60
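
One way to work the exercise is to apply proportional allocation, taking each village's share of the sample as (households in the village ÷ total households) × total sample size. The sketch below uses the household counts from the table; the function name `proportional_allocation` and the simple rounding step are assumptions made for illustration.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Return the unrounded share n_i = (N_i / N) * n for each stratum."""
    N = sum(stratum_sizes.values())
    return {name: total_sample * N_i / N for name, N_i in stratum_sizes.items()}

households = {"A": 100, "B": 150, "C": 120, "D": 130}  # HHs per village, N = 500
exact = proportional_allocation(households, total_sample=60)

# Round to whole households. Simple rounding happens to sum to 60 here,
# but in general the rounded sizes may need a small manual adjustment.
rounded = {village: round(share) for village, share in exact.items()}

print(exact)
print(rounded, sum(rounded.values()))
```

Villages A and B come out as whole numbers (12 and 18); C and D are 14.4 and 15.6 before rounding.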

Cluster sampling
Cluster sampling is used when SRS is too expensive to carry out, for example when:
▪ The population is large and scattered
▪ A complete list of the study population is unavailable
▪ Travel costs would become expensive if interviewers had to survey people from one end of the country to the other (it is most widely used to reduce cost)
The clusters should be similar to one another, unlike stratified sampling, where the strata differ from one another.

▪ A cluster sample is a simple random sample of groups or clusters of elements.
▪ It is a useful method when it is difficult or costly to develop a complete list of the population members, or when the population elements are widely dispersed geographically.
▪ Cluster sampling may increase sampling error due to similarities among cluster members.

Example
In a school-based study, we assume students of the same school are homogeneous. We can randomly select sections and include all students of the selected sections only.
Advantages of cluster sampling
▪ Simple, as a complete list of sampling units within the population is not required
▪ Cost reduction
Disadvantages of cluster sampling
▪ Members of a cluster are more likely to be alike than members of different clusters (homogeneous)
▪ The researcher does not have total control over the final sample size
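
The school example can be sketched as a one-stage cluster sample: sections are drawn at random and every student in a drawn section is included. The section names, student labels, and section sizes below are made up for illustration, and `cluster_sample` is a hypothetical helper.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and include every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # SRS of cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical school: 8 sections of unequal size (31 to 38 students)
sections = {
    f"Section {i}": [f"S{i}-{j}" for j in range(1, 31 + i)]
    for i in range(1, 9)
}

chosen, students = cluster_sample(sections, n_clusters=3, seed=7)
print(chosen, len(students))
```

Because sections differ in size, the number of students depends on which sections are drawn, which illustrates the lack of control over the final sample size noted above.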

Non-probability sampling
▪ Non-probability sampling does not involve random selection.
▪ It is independent of the rationale of probability theory.
▪ Most sampling methods are purposive in nature because we usually approach the sampling problem with a specific plan in mind.
▪ In non-probability sampling, every item has an unknown chance of being selected.
▪ Non-probability sampling strategies are used when it is practically impossible to use probability sampling strategies.

Most common non-probability sampling
▪ Convenience or haphazard sampling
▪ Volunteer sampling
▪ Judgment sampling
▪ Quota sampling
▪ Snowball sampling

Convenience Sampling
▪ Convenience sampling is sometimes referred to as opportunity, haphazard or accidental sampling.
▪ It is non-probability sampling in which subjects are selected because of their convenient accessibility and proximity to the researcher.
▪ It is not normally representative of the target population because sample units are only selected if they can be accessed easily and conveniently.
▪ The method is easy to use, but that advantage is greatly offset by the presence of bias.

Volunteer sampling
▪ As the term implies, this type of sampling occurs when people volunteer to be involved in the study.
▪ In trials such as drug testing, for example, it would be difficult and unethical to enlist random participants from the general public.

Judgment (purposive) Sampling
▪ In this approach a sample is taken on the basis of the researcher's knowledge and judgment.
▪ The underlying assumption is that the investigator will select units that are characteristic of the population.
▪ The critical issue here is objectivity: how much can judgment be relied upon to arrive at a typical sample?

Quota Sampling
▪ Sampling is done until a specific number of units (quotas) for various sub-populations have been selected.
▪ An effective sampling method when data are urgently required; it can be conducted without sampling frames.
▪ In many cases where the population has no suitable frame, quota sampling may be the only appropriate sampling method.

Snowball Sampling
▪ Selecting participants by finding one or two participants and then asking them to refer you to others.
▪ To start with, the researcher compiles a short list of sample units from various sources.
▪ Each of these respondents is contacted to provide names of other probable respondents.
▪ Used in studies of respondents who are rare or hard to find.

Question?

8. Sample size determination

Sample size determination
▪ Sample size is the number of study subjects selected to represent a given study population.
▪ It should be sufficient to represent the characteristics of interest of the study population.
▪ In estimating a certain characteristic of a population, sample size calculations are important to ensure that estimates are obtained with the required precision or confidence.

Common questions
▪ "How many subjects should I include in my study?"
▪ Which variables should be included in the sample size calculation?
▪ The calculation should be related to the study's primary outcome variable.
▪ If the study has secondary outcome variables which are considered important, the sample size should also be sufficient for the analysis of these variables.

The eventual sample size is usually a compromise between what is desirable and what is feasible.

Sample size determination depends on:
▪ Objective of the study
▪ Design of the study
▪ Plan for statistical analysis
▪ Accuracy of the measurement to be made
▪ Degree of precision required for generalization

There are three approaches to determine the sample size:
▪ Rules of thumb for determining the sample size
▪ Statistical formula
▪ Confidence interval approach
▪ Hypothesis testing approach

1. Rules of thumb
1. For small populations (N < 100), there is no need for sampling. Survey the entire population.
2. The rule of thumb approach: e.g. 5% of the population.
4. Cost basis approach: the number that can be studied depends on the availability of funds.
5. If the population size is around 500, 50% should be sampled.
6. If the population size is around 1500, 20% should be sampled.
7. Beyond a certain point (N = 5000), the population size is almost irrelevant and a sample size of 400 may be adequate.

2. Statistical formula
There are three possible categories of outcome variables:
1. The variable of interest has only two alternative responses: yes/no, dead/alive, vaccinated/not vaccinated and so on.
2. The outcome variable has multiple, mutually exclusive alternative responses, such as marital status, religion, blood group and so on.

For these two categories of outcome variables, the data are generally expressed as percentages or rates, so percentages can be used to compute the sample size.

3. Continuous response variables, such as birth weight, age at first marriage, blood pressure and serum uric acid level, for which numerical measurements are usually made. In this case the data are summarized in the form of means and standard deviations or their derivatives.

Steps in Estimating Sample Size
1. Identify the major study variable
2. Determine the type of estimate (%, mean, ratio, ...)
3. Indicate the expected frequency of the factor of interest
4. Decide on the desired precision of the estimate
5. Decide on the acceptable risk that the estimate will fall outside its real population value
6. Adjust for population size
7. Adjust for expected response rate

2. Confidence Interval Approach


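Given a confidence interval of the form estimate $\pm\, z_{\alpha/2} \times \text{s.e.}$, the absolute precision, denoted by d, is given as $d = z_{\alpha/2} \times \text{s.e.}$, where s.e. is the standard error of the estimator of the parameter of interest and $z_{\alpha/2}$ is the standard normal value for the chosen confidence level.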

Estimating a single population mean

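The formula requires knowledge of $\sigma$, the population standard deviation for the variable of interest.
Formula: $n = \left(\frac{z_{\alpha/2}\,\sigma}{d}\right)^2$, where d is the margin of error (half the width of the confidence interval).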

Example
Suppose that for a certain group of cancer patients, we are interested in estimating the mean age at diagnosis. We would like a 95% CI that is 2.5 years wide. If the population SD is 12 years, how large should our sample be if the population size is large?
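
A possible way to work this example, assuming the standard large-population formula n = (z × σ / d)², where d is half the width of the confidence interval (1.25 years here) and z is the standard normal value for the chosen confidence level. The function name `sample_size_for_mean` is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, d, confidence=0.95):
    """n = (z * sigma / d)^2, rounded up to the next whole subject."""
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / d) ** 2)

# A CI 2.5 years wide means a margin of error d = 2.5 / 2 = 1.25 years
n = sample_size_for_mean(sigma=12, d=2.5 / 2)
print(n)  # 355
```

The result is rounded up rather than to the nearest integer, so that the achieved precision is at least the one requested.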

Sample size required for a single proportion
The formula requires knowledge of p, the proportion in the population possessing the characteristic of interest.
Formula: $n = \frac{z_{\alpha/2}^2\, p\,(1-p)}{d^2}$

Margin of Error
▪ Labeled ME, e, or d
▪ The margin of error (d) measures the precision of the estimate
▪ A small value of d indicates high precision
▪ It lies in the interval (0%; 5%]
▪ For p close to 50%, d is assumed to be close to 5%
▪ For smaller values of p, d is assumed to be lower than 5%

Example
A survey is being planned to determine what proportion of families in a certain area are medically indigent. It is believed that the proportion cannot be greater than 0.35. A 95% confidence interval is desired, with margin of error d = 0.05. What sample size should be selected?
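
A possible way to work this example, assuming the standard large-population formula n = z² × p × (1 − p) / d² and taking p = 0.35, the largest value believed possible. The function name `sample_size_for_proportion` is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p, d, confidence=0.95):
    """n = z^2 * p * (1 - p) / d^2, rounded up to the next whole unit."""
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

n = sample_size_for_proportion(p=0.35, d=0.05)
print(n)  # 350
```

If nothing were known about p, using p = 0.5 would give the most conservative (largest) sample size for the same margin of error.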