11-1
STATISTICAL APPROACH
[Flow diagram: Population → (Sampling Method) → Sampled Units → (Measurement) → Sample Data → (Statistical Methods) → Information → Decision Making]
11-2
Sources of Data
▪Primary Sources: The data collector is the one using the
data for analysis
▪Data from a political survey
▪Data collected from an experiment
▪Observed data
▪Secondary Sources: The person performing data analysis
is not the data collector
▪Analyzing census data
▪Examining data from print journals or data published on the
internet.
DCOVA
11-3
Sources of data fall into five categories
◼Data distributed by an organization or
an individual
◼A designed experiment
◼A survey
◼An observational study
◼Data collected by ongoing business
activities
11-4
POPULATION:
Collection of all elements we are
studying and about which we are
trying to draw conclusions.
•Population must be clearly defined.
•May be finite or infinite
•May exist physically or not
•Depends on the problem or objective
being considered
11-5
INFORMATION MAY BE NEEDED ON
•Size of population
•Average value of population characteristic
•Proportion of the population having a particular attribute
•Ratio
INFORMATION CAN BE OBTAINED BY
CENSUS:
•Complete enumeration
•Examination of every element in the population
•Time consuming and costly
SAMPLE: A part of the population selected in order to draw
inferences about the whole population.
11-9
PROBABILITY SAMPLING
•Elements are selected from the population according to known probabilities
•Selection bias is avoided
•Sample data can be evaluated using statistical methods
•Margin of error can be computed
•Precision of estimates can be ensured by a proper choice of sample size
NON PROBABILITY SAMPLING
In a nonprobability sample, items included are chosen without
regard to their probability of occurrence
11-10
Convenience Sampling
Convenience sampling attempts to obtain a
sample of convenient elements. Often, respondents
are selected because they happen to be in the right
place at the right time.
◼use of students, and members of social
organizations
◼mall intercept interviews without qualifying the
respondents
◼“people on the street” interviews
11-11
CONVENIENCE SAMPLING
◼Select units that are convenient to
contact or happen to be available at the
time of sampling.
•Patients visiting a clinic on a given day
•Vehicles passing through a particular
point
•Customers visiting a supermarket between
11:00 and 11:30 A.M.
11-12
Judgmental Sampling
Judgmental sampling is a form of convenience
sampling in which the population elements are
selected based on the judgment of the researcher.
◼test markets
◼purchase engineers selected in industrial
marketing research
◼expert witnesses used in court
◼HR manager
11-13
Quota Sampling
Quota sampling may be viewed as two-stage restricted judgmental
sampling.
◼The first stage consists of developing control categories, or quotas,
of population elements.
◼In the second stage, sample elements are selected based on
convenience or judgment.
Control           Population composition   Sample composition
Characteristic    Percentage               Percentage   Number
Sex
  Male            48                       48           480
  Female          52                       52           520
  Total           100                      100          1000
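The quota numbers in the table follow directly from applying the population percentages to the total sample size. A minimal sketch of that arithmetic, assuming a hypothetical helper named `quota_counts` (not from the slides):

```python
def quota_counts(population_pct, sample_size):
    """Return how many sample elements to recruit in each control category.

    population_pct: mapping of category -> percentage of the population.
    sample_size: total number of elements in the quota sample.
    """
    return {category: round(sample_size * pct / 100)
            for category, pct in population_pct.items()}


# Percentages and total sample size taken from the table above.
quotas = quota_counts({"Male": 48, "Female": 52}, sample_size=1000)
print(quotas)  # {'Male': 480, 'Female': 520}
```

Within each quota the interviewer is still free to pick respondents by convenience or judgment, so this fixes only the composition of the sample, not how individuals are chosen.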
11-14
JUDGEMENT SAMPLING OR SAMPLING BY OPINION
-Someone who is well acquainted with the population
decides which units to include (typical units)
QUOTA SAMPLING
Judgment sampling with the constraint that the sample
includes a minimum number of units from each subgroup of the
population
Example: Consumer Preferences for ice creams
Children 400
College Students 300
Working 200
Retired 100
11-15
Snowball Sampling
In snowball sampling, an initial group of
respondents is selected, usually at random.
◼After being interviewed, these respondents are
asked to identify others who belong to the target
population of interest.
◼Subsequent respondents are selected based on
the referrals.
11-17
Simple Random Sampling
◼Each element in the population has a known and
equal probability of selection.
◼Each possible sample of a given size (n) has a known
and equal probability of being the sample actually
selected.
◼This implies that every element is selected
independently of every other element.
11-18
Probability Sample:
Simple Random Sample
◼Selection may be with replacement
(selected individual is returned to frame
for possible reselection) or without
replacement (selected individual isn’t
returned to the frame).
◼Samples are obtained from a table of random
numbers or a computer random number
generator.
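The difference between the two selection schemes can be sketched with Python's standard `random` module. The frame of 20 unit IDs, the sample size of 5, and the seed are illustrative assumptions, not values from the slides:

```python
import random

# Hypothetical sampling frame: unit IDs 1..20.
frame = list(range(1, 21))
rng = random.Random(42)  # fixed seed only so the sketch is reproducible

# Without replacement: a selected unit is not returned to the frame,
# so no unit can appear twice in the sample.
without_replacement = rng.sample(frame, k=5)

# With replacement: each draw is made from the full frame,
# so the same unit may be selected more than once.
with_replacement = rng.choices(frame, k=5)

print(without_replacement)
print(with_replacement)
```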
11-19
◼Example: Population = {A, B, C, D, E}
◼N = 5, n = 2
◼Possible samples without replacement (10 in all):
AB, AC, AD, AE, BC, BD, BE, CD, CE, DE
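The list of possible samples can be checked by enumeration. A short sketch using the standard library; under simple random sampling each of these samples is equally likely, so each has probability 1/10:

```python
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D", "E"]
n = 2

# All unordered samples of size n drawn without replacement.
samples = ["".join(s) for s in combinations(population, n)]
print(samples)
# ['AB', 'AC', 'AD', 'AE', 'BC', 'BD', 'BE', 'CD', 'CE', 'DE']

# The count agrees with the binomial coefficient C(5, 2) = 10.
print(len(samples), comb(len(population), n))  # 10 10

# Under SRS every possible sample has the same selection probability.
probability_per_sample = 1 / len(samples)
```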
11-20
Selection of Simple Random Sample
Requires a listing of all elements of the population (sampling
frame)
Elements are selected one at a time
Methods used:
Using slips of paper
Using random numbers
11-23
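An auditor wishes to sample 20 sales receipts from a
population of 1000 receipts issued during the day
k = sampling interval
= N/n = 1000/20 = 50
After a random start among the first 50 receipts, every 50th receipt is selected (systematic sampling).
Applicable when:
No listing of the population is available
Units have a natural ordering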
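The receipt example can be sketched as a systematic selection with interval k = 50. The function name `systematic_sample`, the 1-based receipt numbering, and the seed are assumptions made for illustration:

```python
import random


def systematic_sample(N, n, rng):
    """Select n positions out of N ordered units at a fixed interval.

    Assumes N is a multiple of n, as in the receipts example.
    """
    k = N // n                  # sampling interval
    start = rng.randint(1, k)   # random start within the first interval
    return [start + i * k for i in range(n)]


rng = random.Random(7)  # fixed seed only for reproducibility
receipts = systematic_sample(N=1000, n=20, rng=rng)
print(receipts)  # 20 receipt numbers spaced exactly 50 apart
```

Only the starting point is random; once it is chosen, the rest of the sample is determined by the ordering of the units.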
11-24
Stratified Sampling
◼A two-step process in which the population is
partitioned into subpopulations, or strata.
◼The strata should be mutually exclusive and
collectively exhaustive in that every population
element should be assigned to one and only one
stratum and no population elements should be
omitted.
◼Next, elements are selected from each stratum by a
random procedure, usually SRS.
◼A major objective of stratified sampling is to increase
precision without increasing cost.
11-25
Stratified Sampling
Simple random sampling is applicable if we
have a homogeneous population
In the case of a heterogeneous population, SRS gives
imprecise estimates.
In the case of stratified sampling:
Divide the heterogeneous population into
subgroups called strata such that each
stratum has small variation within itself.
Variation between strata is high.
From each stratum, a separate random sample is
selected
11-27
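Sample Size
n = Total sample size
Sample size from stratum i is proportional to stratum size:
$n_i = n \cdot \dfrac{N_i}{N}$
Example: $N_1 = 4000,\ N_2 = 2400,\ N_3 = 1600,\ N = 8000,\ n = 300$
$n_1 = 300 \cdot \dfrac{4000}{8000} = 150$
$n_2 = 300 \cdot \dfrac{2400}{8000} = 90$
$n_3 = 300 \cdot \dfrac{1600}{8000} = 60$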
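Proportional allocation can be sketched in a few lines, using the stratum sizes 4000, 2400, 1600 and total sample size 300 from the slide. The function name `proportional_allocation` is hypothetical:

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample of size n to strata in proportion to their sizes."""
    N = sum(stratum_sizes)
    return [round(n * N_i / N) for N_i in stratum_sizes]


allocation = proportional_allocation([4000, 2400, 1600], n=300)
print(allocation)  # [150, 90, 60]
```

In this example the allocations are whole numbers. In general the rounded values may not add up to exactly n, and a small adjustment to one stratum is then needed.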
11-28
Probability Sample
Cluster Sample
◼Population is divided into several “clusters,” each representative of
the population
◼A simple random sample of clusters is selected
◼All items in the selected clusters can be used, or items can be
chosen from a cluster using another probability sampling technique
◼A common application of cluster sampling involves election exit polls,
where certain election districts are selected and sampled.
[Figure: population divided into 16 clusters; randomly selected clusters form the sample]
11-29
Cluster Sampling
◼The target population is first divided into mutually exclusive and
collectively exhaustive subpopulations, or clusters.
◼Then a random sample of clusters is selected, based on a
probability sampling technique such as SRS.
◼For each selected cluster, either all the elements are included in
the sample (one-stage) or a sample of elements is drawn
probabilistically (two-stage).
◼Elements within a cluster should be as heterogeneous as
possible, but clusters themselves should be as homogeneous as
possible. Ideally, each cluster should be a small-scale
representation of the population.
◼In probability proportionate to size sampling, the clusters
are sampled with probability proportional to size. In the second
stage, the probability of selecting a sampling unit in a selected
cluster varies inversely with the size of the cluster.
11-30
CLUSTER SAMPLING
•Divide the population into groups or clusters
•Select a random sample of clusters
•Survey all units in the selected cluster
For operational convenience and cost reduction, we
group nearby units to form clusters.
Data on nearby units can be collected more easily,
cheaply, and quickly
Example: 100 items are supplied in 10 cartons,
each containing 10 items
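The carton example can be sketched as one-stage cluster sampling: pick cartons at random, then inspect every item in the chosen cartons. The item numbering and the choice of 2 cartons are assumptions for illustration only:

```python
import random

rng = random.Random(1)  # fixed seed only for reproducibility

# 10 cartons (clusters) of 10 items each; item IDs 1..100 are hypothetical.
cartons = {c: list(range(c * 10 + 1, c * 10 + 11)) for c in range(10)}

# Stage 1: simple random sample of clusters (2 cartons assumed here).
chosen_cartons = rng.sample(sorted(cartons), k=2)

# One-stage cluster sampling: survey all units in each selected cluster.
sample = [item for c in chosen_cartons for item in cartons[c]]

print(chosen_cartons)
print(sample)  # 20 items, all from the two selected cartons
```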
11-31
PPS SAMPLING
Units do not have an equal chance of selection.
Different units have different probabilities of selection
Probability of selection ∝ size of unit
Example:
Unit No.   No. of workers
1          200
2          150
3          100
4          50
Estimate the probability of selection of each unit
Select a sample of 2 units using PPS
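One way to work the example: each unit's selection probability is its size divided by the total number of workers (500). The sketch below computes those probabilities and then draws 2 units by successive size-weighted draws without replacement. That drawing scheme is only one simple option and is an assumption here, since the slides do not say which PPS procedure is intended; `pps_sample` is a hypothetical helper.

```python
import random

# Unit number -> number of workers, from the example above.
workers = {1: 200, 2: 150, 3: 100, 4: 50}
total = sum(workers.values())  # 500

# Single-draw selection probability proportional to size.
probabilities = {unit: size / total for unit, size in workers.items()}
print(probabilities)  # {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}


def pps_sample(sizes, n, rng):
    """Draw n distinct units, each draw weighted by size among remaining units."""
    remaining = dict(sizes)
    chosen = []
    for _ in range(n):
        units = list(remaining)
        weights = [remaining[u] for u in units]
        pick = rng.choices(units, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # without replacement
    return chosen


selected = pps_sample(workers, n=2, rng=random.Random(3))
print(selected)
```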
11-33
Examine validity of Survey Results
◼Evaluate the purpose of the survey: why
it was conducted and for whom the data were
collected.
◼Check whether it is based on a probability or a
nonprobability sample.
◼Examine the types of errors present
11-34
Evaluating Survey Worthiness
◼What is the purpose of the survey?
◼Is the survey based on a probability
sample?
◼Coverage error – appropriate frame?
◼Nonresponse error – follow up
◼Measurement error – good questions
elicit good responses
◼Sampling error – always exists
11-35
◼Coverage error: occurs when certain groups of
items are excluded from the sampling
frame. They have no chance of being
selected. It results in selection bias.
◼Nonresponse error: failure to collect data
on all items in the sample.
◼Sampling error: variation from sample
to sample because of chance.
It is described by the margin of error and the confidence level.
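Sampling error can be seen by drawing several samples from the same population and comparing the estimates. A minimal simulation sketch, assuming a hypothetical population of 10,000 units of which exactly half have the attribute of interest, and samples of size 100:

```python
import random

rng = random.Random(0)  # fixed seed only for reproducibility

# Hypothetical population: 1 = has the attribute, 0 = does not (true proportion 0.5).
population = [1] * 5000 + [0] * 5000

# Draw 5 independent simple random samples of size 100 and record
# the sample proportion from each one.
sample_proportions = [sum(rng.sample(population, 100)) / 100 for _ in range(5)]

print(sample_proportions)
```

The five proportions generally differ from one another and from 0.5 even though every sample is a valid probability sample; that chance variation is the sampling error.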
11-36
◼Measurement error: arises because the survey relies on
self-reported information. It depends on the mode of data
collection, the respondent, and the
questionnaire used.
◼Any vague question or flaw in the
questionnaire adds to it.
11-37
Types of Survey Errors
◼Coverage error or selection bias
◼Exists if some groups are excluded from the frame and have
no chance of being selected
◼Non response error or bias
◼People who do not respond may be different from those who
do respond
◼Sampling error
◼Variation from sample to sample will always exist
◼Measurement error
◼Due to weaknesses in question design, respondent error, and
interviewer’s effects on the respondent (“Hawthorne effect”)
11-38
Types of Survey Errors
◼Coverage error: excluded from frame
◼Non response error: follow up on nonresponses
◼Sampling error (ME): random differences from sample to sample
◼Measurement error: bad or leading question
11-39
Old Sampling Problem
◼For the 1936 US Presidential Election, the
Literary Digest (a magazine) conducted the
largest poll ever
◼It sent 10 million ballots to people across the
country.
◼After receiving and tabulating 2.3 million
ballots, it confidently predicted that Alf
Landon would be an easy winner over Franklin
D. Roosevelt.
11-40
Old Sampling Problem
◼Final results: FDR won in a landslide
◼Landon won only 8 electoral votes (Maine and Vermont), among the fewest ever for a major-party candidate
◼It was a watershed event in the history of sample surveys
and polls
◼This failure opened the door to new, modern sampling
methods.
◼Analysis revealed coverage error: ballots were sent mostly to wealthier
people, drawn from lists of automobile and telephone owners.
◼Nonresponse error: the response rate was only 23% (2.3 million of 10 million ballots), below 25%