sampling for statistics and population.ppt

rahulborate14 22 views 35 slides Aug 20, 2024

About This Presentation

Sampling for statistics


Slide Content

Sampling and Sampling Distributions
Aims of Sampling
Probability Distributions
Sampling Distributions
The Central Limit Theorem
Types of Samples

Aims of sampling
Reduces the cost of research (e.g., political
polls)
Lets us generalize about a larger population (e.g.,
the benefits of sampling a whole city rather than one neighborhood)
In some cases (e.g., industrial production)
analysis may be destructive, so sampling
is needed

Probability
Probability: what is the chance that a given
event will occur?
Probability is expressed in numbers
between 0 and 1. Probability = 0 means
the event never happens; probability = 1
means it always happens.
The probabilities of all possible outcomes
always sum to 1.

Probability distributions: Permutations
What is the probability distribution of number
of girls in families with two children?
Num. girls   Children
2            G G
1            B G
1            G B
0            B B

Probability Distribution of Number of Girls
[Bar chart: x-axis, number of girls (0, 1, 2); y-axis, probability (0 to 0.6)]

How about a family of three?
Num. girls   Child #1   Child #2   Child #3
0            B          B          B
1            B          B          G
1            B          G          B
1            G          B          B
2            B          G          G
2            G          B          G
2            G          G          B
3            G          G          G
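
The two tables above can be reproduced by listing every possible birth order and counting the girls in each. The following is a minimal sketch, assuming boys and girls are equally likely and births are independent (an assumption the slides imply but do not state); the function name `girls_distribution` is introduced here for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def girls_distribution(num_children):
    """Return {number of girls: probability} by listing every birth order.

    Assumes each child is a boy or a girl with equal probability,
    so every birth order is equally likely.
    """
    outcomes = list(product("BG", repeat=num_children))  # e.g. ('B', 'G', 'B')
    counts = Counter(outcome.count("G") for outcome in outcomes)
    return {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}


two_children = girls_distribution(2)
three_children = girls_distribution(3)

for label, dist in (("two", two_children), ("three", three_children)):
    print(f"Family of {label}:")
    for girls, prob in dist.items():
        print(f"  {girls} girls: {prob} = {float(prob):.3f}")
```

Exact fractions are used so that the probabilities visibly sum to 1, matching the rule stated on the Probability slide.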

Probability distribution of number of girls
[Bar chart: x-axis, number of girls (0, 1, 2, 3); y-axis, probability (0 to 0.5)]

How about a family of 10?
[Bar chart: x-axis, number of girls (0 to 10); y-axis, probability (0 to 0.3)]
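
Listing all 1,024 birth orders for ten children is impractical by hand, but the same probabilities follow from the binomial formula, P(k) = C(n, k) p^k (1 - p)^(n - k). The sketch below assumes p = 0.5 for a girl, as in the smaller examples; `binomial_pmf` is a name chosen here, not one from the slides.

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes (girls) in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


family_of_ten = [binomial_pmf(k, 10) for k in range(11)]

for girls, prob in enumerate(family_of_ten):
    # A crude text bar chart: one '#' per percentage point.
    print(f"{girls:2d} girls: {prob:.4f} {'#' * round(prob * 100)}")
```

The printed bars peak at five girls and fall off symmetrically on both sides, which is the bell-like shape the next slide refers to.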

As family size increases, the binomial
distribution looks more and more normal.
[Two histograms of number of successes: one with x-axis 0 to 3, one with x-axis 0 to 10]

Normal distribution
Same shape, if you adjusted the scales
[Figure: normal curves labeled A, B, and C]

Coin toss
Toss a coin 30 times
Tabulate results

Coin toss
Suppose this were 12 randomly selected
families, and heads were girls
If you did it enough times, the distribution of results would
approximate the “Normal” distribution
Think of the coin tosses as samples of all
possible coin tosses
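
The coin-toss exercise can also be run as a simulation. This is a rough sketch, assuming a fair coin and using a fixed seed so the run is repeatable; the number of repetitions (5,000) is an arbitrary choice made here, not a figure from the slides.

```python
import random
from collections import Counter


def toss_coins(num_tosses, rng):
    """Return the number of heads in num_tosses tosses of a fair coin."""
    return sum(rng.random() < 0.5 for _ in range(num_tosses))


rng = random.Random(0)  # fixed seed: an assumption made for reproducibility

# One run of the classroom exercise: toss a coin 30 times.
single_run = toss_coins(30, rng)
print(f"Heads in one run of 30 tosses: {single_run}")

# Repeat the exercise many times and tabulate the number of heads per run.
repeated_runs = Counter(toss_coins(30, rng) for _ in range(5000))
for heads in sorted(repeated_runs):
    print(f"{heads:2d} heads: {'#' * (repeated_runs[heads] // 20)}")
```

Each run of 30 tosses plays the role of one sample; the tabulated counts across runs form an empirical version of a sampling distribution.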

Sampling distribution
Sampling distribution of the mean – A
theoretical probability distribution of sample
means that would be obtained by drawing from
the population all possible samples of the same
size.

Central Limit Theorem
No matter what we are measuring, the
distribution of the sample mean across all
possible samples we could take
approximates a normal distribution, as
long as the number of cases in each
sample is about 30 or larger (a common rule of thumb).

Central Limit Theorem
If we repeatedly drew samples from a
population and calculated the mean of a
variable or a percentage, those sample
means or percentages would be approximately normally
distributed.
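
A small simulation can make this concrete. The sketch below assumes a deliberately skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here for illustration and not taken from the slides), draws many samples of 30 cases each, and summarizes the resulting sample means.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed: an assumption made for reproducibility
SAMPLE_SIZE = 30         # the rule-of-thumb sample size from the slide
NUM_SAMPLES = 2000       # arbitrary number of repeated samples


def draw_sample_mean(rng, sample_size):
    """Draw one sample from a skewed population and return its mean."""
    sample = [rng.expovariate(1.0) for _ in range(sample_size)]
    return mean(sample)


sample_means = [draw_sample_mean(rng, SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)

print(f"Mean of sample means: {mean_of_means:.3f} (population mean is 1)")
print(f"S.D. of sample means: {sd_of_means:.3f} "
      f"(theory suggests about {1 / SAMPLE_SIZE ** 0.5:.3f})")
```

Although individual draws are strongly skewed, the sample means cluster symmetrically around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size.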

Most empirical distributions are not normal:
U.S. Income distribution 1992

But the sampling distribution of mean income over
many samples is normal
Sampling Distribution of Income, 1992 (thousands)
[Histogram: x-axis, mean income in thousands (18 to 26); y-axis, number of samples]

Standard Deviation
Measures how spread out a distribution is.
Square root of the sum of the squared deviations of each
case from the mean over the number of cases, or
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}} \]

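Example of Standard Deviation
Amount (X)   Mean (X̄)   Deviation from mean (X - X̄)   Squared deviation (X - X̄)^2
600          435         600 - 435 = 165                 27,225
350          435         350 - 435 = -85                 7,225
275          435         275 - 435 = -160                25,600
430          435         430 - 435 = -5                  25
520          435         520 - 435 = 85                  7,225
Total                    0                               67,300

\[ s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{67{,}300}{4}} = \sqrt{16{,}825} = 129.71 \]
Note that the example divides by n - 1 = 4, the usual adjustment when the cases are a sample rather than the whole population.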
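
The worked example can be checked step by step in code. This sketch uses the five amounts from the table and compares the hand calculation with the `stdev` (divide by n - 1) and `pstdev` (divide by N) functions from Python's standard `statistics` module.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

amounts = [600, 350, 275, 430, 520]  # the five amounts from the example table

x_bar = mean(amounts)                         # 435
deviations = [x - x_bar for x in amounts]     # 165, -85, -160, -5, 85
sum_sq = sum(d ** 2 for d in deviations)      # 67,300

# Sample standard deviation: divide by n - 1, as the worked example does.
s = sqrt(sum_sq / (len(amounts) - 1))

print(f"Mean: {x_bar}")
print(f"Sum of deviations: {sum(deviations)}")
print(f"Sum of squared deviations: {sum_sq}")
print(f"s (n - 1 in the denominator): {s:.2f}")  # 129.71
print(f"statistics.stdev agrees: {stdev(amounts):.2f}")
print(f"Dividing by N instead (pstdev): {pstdev(amounts):.2f}")
```

The deviations always sum to zero, which is why they are squared before being added up.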

Standard Deviation and Normal Distribution

Distribution of Sample Means with 21 Samples
[Histogram: x-axis, sample means (37 to 46); y-axis, frequency (0 to 10)]
S.D. = 2.02
Mean of means = 41.0
Number of means = 21

Distribution of Sample Means with 96 Samples
[Histogram: x-axis, sample means (37 to 46); y-axis, frequency (0 to 14)]
S.D. = 1.80
Mean of means = 41.12
Number of means = 96

Distribution of Sample Means with 170 Samples
[Histogram: x-axis, sample means (37 to 46); y-axis, frequency (0 to 30)]
S.D. = 1.71
Mean of means = 41.12
Number of means = 170

The standard deviation of the sampling
distribution is called the standard error

The Central Limit Theorem
Standard error can be estimated from a single sample:
\[ SE = \frac{s}{\sqrt{n}} \]
where
s is the sample standard deviation (i.e., the
sample-based estimate of the standard deviation of the
population), and
n is the size (number of observations) of the sample.
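
As a quick illustration, the standard error can be computed for the five amounts used in the standard deviation example earlier in the deck. This is only a sketch: `standard_error` is a helper name chosen here, and a sample of five is far smaller than one would normally rely on.

```python
from math import sqrt
from statistics import stdev


def standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))


amounts = [600, 350, 275, 430, 520]  # data from the standard deviation example
se = standard_error(amounts)

print(f"s = {stdev(amounts):.2f}, n = {len(amounts)}")
print(f"Estimated standard error = {se:.2f}")  # 129.71 / sqrt(5) = 58.01
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.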

Confidence intervals
Because we know that the sampling distribution is normal,
we know that 95.45% of sample means will fall within two
standard errors of the population mean.
95% of sample means fall within 1.96
standard errors.
99% of sample means fall within
2.58 standard errors.
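
The multipliers above come from the standard normal distribution and can be checked with `statistics.NormalDist` from the Python standard library. The sketch below also builds an interval from a sample mean of 41.0 and a standard error of 2.0; both numbers are hypothetical values chosen for illustration. For small samples a t-based multiplier is usually preferred over the normal one, which this sketch does not cover.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Share of a normal distribution lying within two standard errors of the mean.
coverage_two_se = z.cdf(2) - z.cdf(-2)

# Multipliers that leave 5% and 1% in the two tails combined.
z_95 = z.inv_cdf(0.975)
z_99 = z.inv_cdf(0.995)


def confidence_interval(sample_mean, standard_error, multiplier):
    """Return (lower, upper) bounds: mean plus or minus multiplier * SE."""
    margin = multiplier * standard_error
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample mean and standard error, for illustration only.
low, high = confidence_interval(41.0, 2.0, 1.96)

print(f"Within 2 SE: {coverage_two_se:.2%}")
print(f"95% multiplier: {z_95:.2f}, 99% multiplier: {z_99:.2f}")
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```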

Sampling
Population – A group that includes all the
cases (individuals, objects, or groups) in
which the researcher is interested.
Sample – A relatively small subset from a
population.

Random Sampling
Simple Random Sample – A sample
designed in such a way as to ensure that
(1) every member of the population has
an equal chance of being chosen and (2)
every combination of N members has an
equal chance of being chosen.
This can be done using a computer,
calculator, or a table of random numbers
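
On a computer, a simple random sample can be drawn with `random.sample` from the Python standard library, which selects without replacement so that every subset of the requested size is equally likely. The population of 1,000 numbered members and the sample size of 50 are hypothetical values used only for this sketch.

```python
import random

# Hypothetical sampling frame: members numbered 1 to 1000.
population = list(range(1, 1001))

rng = random.Random(7)  # fixed seed: an assumption made for reproducibility

# Draw 50 members without replacement.
simple_random_sample = rng.sample(population, 50)

print(sorted(simple_random_sample))
```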

Population inferences can be made...

...by selecting a representative sample from
the population

Random Sampling
Systematic random sampling – A method
of sampling in which every Kth member (K is
a ratio obtained by dividing the population
size by the desired sample size) in the total
population is chosen for inclusion in the
sample after the first member of the sample
is selected at random from among the first K
members of the population.
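
The procedure can be sketched in a few lines. The example assumes a hypothetical list of 1,000 numbered members and a desired sample of 50, so K = 1000 / 50 = 20; `systematic_sample` is a helper name introduced here.

```python
import random


def systematic_sample(population, sample_size, rng):
    """Select every Kth member after a random start among the first K."""
    k = len(population) // sample_size  # sampling interval K
    start = rng.randrange(k)            # random position among the first K members
    # Take every Kth member from the start; trim in case N is not a multiple of n.
    return population[start::k][:sample_size]


# Hypothetical sampling frame: members numbered 1 to 1000.
population = list(range(1, 1001))
rng = random.Random(3)  # fixed seed: an assumption made for reproducibility

chosen = systematic_sample(population, 50, rng)
print(f"K = {len(population) // 50}, first member chosen: {chosen[0]}")
print(chosen[:5], "...")
```

Only the starting point is random; once it is fixed, the rest of the sample is fully determined, so any periodic pattern in the list order with a period related to K can bias the result.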

Systematic Random Sampling

Stratified Random Sampling
Proportionate stratified sample – The size
of the sample selected from each subgroup is
proportional to the size of that subgroup in
the entire population. (Self weighting)
Disproportionate stratified sample – The
size of the sample selected from each
subgroup is disproportional to the size of that
subgroup in the population. (needs weights)
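
The difference between the two designs shows up in the weights. In the sketch below, the three strata and their sizes are hypothetical, as are the helper names; each weight is the number of population members that one sampled case represents (stratum size divided by the number sampled from that stratum).

```python
def proportionate_allocation(stratum_sizes, sample_size):
    """Allocate the sample to strata in proportion to their population size.

    Rounding can make the total differ slightly from sample_size when the
    proportions do not divide evenly; they do divide evenly in this example.
    """
    total = sum(stratum_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in stratum_sizes.items()}


def design_weights(stratum_sizes, allocation):
    """Weight for each stratum: population members per sampled case."""
    return {name: stratum_sizes[name] / allocation[name] for name in stratum_sizes}


# Hypothetical population of 1,000 split into three strata.
stratum_sizes = {"urban": 600, "suburban": 300, "rural": 100}

proportionate = proportionate_allocation(stratum_sizes, 100)
proportionate_weights = design_weights(stratum_sizes, proportionate)

# A disproportionate design that oversamples the small rural stratum.
disproportionate = {"urban": 40, "suburban": 30, "rural": 30}
disproportionate_weights = design_weights(stratum_sizes, disproportionate)

print("Proportionate:   ", proportionate, proportionate_weights)
print("Disproportionate:", disproportionate, disproportionate_weights)
```

In the proportionate design every weight is the same, which is what "self weighting" means; in the disproportionate design the weights differ by stratum and must be applied when estimating population totals or means.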

Disproportionate Stratified Sample

Stratified Random Sampling
Stratified random sample – A method of
sampling obtained by (1) dividing the
population into subgroups based on one or
more variables central to our analysis and
(2) then drawing a simple random sample
from each of the subgroups
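
The two steps of the definition map directly onto code: group the population by the stratifying variable, then draw a simple random sample within each group. The population, the `region` variable, and the allocation below are all hypothetical and serve only to illustrate the procedure.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, allocation, rng):
    """Draw a simple random sample of the allocated size from each stratum."""
    # Step 1: divide the population into subgroups.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # Step 2: simple random sample (without replacement) within each subgroup.
    return {name: rng.sample(strata[name], n) for name, n in allocation.items()}


# Hypothetical population: (member id, region) pairs.
population = (
    [(i, "urban") for i in range(600)]
    + [(i, "suburban") for i in range(600, 900)]
    + [(i, "rural") for i in range(900, 1000)]
)

rng = random.Random(11)  # fixed seed: an assumption made for reproducibility
allocation = {"urban": 60, "suburban": 30, "rural": 10}

sample_by_stratum = stratified_sample(
    population, lambda unit: unit[1], allocation, rng
)

for name, members in sample_by_stratum.items():
    print(f"{name}: {len(members)} sampled")
```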