Unit 4b- Hypothesis testing and confidence intervals (Slides - up to slide 17).pdf

DevangshuMitra2 29 views 31 slides Sep 03, 2024

About This Presentation

Hypothesis Testing


Slide Content

Inferential Statistics - Fundamentals
UNIT 4 - PART II
OPRE 6359
1

Introduction
From what we have learned, we know that we can use probability
distributions (Unit 3) to make probability statements about a random variable X.
We also learned how to make probability statements about
sample statistics such as the sample mean X̄ (Unit 4).
In most cases, the population parameters are unknown to us. We
must rely on sampling distributions to draw conclusions about the
population.
2

Statistical Inference
There are two types of statistical inference:
-Estimation (Confidence Intervals)
-Hypothesis testing
3

Confidence Interval of the Mean (σ known)
We know that our sample mean x̄ is somewhat close to the
population mean μ. If we were building a range of plausible values
for the mean, we would expect that range to be narrow when the
sample size is large, because a large amount of data
should produce a statistic that is very close to the true value of μ.
4

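Confidence Interval of the Mean (σ known)
If our data are normal or our sample is large enough, then we know
that
$$\bar{X} \sim N\left(\mu,\ \frac{\sigma^2}{n}\right)$$
(that is, X̄ is normal with mean μ and standard deviation σ/√n), or at least that this is a good approximation. With some
rearranging, we get
$$P\left(z_{\alpha/2} \le \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le z_{1-\alpha/2}\right) = 1-\alpha$$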
5

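Confidence Interval of the Mean (σ known)
For a 95% confidence interval, we need the 0.025 and
0.975 quantiles of the standard normal distribution, which are z0.025 = −1.96 and z0.975 = 1.96. Thus we get
$$P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
giving us a 95% confidence interval for μ of
$$\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$
In general, the formula is
$$\bar{x} \pm z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$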
6

Interpretation
Thus, over repeated sampling, 100(1−α)% of the resulting intervals
will contain the population mean μ, but we don’t know whether the
interval we have observed is one that contains μ or not.
In this example, we tolerate a 5% chance of error (the significance level, α = 0.05).
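The repeated-sampling interpretation can be checked with a small simulation. The sketch below assumes a hypothetical normal population (the values of mu, sigma and n are illustrative and not from the slides), builds a z-based interval from each of 10,000 samples, and records how often the interval contains the true mean.

```r
# Coverage simulation for the z-based interval (sigma treated as known).
# mu, sigma and n are made-up illustrative values, not data from the slides.
set.seed(1)
mu <- 50
sigma <- 10
n <- 25
alpha <- 0.05
z <- qnorm(1 - alpha / 2)            # about 1.96 for a 95% interval
half_width <- z * sigma / sqrt(n)    # same for every sample because sigma is known

covers <- replicate(10000, {
  x <- rnorm(n, mean = mu, sd = sigma)
  # TRUE when this sample's interval contains the true mean
  (mean(x) - half_width <= mu) && (mu <= mean(x) + half_width)
})

coverage <- mean(covers)             # proportion of intervals that contain mu
coverage
```

The proportion should land close to 0.95, while any single interval either contains μ or does not.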
7

Confidence Interval of the Mean (σ unknown)
A more realistic assumption is that if we don’t know the
population mean μ, we also do not know the population standard deviation σ, so we will use the sample standard deviation s
instead.
We want to replace σ with s, but the sample standard deviation s is
also a random variable, and incorporating it into the standardization might
affect the distribution of the resulting statistic.
This comes with a penalty and a departure from the normal
distribution as we know it.
8

Confidence Interval of the Mean (σ unknown)
The resulting random variable has a t-distribution
with n−1 degrees of freedom. However, as the sample size increases and s becomes a more reliable estimator of σ, this
penalty should become smaller.
9

Confidence Interval of the Mean (σ unknown)
Notice that as the sample size increases, the t-distribution gets closer and closer to the normal distribution. The formula is:
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
Degrees of freedom: df = n − 1
Remember that this formula is valid if the sample observations came from a population with a normal distribution, or if the sample size is large enough for the Central Limit Theorem to imply that X̄ is approximately normally distributed.
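To see the t-distribution approaching the normal, one can compare the 0.975 quantiles for increasing degrees of freedom. This is a small sketch using base R's qt() and qnorm(); the particular degrees of freedom are arbitrary illustrative choices.

```r
# 0.975 quantiles of the t-distribution for several degrees of freedom,
# compared with the standard normal quantile.
dfs <- c(5, 10, 30, 100, 1000)       # arbitrary illustrative choices
t_quantiles <- qt(0.975, df = dfs)
z_quantile <- qnorm(0.975)

comparison <- data.frame(df = dfs,
                         t = round(t_quantiles, 3),
                         z = round(z_quantile, 3))
comparison
```

The t quantile is always larger than the normal one (the penalty for estimating σ), and the gap shrinks as df grows.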
10

Confidence Interval of the Mean (σ unknown)
So the formula that we need uses the sample standard deviation
and the t-distribution to approximate the behavior of this random variable:
$$\bar{x} \pm t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$
11

Example with R
Suppose we are interested in calculating a 95% confidence interval
for the mean hotel expenditure in Manhattan on New Year’s Eve. We collect a random sample of 40 bills (large enough for
the CLT to be applicable) and observe the following data:
hotels <- data.frame(bills = c(306, 446, 276, 235, 295, 302,
  374, 339, 624, 266, 497, 384, 429, 497, 224, 157, 248, 349, 388,
  391, 266, 230, 621, 314, 344, 413, 267, 380, 225, 418, 257, 466,
  230, 548, 277, 354, 271, 369, 275, 272))
xbar <- mean(hotels$bills)   # sample mean
s <- sd(hotels$bills)        # sample standard deviation
cbind(xbar, s)
      xbar        s
[1,] 345.6 108.8527
12

To Assess Normality (Graphically)
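library(dplyr)    # provides %>% and mutate()
library(ggplot2)
# Normal density with the sample mean and sd, drawn behind the histogram
normal.data <- data.frame(bills = seq(100, 700, length = 1000)) %>%
  mutate(y = dnorm(bills, mean = xbar, sd = s))
ggplot() + labs(y = 'density') +
  geom_area(data = normal.data, aes(x = bills, y = y), fill = 'pink') +
  geom_histogram(data = hotels, aes(x = bills, y = after_stat(density)),
                 binwidth = 30, alpha = .6)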
13

Calculate Confidence Interval
qt(.975, df=39)
[1] 2.022691
345.6-34.8
[1] 310.8
345.6+34.8
[1] 380.4
We are 95% confident that the true mean μ is in this interval.
The process that produced this interval yields intervals
that contain the mean μ 95% of the time, but we cannot know with certainty whether this particular interval contains the
population mean.
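The margin of 34.8 used above comes from multiplying the t quantile by the standard error s/√n. The sketch below repeats the hotel data so that it runs on its own, computes the interval by hand, and compares it with the interval reported by base R's t.test(); the variable names are chosen here for illustration.

```r
# Hotel bills from the example, repeated so the block is self-contained.
bills <- c(306, 446, 276, 235, 295, 302,
           374, 339, 624, 266, 497, 384, 429, 497, 224, 157, 248, 349, 388,
           391, 266, 230, 621, 314, 344, 413, 267, 380, 225, 418, 257, 466,
           230, 548, 277, 354, 271, 369, 275, 272)

n <- length(bills)
xbar <- mean(bills)
s <- sd(bills)

t_crit <- qt(0.975, df = n - 1)      # 0.975 quantile of t with 39 df
margin <- t_crit * s / sqrt(n)       # half-width of the interval
ci_manual <- c(lower = xbar - margin, upper = xbar + margin)

# t.test() reports the same interval in its conf.int component.
ci_builtin <- t.test(bills, conf.level = 0.95)$conf.int

round(margin, 1)
round(ci_manual, 1)
```

Both routes should agree. In practice the built-in interval is less error-prone than assembling the pieces by hand.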
14

Bootstrapping - Introduction
The bootstrap is a powerful, computer-based method for statistical
inference that does not rely on many assumptions. We make no distributional assumptions about where the data came
from.
We treat the approximate population as an infinite number of copies of
our sample data, so sampling from the approximate population
is equivalent to sampling with replacement from our sample data.
15

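Bootstrap Method - in R
library(dplyr)    # provides %>% and summarise()
library(ggplot2)
# Resample the data with replacement 10000 times, keeping each resample's mean
SampDist <- mosaic::do(10000) * {
  mosaic::resample(hotels) %>%
    summarise(xbar = mean(bills))
}
ggplot(SampDist, aes(x = xbar, y = after_stat(density))) +
  geom_histogram(col = "lightblue")
# Percentile bootstrap interval: the middle 95% of the resampled means
quantile(SampDist$xbar,
         probs = c(0.025, 0.975))
    2.5%    97.5%
312.8250 379.9256
16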
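For readers without the mosaic package, the same percentile bootstrap can be sketched in base R with sample() and replicate(). This is an alternative illustration, not the method used on the slide; the seed is arbitrary, so the endpoints will differ slightly from those shown above.

```r
# Base-R percentile bootstrap for the mean hotel bill.
set.seed(42)                          # arbitrary seed, for reproducibility only
bills <- c(306, 446, 276, 235, 295, 302,
           374, 339, 624, 266, 497, 384, 429, 497, 224, 157, 248, 349, 388,
           391, 266, 230, 621, 314, 344, 413, 267, 380, 225, 418, 257, 466,
           230, 548, 277, 354, 271, 369, 275, 272)

# Each replicate draws 40 bills with replacement and records the mean.
boot_means <- replicate(10000, mean(sample(bills, replace = TRUE)))

# The middle 95% of the bootstrap means gives the percentile interval.
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))
boot_ci
```

The result should be close to both the mosaic output and the t-based interval computed earlier.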

Sample size selection
Sometimes a researcher wants to determine the sample size
needed to achieve a specific margin of error. From the confidence interval formulas, we can derive an easy formula.
Let the margin of error, which we denote ME, be the desired half-width
of the confidence interval. To do this calculation, we must
also have some estimate of the population standard deviation σ.
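The resulting formula is not shown in the extracted slides. A standard way to obtain it, assuming the z-based interval, is to set ME = z·σ/√n and solve for n, which gives n = (z·σ/ME)², rounded up. The function name and the inputs below (a target margin of 25 and a guess of σ ≈ 108.85 borrowed from the hotel sample) are illustrative assumptions.

```r
# Sample size needed for a given margin of error, from ME = z * sigma / sqrt(n).
# sample_size_for_mean is a hypothetical helper written for this sketch.
sample_size_for_mean <- function(me, sigma, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)   # two-sided normal quantile
  ceiling((z * sigma / me)^2)            # round up to a whole observation
}

# Illustrative inputs: margin of 25 with sigma guessed at 108.85.
n_needed <- sample_size_for_mean(me = 25, sigma = 108.85)
n_needed
```

Because n grows with the square of σ/ME, halving the margin of error roughly quadruples the required sample size.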
17

Hypothesis Testing
The best way to formally test a claim and to identify evidence to
support or reject that claim is by using hypothesis testing.
After identifying the question of interest, the steps for hypothesis
testing are:
1. Set up the hypotheses in one of three possible ways: two-tailed, upper-tailed,
or lower-tailed.
2. Define the significance level, α, commonly set at 5%.
18

Hypothesis Testing
3. Perform a test of the sample statistic using the t-distribution
(equivalent to the normal as the sample size increases).
4. Use the p-value (or critical value) to make a decision.
5. Make a conclusion relevant to the problem of interest.
19

Hypothesis as competing statements
We will label the hypothesis being tested as H0, which we often
refer to as the “null hypothesis.”
The alternative hypothesis, which we’ll denote H1, should be the
opposite of the null hypothesis.
They are competing statements that together cover all the possible
outcomes of the problem. Thus, they are mutually exclusive and
exhaustive.
20

Type I and Type II Errors
There are two ways to make a mistake in your conclusion.
A type I error is to reject H0 when it is true. This error is
controlled by α. We can think of α as the probability of
rejecting H0 when it is true.
However, there is a trade-off. If α is very small, then we will more often fail to
reject H0 in cases where H0 is not true. This is called a type II error,
and we define β as the probability of failing to reject H0 when
it is false.
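Both error rates can be estimated by simulation. The sketch below assumes a hypothetical lower-tailed test of H0: μ ≥ 100 on normal data (mu0, sigma, n and the alternative mean of 92 are made-up values), and counts how often a one-sample t-test rejects when H0 is true and when it is false.

```r
# Simulated type I and type II error rates for a lower-tailed one-sample t-test.
# mu0, sigma, n and the alternative mean are illustrative assumptions.
set.seed(123)
alpha <- 0.05
mu0 <- 100
sigma <- 15
n <- 25

# Proportion of simulated samples in which H0: mu >= mu0 is rejected.
reject_rate <- function(true_mean, reps = 5000) {
  mean(replicate(reps, {
    x <- rnorm(n, mean = true_mean, sd = sigma)
    t.test(x, mu = mu0, alternative = "less")$p.value < alpha
  }))
}

type1_rate <- reject_rate(true_mean = 100)      # H0 true: rejecting is a type I error
type2_rate <- 1 - reject_rate(true_mean = 92)   # H0 false: not rejecting is a type II error

c(type1 = type1_rate, type2 = type2_rate)
```

The type I rate should sit near α. Rerunning with a smaller α lowers it but raises the type II rate, which is the trade-off described above.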
21

Concepts of Hypothesis Testing
This trade off between type I and type II errors can be seen by
examining our legal system.
A person is presumed innocent until proven guilty. So the
hypotheses being tested in a court of law are
H0: Defendant is innocent
H1: Defendant is guilty
22

Concepts of Hypothesis Testing
There are two possible outcomes:
◦ Convict the defendant → Rejecting the null hypothesis in
favor of the alternative hypothesis.
◦ Acquit the defendant → Not rejecting the null hypothesis in
favor of the alternative hypothesis.

Concepts of Hypothesis Testing
                                      Truth about the Defendant
Jury Decision                       | H0 True (Defendant is Innocent) | H0 False (Defendant is Guilty)
Do Not Reject H0 (Acquit Defendant) | Correct Decision                |
Reject H0 (Convict Defendant)       |                                 | Correct Decision

Concepts of Hypothesis Testing
                                      Truth about the Defendant
Jury Decision                       | H0 True (Defendant is Innocent) | H0 False (Defendant is Guilty)
Do Not Reject H0 (Acquit Defendant) | Correct Decision                | Type II Error (β)
Reject H0 (Convict Defendant)       | Type I Error (α)                | Correct Decision
25

Type I or Type II Error - Tradeoff
Our legal system operates under the rule that it is worse to make
a type I mistake (concluding guilty when innocent) than to make a type II mistake (concluding not guilty when guilty).
Critically, when a jury finds a person “not guilty,” they are not
saying that the defense team has proven that the defendant is
innocent, but rather that the prosecution has not proven the
defendant guilty.
26

Type I or Type II Error - Tradeoff
This same idea applies to scientific analysis with the α-level.
Typically we decide that it is better to make a type II error than a type I error. An experiment that results in a large p-value does not prove
that H0 is true, only that there is insufficient evidence to
conclude H1.
If we still suspect that H1 is true, then we must repeat the
experiment with a larger sample size.
A larger sample size makes it possible to detect smaller
differences.
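The effect of sample size on the ability to detect a difference can be illustrated with power.t.test() from base R's stats package. The true difference of 5 and standard deviation of 15 below are hypothetical values chosen only to show the pattern.

```r
# Power of a one-sided, one-sample t-test at several sample sizes.
# delta (true difference) and sd are illustrative assumptions.
sample_sizes <- c(10, 28, 50, 100, 200)

power_by_n <- sapply(sample_sizes, function(n) {
  power.t.test(n = n, delta = 5, sd = 15, sig.level = 0.05,
               type = "one.sample", alternative = "one.sided")$power
})

power_table <- data.frame(n = sample_sizes, power = round(power_by_n, 3))
power_table
```

Power is 1 − β, so the rising values show the type II error rate falling as n increases while α stays fixed at 0.05.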
27

P-Value - Definition
We can think of the p-value as a measure of how compatible the data are
with the null hypothesis: it is the probability, assuming H0 is true, of observing a result at least as extreme as the one in our sample.
• If the p-value is small, the data are hard to reconcile with the null
hypothesis.
• Conversely, if the p-value is large, then the data are consistent with
the null hypothesis (although this does not prove it).
• If the p-value drops below a specified threshold (call it α), we
will reject the null hypothesis.
28

Hypothesis testing example
A light bulb company advertises that its bulbs last for 1000
hours. Consumers will complain if the bulbs last less time, but no action will be taken if they last longer. Therefore Consumer
Reports might perform a test and would consider the following
hypotheses:
H0: μ ≥ 1000
H1: μ < 1000
29

Hypothesis testing example
Suppose we perform an experiment with n = 28 light bulbs and
observe x̄ = 970 and s = 64 hours. Using α = 0.05, our test
statistic is
$$t = \frac{970 - 1000}{64/\sqrt{28}} = \frac{-30}{12.09} = -2.48$$
# pt(-2.48, df=27)   # no graph
mosaic::xpt(-2.48, df=27)
[1] 0.009834735
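The whole calculation can also be carried out from the summary statistics in base R. This sketch recomputes the standard error, the test statistic and the lower-tail p-value, and adds the equivalent critical-value check; the variable names are chosen here for illustration.

```r
# One-sample, lower-tailed t-test from summary statistics (light bulb example).
xbar <- 970      # sample mean (hours)
mu0 <- 1000      # value of the mean under H0
s <- 64          # sample standard deviation
n <- 28          # sample size
alpha <- 0.05

se <- s / sqrt(n)                    # standard error of the mean
t_stat <- (xbar - mu0) / se          # test statistic
p_value <- pt(t_stat, df = n - 1)    # lower-tail area, since H1 is mu < 1000
t_crit <- qt(alpha, df = n - 1)      # reject H0 when t_stat falls below this
reject_h0 <- p_value < alpha

round(c(se = se, t = t_stat, p = p_value, t_crit = t_crit), 4)
reject_h0
```

The p-value rule and the critical-value rule always lead to the same decision: the p-value is below α exactly when the test statistic falls below the critical value.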
30

Hypothesis testing - conclusion
The p-value, 0.01, is less than 0.05.
Therefore, we can reject the null hypothesis. There is evidence that supports
the alternative.
Based on the evidence from this sample, consumers are receiving light bulbs that last less than 1000 hours on average. Based on this
performance, the company could face complaints.
31