Statistical Inference and Hypothesis Testing


About This Presentation

Statistical inference and hypothesis testing are core tools in data analysis. Inference draws conclusions about populations from sample data, while hypothesis testing evaluates assumptions using probability. Together, they guide decision-making by assessing evidence, estimating parameters, and deter...


Slide Content

Statistical Inference and Hypothesis Testing
Lecturer: Waweru Nyamu

Introduction to Statistical Inference
Statistical inference is the process of drawing conclusions about a population based on sample data. It allows researchers to make generalizations from a smaller sample to a larger population.
Key concepts:
- Population: the entire group of interest
- Sample: a subset of the population
- Parameter: a numerical characteristic of a population
- Statistic: a numerical characteristic of a sample
Example in health research: if we measure the average blood pressure of 100 hypertensive patients, we can infer the average blood pressure of all hypertensive patients in the population.

Test of a Hypothesis
A hypothesis test is a procedure leading to a decision about a particular hypothesis. Hypothesis-testing procedures rely on the information in a random sample from the population of interest: if this information is consistent with the hypothesis, we conclude that the hypothesis is true; if it is inconsistent, we conclude that the hypothesis is false.
What is statistical testing? Also called "hypothesis testing", it is the process of inferring from your data whether an observed difference is likely to represent chance variation or a real difference. (It does NOT address bias, confounding, or investigator error!) For two-by-two table data, the result is influenced by:
- The number of subjects or observations in the study
- The size of the difference in results between groups

Hypothesis Testing
Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental methods used at the data-analysis stage of a comparative experiment, in which the engineer is interested, for example, in comparing the mean of a population to a specified value.
Statistical hypotheses: suppose we are interested in the burning rate of a solid propellant used to power aircrew escape systems. Burning rate is a random variable that can be described by a probability distribution, and our interest focuses on the mean burning rate (a parameter of this distribution). Specifically, we want to decide whether or not the mean burning rate is 50 centimeters per second.

Step-by-Step Hypothesis Testing Procedure
1. State the hypotheses: H₀: μ₁ = μ₂ (no difference); H₁: μ₁ ≠ μ₂ (difference exists)
2. Choose a significance level: typically α = 0.05
3. Select the appropriate test: t-test, z-test, chi-square, etc.
4. Compute the test statistic using the appropriate formula
5. Determine the p-value: the probability of the observed results if H₀ is true
6. Make a decision: reject H₀ if p ≤ α
7. Draw a conclusion in the context of the research question
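To make the procedure concrete, here is a minimal sketch in Python of steps 1-7 for a two-sample comparison, assuming NumPy and SciPy are available; the blood-pressure data and group means are invented for illustration, not taken from the slides.

```python
# Minimal hypothesis-testing sketch (hypothetical data; assumes SciPy/NumPy).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
treatment = rng.normal(loc=128, scale=12, size=50)   # hypothetical treated group
control = rng.normal(loc=134, scale=12, size=50)     # hypothetical control group

alpha = 0.05                                          # step 2: significance level
t_stat, p_value = stats.ttest_ind(treatment, control)  # steps 3-4: two-sample t-test

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")         # step 5: p-value
if p_value <= alpha:                                  # step 6: decision rule
    print("Reject H0: the group means appear to differ.")   # step 7: conclusion
else:
    print("Fail to reject H0: no evidence of a difference.")
```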

Types of Statistical Inference

| Type | Purpose | Example |
|---|---|---|
| Estimation | Estimate population parameters using sample statistics | Estimating average hospital stay duration |
| Hypothesis testing | Test assumptions about population parameters | Testing whether a new drug reduces blood pressure more than placebo |

Hypothesis
Null Hypothesis (H₀)
- A null hypothesis is a statement of the status quo, one of no difference or no effect. If the null hypothesis is not rejected, no changes will be made.
- The null hypothesis refers to a specified value of the population parameter (e.g., μ), not a sample statistic (e.g., x̄).
- It is a statement of "no effect" or "no difference", assuming any observed difference is due to chance.
- It is the hypothesis we test against.
- A null hypothesis may be rejected, but it can never be accepted based on a single test.
Alternative Hypothesis (H₁ or Hₐ)
- An alternative hypothesis is one in which some difference or effect is expected.
- It is a statement contradicting the null hypothesis: what the researcher wants to prove.
- It suggests a real effect or difference.

Hypothesis Testing Framework
A statistical method that uses sample data to evaluate a hypothesis about a population parameter.
Steps in hypothesis testing:
1. State the hypotheses
2. Set the significance level (α)
3. Select the appropriate test
4. Compute the test statistic
5. Make a decision
6. Draw a conclusion

Examples in Health Research

| Research Question | Null Hypothesis (H₀) | Alternative Hypothesis (H₁) |
|---|---|---|
| Does the new drug reduce BP? | Drug has no effect on BP | Drug reduces BP |
| Is prevalence different between genders? | Prevalence is equal | Prevalence differs |
| Does exercise improve outcomes? | Exercise has no effect | Exercise improves outcomes |

Directional vs. non-directional:
- Two-tailed: H₁: μ₁ ≠ μ₂ (difference in either direction)
- One-tailed: H₁: μ₁ > μ₂ or H₁: μ₁ < μ₂ (specific direction)

Types of Errors in Hypothesis Testing

| Error Type | Definition | Probability | Consequence in Health Context |
|---|---|---|---|
| Type I error (α) | Rejecting a true H₀ (false positive) | α (usually 0.05) | Concluding a treatment works when it doesn't |
| Type II error (β) | Failing to reject a false H₀ (false negative) | β | Missing a truly effective treatment |

Relationship:
- Decreasing α increases β, and vice versa
- A larger sample size reduces both errors

Type I Error (False Positive)
- A Type I error occurs when the null hypothesis is rejected but is in fact true: rejecting the null hypothesis when it is actually true.
- Symbol: α (alpha), an arbitrary cut-off level for the p-value (e.g., 0.05). If the p-value < α, reject the null hypothesis.
- Probability: the significance level of the test. The p-value itself is the probability of observing a difference as great as or greater than the observed difference if the null hypothesis were true.
- Concept: finding an effect that isn't really there. With α = 0.05, the likelihood of erroneously rejecting the null hypothesis is 5%.
- Medical example: H₀: the patient does not have the disease. Type I error: false positive, a healthy person diagnosed as sick.

Type II Error (False Negative)
- A Type II error occurs when the null hypothesis is accepted but is actually false: failing to reject the null hypothesis when the alternative is true.
- Symbol: β (beta)
- Concept: missing a real effect. Medical example: false negative, a sick person diagnosed as healthy.
Power of a Test
- The probability of correctly rejecting a false null hypothesis: power = 1 − β
- Power = the probability of rejecting H₀ when Hₐ is in fact true in the underlying population
- Traditionally, β is set at 0.2, so power = 80%
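Because power is the probability of rejecting H₀ when a real effect exists, it can be approximated by simulation. The sketch below assumes NumPy and SciPy; the true mean difference (5 units), standard deviation, and sample size are illustrative assumptions, not values from the slides.

```python
# Simulation-based estimate of power (1 - beta) for a two-sample t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, n_sims = 0.05, 30, 2000
rejections = 0
for _ in range(n_sims):
    a = rng.normal(100, 10, n)   # group A: H0 is false by construction
    b = rng.normal(105, 10, n)   # group B: true means differ by 5 units
    if stats.ttest_ind(a, b).pvalue <= alpha:
        rejections += 1

# Fraction of simulated studies that correctly rejected H0.
print(f"Estimated power ≈ {rejections / n_sims:.2f}")
```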

Medical Context
False Positive (Type I Error)
Situation: a healthy person is diagnosed with disease.
Implications:
- Psychological trauma: anxiety, depression, fear
- Unnecessary treatments: chemotherapy, surgery, medications
- Physical harm: treatment side effects, surgical complications
- Financial costs: medical bills, lost wages
- Social stigma: changed relationships, insurance issues
Example: a false positive cancer diagnosis leads to unnecessary chemotherapy, causing permanent organ damage.
False Negative (Type II Error)
Situation: a sick person is told they're healthy.
Implications:
- Delayed treatment: disease progression, reduced survival chances
- False reassurance: the person ignores worsening symptoms
- Increased treatment complexity: more aggressive treatments later
- Public health risk: infectious diseases can spread
- Lost opportunity: the early intervention window is missed
Example: a false negative COVID-19 test leads to an infected person spreading the virus to vulnerable populations.

Scientific Research
False Positive
Situation: finding an effect that doesn't exist.
Implications:
- Wasted resources: other researchers pursue dead ends
- Misinformation: false theories enter the literature
- Policy mistakes: regulations based on incorrect findings
- Reputation damage: for individual researchers and the field
- Public mistrust: erosion of confidence in science
Example: a false positive in vaccine-autism research led to widespread vaccine hesitancy and disease outbreaks.
False Negative
Situation: missing a real discovery.
Implications:
- Delayed progress: beneficial discoveries postponed
- Lost opportunities: treatments and technologies delayed
- Career impacts: promising research abandoned
- Knowledge gaps: incomplete understanding of phenomena
- Funding misallocation: resources directed away from promising areas
Example: initial negative results for penicillin almost delayed one of medicine's most important discoveries.

The Jury Analogy
This classic analogy makes these errors intuitive.
Courtroom scenario:
- Null hypothesis (H₀): the defendant is innocent
- Alternative hypothesis (H₁): the defendant is guilty
- Test: the trial evidence
- Decision: the jury's verdict
The four possible outcomes:

| Verdict | Defendant Innocent (H₀ true) | Defendant Guilty (H₀ false) |
|---|---|---|
| Acquit (fail to reject H₀) | Correct decision | Type II error: guilty person goes free |
| Convict (reject H₀) | Type I error: innocent person convicted | Correct decision |

Criminal Justice System
False Positive
Situation: an innocent person is convicted.
Implications:
- Wrongful imprisonment: loss of freedom, sometimes for decades
- Destroyed life: career, relationships, and reputation ruined
- Psychological damage: PTSD, depression, anxiety
- Financial ruin: legal costs, lost income
- Real perpetrator free: continues to commit crimes
Example: DNA evidence later exonerates someone who served 20 years for a crime they didn't commit.
False Negative
Situation: a guilty person is acquitted.
Implications:
- Danger to society: the criminal remains free to reoffend
- Lack of closure: victims feel justice wasn't served
- Erosion of trust: the public loses faith in the justice system
- Deterrent failure: weakens crime prevention
- Multiple victims: future crimes could have been prevented
Example: a violent offender is acquitted on technicalities and goes on to commit more serious crimes.

Level of Significance (α)
The probability of making a Type I error: rejecting a true null hypothesis. The Type I error probability is also called the significance level, the α-error, or the size of the test.
Common significance levels:
- α = 0.05 (5% risk): most common in health research
- α = 0.01 (1% risk): more conservative
- α = 0.10 (10% risk): less conservative; exploratory studies
Interpretation of p-values:
- p ≤ α: reject H₀ (statistically significant)
- p > α: fail to reject H₀ (not statistically significant)

What Is a Point Estimate?
A point estimate is a single value used to estimate a population parameter based on sample data: your "best guess" for a population value. The key idea is that you use a sample statistic to estimate an unknown population parameter. Estimation is the process of making a quantitative inference about a population from a sample.
Common point estimates:
- Sample mean (x̄) → population mean (μ). Formula: x̄ = (Σxᵢ)/n. Use: estimates the average value in the population. Example: the average height of 100 students estimates the average height of all students.
- Sample proportion (p̂) → population proportion (P). Formula: p̂ = x/n, where x = number of successes. Use: estimates the proportion of a characteristic in the population. Example: 45 of 200 respondents prefer Brand A (22.5%), which estimates the true population preference.
- Sample variance (s²) → population variance (σ²). Formula: s² = Σ(xᵢ − x̄)²/(n − 1). Use: estimates the variability in the population.
- Sample standard deviation (s) → population standard deviation (σ). Formula: s = √[Σ(xᵢ − x̄)²/(n − 1)]. Use: estimates the spread in the population.
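As a quick illustration, these point estimates are one-liners in Python. The height values below are made up; the proportion reuses the slide's Brand A figures (45 of 200).

```python
# Point estimates from sample data (hypothetical heights; assumes NumPy).
import numpy as np

heights = np.array([160, 172, 168, 181, 175, 169])  # hypothetical sample (cm)

x_bar = heights.mean()       # sample mean -> estimates the population mean
s2 = heights.var(ddof=1)     # sample variance, n-1 denominator -> estimates sigma^2
s = heights.std(ddof=1)      # sample standard deviation -> estimates sigma

p_hat = 45 / 200             # sample proportion = 0.225 (Brand A example)
print(x_bar, s2, s, p_hat)
```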

What Is an Interval Estimate?
An interval estimate is a range of values used to estimate a population parameter. Unlike a point estimate (a single number), an interval estimate provides:
- A range of plausible values
- A measure of precision/uncertainty
- A confidence level associated with the estimate
Key idea: "We are C% confident that the true population parameter lies between the lower bound and the upper bound."
The most common type: confidence intervals. Confidence intervals are the most frequently used interval estimates. Their structure is:
Point estimate ± margin of error
Where:
- Point estimate: your best single guess (sample mean, proportion, etc.)
- Margin of error: the amount of uncertainty in your estimate
- Confidence level: the probability that the interval contains the true parameter

Common CIs and Critical Values

| Confidence Level | α (alpha) | α/2 | z* (Normal) |
|---|---|---|---|
| 90% | 0.10 | 0.05 | 1.645 |
| 95% | 0.05 | 0.025 | 1.960 |
| 99% | 0.01 | 0.005 | 2.576 |

Note: for the t-distribution, critical values also depend on the degrees of freedom.
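If SciPy is available, the z* column can be reproduced from the normal quantile function, which is a handy check when no table is at hand (a sketch, not the slide's own method).

```python
# Reproducing the z* critical values with the normal quantile function.
from scipy import stats

for conf in (0.90, 0.95, 0.99):
    alpha = 1 - conf
    z_star = stats.norm.ppf(1 - alpha / 2)   # upper alpha/2 quantile
    print(f"{conf:.0%}: z* = {z_star:.3f}")  # 1.645, 1.960, 2.576

# t* analogue for a given df: stats.t.ppf(1 - alpha / 2, df)
```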

Types of Confidence Intervals
Confidence Interval for a Mean
When the population standard deviation (σ) is known:
x̄ ± z* (σ/√n)
When the population standard deviation (σ) is unknown (more common):
x̄ ± t* (s/√n)
Where:
- x̄ = sample mean
- z* = critical value from the standard normal distribution (e.g., 1.96 for 95%)
- t* = critical value from the t-distribution
- σ = population standard deviation (known)
- s = sample standard deviation
- n = sample size

Types of Confidence Intervals (continued)
Confidence Interval for a Proportion
p̂ ± z* √[p̂(1 − p̂)/n]
Where:
- p̂ = sample proportion
- z* = critical value from the standard normal distribution (e.g., 1.96 for 95%)
- n = sample size

Step-by-Step Calculation Examples
Example 1: CI for a Mean (σ Unknown)
A sample of 25 students has:
- Mean test score: x̄ = 82
- Standard deviation: s = 12
- Confidence level: 95%
Step 1: Identify the appropriate distribution. σ is unknown and n = 25, so use the t-distribution with degrees of freedom = n − 1 = 24.
Step 2: Find the critical value. For a 95% CI, α = 0.05 and α/2 = 0.025; t* for df = 24, α/2 = 0.025 is 2.064.
Step 3: Calculate the standard error. SE = s/√n = 12/√25 = 12/5 = 2.4.
Step 4: Calculate the margin of error. ME = t* × SE = 2.064 × 2.4 ≈ 4.95.
Step 5: Construct the interval. 82 ± 4.95 = (77.05, 86.95).
Interpretation: we are 95% confident that the true mean test score for all students is between 77.05 and 86.95.
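Example 1 can be cross-checked in a few lines, assuming SciPy; `stats.t.interval` wraps the same x̄ ± t*·(s/√n) computation done by hand above.

```python
# Cross-checking Example 1: t-based CI for a mean with sigma unknown.
import math
from scipy import stats

n, x_bar, s = 25, 82.0, 12.0
se = s / math.sqrt(n)                            # standard error = 2.4
lo, hi = stats.t.interval(0.95, df=n - 1, loc=x_bar, scale=se)
print(f"95% CI: ({lo:.2f}, {hi:.2f})")           # ≈ (77.05, 86.95), as above
```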

Step-by-Step Calculation Examples
Example 2: CI for a Proportion
In a survey of 400 voters, 220 support a candidate.
- Sample proportion: p̂ = 220/400 = 0.55
- Confidence level: 95%
Step 1: Verify conditions. n × p̂ = 400 × 0.55 = 220 ≥ 10 and n × (1 − p̂) = 400 × 0.45 = 180 ≥ 10, so the conditions for the normal approximation are met.
Step 2: Find the critical value. For a 95% CI, z* = 1.96.
Step 3: Calculate the standard error. SE = √[p̂(1 − p̂)/n] = √(0.55 × 0.45/400) = √0.00061875 ≈ 0.0249.
Step 4: Calculate the margin of error. ME = z* × SE = 1.96 × 0.0249 ≈ 0.0488.
Step 5: Construct the interval. 0.55 ± 0.0488 = (0.501, 0.599).
Interpretation: we are 95% confident that the true proportion of voters who support the candidate is between 50.1% and 59.9%.
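The same check for Example 2, assuming SciPy; the code reproduces the interval step by step.

```python
# Cross-checking Example 2: normal-approximation CI for a proportion.
import math
from scipy import stats

x, n = 220, 400
p_hat = x / n                                    # 0.55
z_star = stats.norm.ppf(0.975)                   # 1.96 for a 95% CI
se = math.sqrt(p_hat * (1 - p_hat) / n)          # ≈ 0.0249
me = z_star * se                                 # ≈ 0.0488
print(f"95% CI: ({p_hat - me:.3f}, {p_hat + me:.3f})")  # (0.501, 0.599)
```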

Standard Error of the Mean
The Standard Error of the Mean (SEM) measures how much the sample mean of your data is likely to differ from the true population mean. If you were to take many different samples from the same population, their means would vary; the SEM quantifies this variation.
- A smaller SEM means your sample mean is likely a more precise estimate of the population mean.
- A larger SEM means there is more uncertainty.
It's crucial not to confuse the SEM with the standard deviation (SD):
- Standard deviation (SD): measures the variation or dispersion of individual data points from the mean
- Standard error of the mean (SEM): measures the precision of the sample mean itself

Standard Error of the Mean
The most common formula for the SEM is:
SEM = s/√n
Where:
- s = sample standard deviation
- n = sample size (the number of observations in your sample)
Why the square root of n? As your sample size increases, the estimate of the mean becomes more precise. The relationship is governed by the square root: to halve the standard error, you need to quadruple your sample size. This is a key insight from the Central Limit Theorem.

Step-by-Step Calculation
1. Calculate the sample mean (x̄): add up all the data points in your sample and divide by the number of data points (n).
2. Calculate the sample standard deviation (s), which measures the spread of your data:
   - Find the difference between each data point and the mean
   - Square each of those differences
   - Sum all the squared differences
   - Divide this sum by (n − 1); this gives the sample variance, s²
   - Take the square root of the result
3. Compute the standard error (SEM): divide the sample standard deviation (s) by the square root of the sample size (n): SEM = s/√n

Example #1
Test scores: 78, 85, 92, 88, 75
Step 1: Calculate the mean (x̄). x̄ = (78 + 85 + 92 + 88 + 75)/5 = 418/5 = 83.6
Step 2: Calculate the standard deviation (s). First, find the squared differences from the mean:
- (78 − 83.6)² = (−5.6)² = 31.36
- (85 − 83.6)² = (1.4)² = 1.96
- (92 − 83.6)² = (8.4)² = 70.56
- (88 − 83.6)² = (4.4)² = 19.36
- (75 − 83.6)² = (−8.6)² = 73.96
Sum of squared differences = 31.36 + 1.96 + 70.56 + 19.36 + 73.96 = 197.2
Sample variance (s²) = 197.2/(5 − 1) = 197.2/4 = 49.3
Standard deviation (s) = √49.3 ≈ 7.02
Step 3: Calculate the SEM. SEM = s/√n = 7.02/√5 ≈ 7.02/2.236 ≈ 3.14
Interpretation: the standard error of the mean is approximately 3.14. This means we expect the sample mean of 83.6 to be about 3.14 points away from the true population mean, on average.
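The hand calculation above is easy to confirm numerically; a small sketch, assuming NumPy and SciPy:

```python
# Verifying Example #1 numerically (assumes NumPy and SciPy).
import numpy as np
from scipy import stats

scores = np.array([78, 85, 92, 88, 75])
s = scores.std(ddof=1)              # sample SD with n-1 denominator, ≈ 7.02
sem = s / np.sqrt(scores.size)      # ≈ 3.14
print(scores.mean(), round(s, 2), round(sem, 2))
print(stats.sem(scores))            # SciPy computes the same SEM directly
```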

Use in Confidence Intervals
The primary use of the standard error of a proportion (SEP) is to construct confidence intervals for the population proportion.
95% confidence interval formula: p̂ ± 1.96 × SEP
For the Brand A example (p̂ = 45/200 = 0.225, n = 200, SEP ≈ 0.0295):
- Lower bound: 0.225 − 1.96 × 0.0295 ≈ 0.167
- Upper bound: 0.225 + 1.96 × 0.0295 ≈ 0.283
We are 95% confident that the true proportion of people who prefer Brand A coffee in the population is between 16.7% and 28.3%.
Maximum standard error: the standard error is largest when p = 0.5, because p(1 − p) is maximized at 0.25 when p = 0.5. This is a useful conservative estimate when planning a study if you don't know what p might be.
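For completeness, here is the Brand A calculation in code; it assumes the earlier slide's figures (45 of 200 respondents) and uses only the standard library.

```python
# SE of a proportion and its 95% CI for the Brand A example.
import math

p_hat, n = 45 / 200, 200                    # 0.225, from the earlier slide
sep = math.sqrt(p_hat * (1 - p_hat) / n)    # ≈ 0.0295
lo, hi = p_hat - 1.96 * sep, p_hat + 1.96 * sep
print(f"SEP ≈ {sep:.4f}; 95% CI ≈ ({lo:.3f}, {hi:.3f})")  # ≈ (0.167, 0.283)

# Conservative planning bound: p = 0.5 maximizes p(1-p), so SEP <= 0.5/sqrt(n).
print(0.5 / math.sqrt(n))
```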

Summary of SEM

| Aspect | Description |
|---|---|
| Purpose | Measures the precision of the sample mean as an estimate of the population mean |
| Formula | SEM = s/√n |
| Key factor | Sample size (n): increasing n decreases the SEM, giving a more precise estimate |
| Use case | Essential for constructing confidence intervals and performing hypothesis tests about the population mean |

Confidence Interval
- The interval calculated from a random sample by a procedure which, if applied to an infinite number of random samples of the same size, would in 95% (or another specified level) of instances contain the true population value
- A range of values that quantifies the uncertainty around a point estimate of a measure, such as the proportion of children vaccinated or the effect of an exposure on disease
- Provides an interval estimate and reflects the precision (or imprecision) of the point estimate
- Example of a point estimate with its 95% confidence interval: vaccine coverage = 70% (95% CI = 65%–75%)
- The range of values that are compatible with the data under the standard interpretation of statistical significance

Confidence Interval
"Statistics means never having to say you're certain!" Relying on information from a sample will always lead to some level of uncertainty. A confidence interval is a range of values that tries to quantify this uncertainty: for example, a 95% CI means that under repeated sampling, 95% of such CIs would contain the true population parameter.
The point estimate and its confidence interval answer the questions: "What is the size of that treatment difference?" and "How precisely did this trial determine or estimate the treatment difference?"

Computing Confidence Intervals (CI)
General formula: (sample statistic) ± [(critical value for the confidence level) × (measure of sampling variability)]
- Sample statistic: the observed magnitude of the effect or association (e.g., odds ratio, risk ratio)
- Confidence level: varies (90%, 95%, 99%). For example, to construct a 95% CI, Z_{α/2} = 1.96
- Sampling variability: the standard error (SE) of the estimate is the measure of variability
Suppose α = 0.05. We cannot say "with probability 0.95 the parameter μ lies in the confidence interval." We only know that under repetition, 95% of the intervals will contain the true population parameter (μ); in 5% of cases it will not, and unfortunately we don't know in which cases this happens. That is why we say: "with confidence level 100(1 − α)%, μ lies in the confidence interval."
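The repeated-sampling interpretation can be demonstrated directly: draw many samples from a known population and count how often the 95% interval captures the true mean. A simulation sketch, assuming NumPy and SciPy, with an arbitrary population (μ = 50, σ = 10):

```python
# Coverage simulation: about 95% of 95% CIs should contain the true mean mu.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n, n_sims = 50.0, 10.0, 40, 5000
covered = 0
for _ in range(n_sims):
    sample = rng.normal(mu, sigma, n)
    se = sample.std(ddof=1) / np.sqrt(n)
    lo, hi = stats.t.interval(0.95, df=n - 1, loc=sample.mean(), scale=se)
    covered += (lo <= mu <= hi)              # does this interval capture mu?

print(f"Coverage ≈ {covered / n_sims:.3f}")  # close to 0.95
```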

Interpretation of CIs
Width of the confidence interval (CI):
- A narrow CI implies high precision
- A wide CI implies poor precision (usually due to inadequate sample size)
Does the interval contain a value that implies no change, no effect, or no association?
- CI for a difference between two means: does the interval include 0 (zero)?
- CI for a ratio (e.g., OR, RR): does the interval include 1?


What Do P-Values Stand For?
- "P" stands for probability: the tail-area probability of observing an effect as large as or larger than the observed effect (more extreme in the tails of the distribution), assuming the null hypothesis is true
- It measures the strength of the evidence against the null hypothesis; smaller p-values indicate stronger evidence against it
- Fisher suggested the 5% level (p < 0.05) could be used as a scientific benchmark for concluding that fairly strong evidence exists against H₀, but it was never intended as an absolute threshold
- Strength of evidence lies on a continuum: simply noting the magnitude of the p-value should suffice, and scientific context is critical
- By convention, p-values < 0.05 are often accepted as "statistically significant" in the medical literature, but this is an arbitrary cut-off

What Do P-Values Stand For? (continued)
- p < 0.05 is an arbitrary cut-point. Does it make sense to adopt a therapeutic agent because the p-value obtained in an RCT was 0.049, and at the same time ignore the results for another therapeutic agent because its p-value was 0.051? Hence it is important to report the exact p-value, not just "≤ 0.05" or "> 0.05".
- P-values give no indication of the clinical importance of the observed association: a very large study may yield a very small p-value based on a small difference of effect that may not be important when translated into clinical practice.
- It is therefore important to look at the effect size and confidence intervals.

P-Values versus CIs
The p-value answers the question: "Is there a statistically significant difference between the two treatments?"
The point estimate and its confidence interval answer the questions: "What is the size of that treatment difference?" and "How precisely did this trial determine or estimate the treatment difference?"

Confidence Intervals
A range of values that is likely to contain the true population parameter with a specified level of confidence.
Formula for a 95% CI for a mean:
x̄ ± Z_{α/2} (s/√n)
Where:
- x̄ = sample mean
- Z_{α/2} = 1.96 (for 95% confidence)
- s = sample standard deviation
- n = sample size
For proportions:
p̂ ± Z_{α/2} √[p̂(1 − p̂)/n]

Computing Confidence Intervals: Examples
Example 1: Mean Blood Pressure
- Sample mean (x̄) = 130 mmHg; sample SD (s) = 15 mmHg; sample size (n) = 100; confidence level = 95%
- 95% CI = x̄ ± 1.96 × (s/√n) = 130 ± 1.96 × 1.5 = 130 ± 2.94 = (127.06, 132.94)
- Interpretation: we are 95% confident that the true population mean blood pressure lies between 127.06 and 132.94 mmHg
Example 2: Disease Prevalence
- Sample proportion (p̂) = 0.25 (25%); sample size (n) = 400; confidence level = 95%
- 95% CI = p̂ ± 1.96 × √[p̂(1 − p̂)/n] = 0.25 ± 1.96 × 0.0217 ≈ 0.25 ± 0.0425 = (0.2075, 0.2925)
- Interpretation: we are 95% confident that the true disease prevalence in the population is between 20.75% and 29.25%
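Both examples can be re-derived in a couple of lines, assuming the large-sample z formulas used above:

```python
# Re-deriving the two worked examples with the z-based formulas.
import math

# Example 1: mean blood pressure (x_bar = 130, s = 15, n = 100).
me1 = 1.96 * 15 / math.sqrt(100)            # margin of error ≈ 2.94
print(130 - me1, 130 + me1)                 # (127.06, 132.94)

# Example 2: disease prevalence (p_hat = 0.25, n = 400).
me2 = 1.96 * math.sqrt(0.25 * 0.75 / 400)   # ≈ 0.0424
print(0.25 - me2, 0.25 + me2)               # ≈ (0.2076, 0.2924); matches the
                                            # 20.75%-29.25% above to rounding
```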

Interpreting Confidence Intervals

| Scenario | Interpretation |
|---|---|
| CI includes the null value | Result not statistically significant |
| CI excludes the null value | Result statistically significant |
| Narrow CI | Precise estimate (large sample, low variability) |
| Wide CI | Imprecise estimate (small sample, high variability) |

Relationship with hypothesis testing:
- If the 95% CI for a difference includes 0 → fail to reject H₀
- If the 95% CI for a difference excludes 0 → reject H₀

Applications in Health Research

| Application | Example |
|---|---|
| Clinical trials | Comparing treatment effects using confidence intervals |
| Epidemiology | Estimating disease prevalence with precision |
| Diagnostic testing | Determining the accuracy of new diagnostic tools |
| Public health | Estimating vaccination coverage rates |
| Quality improvement | Monitoring hospital performance indicators |

Summary Table

| Concept | Definition | Key Points |
|---|---|---|
| Statistical inference | Drawing conclusions about populations from samples | Foundation of evidence-based practice |
| Null hypothesis (H₀) | Statement of no effect | What we test against |
| Alternative hypothesis (H₁) | Statement of effect | What we want to prove |
| Type I error (α) | False positive | Rejecting a true H₀ |
| Type II error (β) | False negative | Failing to reject a false H₀ |

Key Takeaways
- Statistical inference connects sample data to population conclusions
- Hypothesis testing provides a framework for decision-making
- Type I and II errors represent the risks of wrong conclusions
- Confidence intervals provide more information than p-values alone
- Proper interpretation requires understanding both statistical and clinical significance
- Reporting should include estimates, confidence intervals, and clear interpretations
- Always consider practical significance alongside statistical significance
- Larger samples yield more precise estimates and greater power