Chapter six Bio (1).pptx for students and others

nagesageshu6 1 views 85 slides Oct 02, 2025

About This Presentation

Educational resources


Slide Content

University of Gondar
College of Medicine and Health Science
Department of Epidemiology and Biostatistics
Chapter Seven: Statistical Inference
By: Berhanie Addis (MSc.)
Email: [email protected]

Statistical Inference
Inference is the process of drawing interpretations or conclusions about the whole population from sample data. In statistics there are two ways of making inference: statistical estimation and statistical hypothesis testing. Data analysis is the process of extracting relevant information from the summarized data.

Population → Sample → Numerical data → Analyzed data → Inference

Statistical Estimation
Point estimation: a procedure that results in a single value as an estimate for a parameter.
Interval estimation: a procedure that results in an interval of values as an estimate for a parameter, that is, an interval that contains the likely values of the parameter.

Definitions
Confidence interval: an interval estimate with a specific level of confidence.
Confidence level: the percentage of the time that the true value will lie in the interval estimate given.
Consistent estimator: an estimator that gets closer to the value of the parameter as the sample size increases.

Degrees of freedom: the number of data values that are allowed to vary once a statistic has been determined.
Estimator: a sample statistic that is used to estimate a population parameter. It should be unbiased, consistent, and relatively efficient.
Estimate: any one of the possible values that an estimator can assume.

Interval estimate: a range of values used to estimate a parameter.
Point estimate: a single value used to estimate a parameter.
Relatively efficient estimator: the estimator for a parameter with the smallest variance.
Unbiased estimator: an estimator whose expected value is the value of the parameter being estimated.

Point and interval estimation of the population mean, µ
Point estimation
Confidence interval estimation of the population mean
There are different cases to consider when constructing confidence intervals.

Example  

Cont…

Case 1: If the sample size is large, or if the population is normal with known variance, the (1 – α)100% confidence interval for the population mean µ is: x̄ ± z_{α/2} · σ/√n

When σ is unknown and the sample size is small (n < 30), the (1 – α)100% confidence interval for µ becomes: x̄ ± t_{α/2, n−1} · S/√n. Usually σ² is not known; in that case we estimate it by its point estimator S².

Case 2: If the sample size is small and the population variance is not known.

Cont…

Critical Values of z and Levels of Confidence
[Figure: standard normal distribution, f(z) plotted against z]

Example 1: From a normal population, a sample of size 25 was randomly drawn and a mean of 32 was found. The population standard deviation is 4.2. Find: a) a 95% confidence interval for the population mean; b) a 99% confidence interval for the population mean.
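Example 1 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `z_confidence_interval` is an illustrative choice, not part of the slides, and it assumes the known-variance case x̄ ± z·σ/√n.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, level):
    """Confidence interval for a mean when the population sigma is known."""
    # Two-sided critical value, e.g. about 1.96 for level = 0.95
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Example 1: n = 25, sample mean = 32, population standard deviation = 4.2
ci_95 = z_confidence_interval(32, 4.2, 25, 0.95)
ci_99 = z_confidence_interval(32, 4.2, 25, 0.99)
print(f"95% CI: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
print(f"99% CI: ({ci_99[0]:.2f}, {ci_99[1]:.2f})")
```

Under these assumptions the intervals come out at roughly (30.35, 33.65) and (29.84, 34.16); the 99% interval is wider because a higher confidence level needs a larger critical value.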

Normal Distribution

t-distribution

Example 2
A company that delivers packages within a large metropolitan area claims that it takes an average of 28 minutes for a package to be delivered from your door to the destination. A random sample of 100 packages took a mean time of 31.5 minutes with a standard deviation of 5 minutes. Construct a 95% confidence interval for the average delivery time of all packages. Answer: (30.52, 32.48)

Example 3
A stock market analyst wants to estimate the average return on a certain stock. A random sample of 15 days yields an average (annualized) return of 10.37% and a standard deviation of 3.5%. Assuming a normal population of returns, give a 95% confidence interval for the average return on this stock. Answer: (8.43, 12.31)
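Example 3 uses a small sample with an estimated standard deviation, so the interval is based on the t-distribution. A short sketch follows; the standard library has no t quantile function, so the critical value 2.145 (two-sided 95%, 14 degrees of freedom) is entered by hand from a t-table, and the names are illustrative.

```python
from math import sqrt


def t_confidence_interval(sample_mean, s, n, t_critical):
    """Confidence interval for a mean using a supplied t critical value."""
    margin = t_critical * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Assumption: table value t(0.025, 14 df) = 2.145, typed in from a t-table
T_CRIT_14_DF = 2.145

# Example 3: n = 15, mean return = 10.37%, sample standard deviation = 3.5%
ci_return = t_confidence_interval(10.37, 3.5, 15, T_CRIT_14_DF)
print(f"95% CI: ({ci_return[0]:.2f}, {ci_return[1]:.2f})")
```

This reproduces the answer quoted in the example, (8.43, 12.31).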

Point and interval estimation of population proportion
We will now consider the method for estimating the binomial proportion p of successes, that is, the proportion of elements in a population that have a certain characteristic. A logical candidate for a point estimate of the population proportion p is the sample proportion p̂ = x/n, where x is the number of observations in a sample of size n that have the characteristic of interest. As we have seen in the sampling distribution of proportions, the sample proportion is the best point estimate of the population proportion.

Cont…
The shape is approximately normal provided n is sufficiently large; in this case, nP > 5 and nQ > 5 are the requirements for sufficiently large n (central limit theorem for proportions). The point estimate for the population proportion π is given by p̂. A (1 – α)100% confidence interval estimate for the unknown population proportion π is given by: CI = p̂ ± z_{α/2} · √(p̂(1 − p̂)/n)

Cont… If the sample size is small, i.e. np < 5 and nq < 5, and the population standard deviations for proportion are not given, then the confidence interval estimation will take t-distribution instead of z as:

Example
The mean diastolic blood pressure for 225 randomly selected individuals is 75 mmHg with a standard deviation of 12.0 mmHg. Construct a 95% confidence interval for the mean.
Solution: n = 225, mean = 75 mmHg, standard deviation = 12 mmHg, confidence level = 95%. The 95% confidence interval for the unknown population mean is given by 95% CI = 75 ± 1.96 × 12/15 = (73.432, 76.568)

Example
In a survey of 300 automobile drivers in one city, 123 reported that they wear seat belts regularly. Estimate the seat belt rate of the city and a 95% confidence interval for the true population proportion.
Answer: p = 123/300 = 0.41 = 41%, n = 300. The 95% CI for the seat belt rate of the city is p ± z × √(p(1 − p)/n) = (0.35, 0.47)
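The seat belt interval can be reproduced with a few lines of Python. This is a sketch of the large-sample interval p ± z·√(p(1 − p)/n) only; the function name is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, level=0.95):
    """Large-sample (normal approximation) interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Seat belt survey: 123 of 300 drivers report regular use
p_hat, lower, upper = proportion_confidence_interval(123, 300)
print(f"p = {p_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Both n·p = 123 and n·(1 − p) = 177 exceed 5 here, so the normal approximation stated earlier applies.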

Hypothesis Testing
This is also one way of making inference about a population parameter, where the investigator has a prior notion about the value of the parameter.
Definitions:
Statistical hypothesis: an assertion or statement about the population whose plausibility is to be evaluated on the basis of the sample data.

Test statistic: a statistic whose value serves to determine whether to reject or accept the hypothesis to be tested. It is a random variable.
Statistical test: a test or procedure used to evaluate a statistical hypothesis; its value depends on the sample data.

Cont…

There are two types of hypothesis:
Null hypothesis: the hypothesis to be tested. It is the hypothesis of equality or the hypothesis of no difference. Usually denoted by H 0.
Alternative hypothesis: the hypothesis available when the null hypothesis has to be rejected. It is the hypothesis of difference. Usually denoted by H 1 or H a.

Cont…

The critical value separates the critical region from the noncritical region for a given level of significance

Types and size of errors:
Type I error: rejecting the null hypothesis when it is true.
Type II error: failing to reject the null hypothesis when it is false.

Cont…
Type I error is the more serious error, and its probability is the level of significance, α. Power is the probability of rejecting a false null hypothesis, and it is given by 1 − β, where β is the probability of a Type II error.

General steps in hypothesis testing:
1) Specify the null hypothesis (H 0) and the alternative hypothesis (H 1).
2) Specify the significance level, α.
3) Identify the sampling distribution (Z or t) of the estimator.
4) Identify the critical region.
5) Calculate a statistic analogous to the parameter specified by the null hypothesis.
6) Make a decision.
7) Summarize the result.

Hypothesis testing about the population mean, µ:

Example one: Test the hypothesis that the average content of containers of a certain lubricant is 10 liters if the contents of a random sample of 10 containers are 10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, and 9.8 liters. Use the 0.01 level of significance and assume that the distribution of contents is normal.
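The lubricant example can be worked as a two-sided one-sample t-test. The sketch below follows the general steps listed earlier; the critical value 3.250 (α/2 = 0.005, 9 degrees of freedom) is a t-table value entered by hand, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

contents = [10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, 9.8]
MU_0 = 10.0          # hypothesized mean under H0
T_CRIT_9_DF = 3.250  # assumption: t-table value for alpha/2 = 0.005, 9 df

n = len(contents)
x_bar = mean(contents)   # sample mean
s = stdev(contents)      # sample standard deviation (n - 1 in the denominator)
t_stat = (x_bar - MU_0) / (s / sqrt(n))

# Two-sided test: reject H0 only if |t| exceeds the critical value
reject_h0 = abs(t_stat) > T_CRIT_9_DF
print(f"mean = {x_bar:.2f}, s = {s:.3f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

With these data the statistic is about 0.77, well inside the acceptance region, so at the 0.01 level the sample gives no evidence that the mean content differs from 10 liters.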

Example 5: A researcher reports that the mean IQ of a sample of 16 students is 110, while the expected value for the whole population is 100 with a standard deviation of 10. Test the hypothesis.
Solution
Ho: µ = 100 vs HA: µ ≠ 100
Assume α = 0.05
Test statistic: z = (110 − 100)/(10/√16) = (110 − 100) × 4/10 = 4
The critical value of z at α/2 = 0.025 is 1.96.
Decision: reject the null hypothesis, since 4 ≥ 1.96.
Conclusion: the mean IQ of the population is different from 100 at the 5% level of significance.

Example
In a study of childhood abuse in psychiatric patients, Brown found that 166 in a sample of 947 patients reported histories of physical or sexual abuse. (a) Construct a 95% confidence interval for the population proportion. (b) Test the hypothesis that the true population proportion is 30%.
Solution (a) The 95% CI for P is given by p̂ ± z_{α/2} · √(p̂(1 − p̂)/n), with p̂ = 166/947 ≈ 0.175.

Cont…
To test the hypothesis we need to follow the steps:
Step 1: State the hypothesis. Ho: P = Po = 0.3; Ha: P ≠ 0.3
Step 2: Fix the level of significance (α = 0.05)
Step 3: Compute the calculated and tabulated values of the test statistic
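Both parts of the childhood abuse example can be sketched in Python. The interval uses the sample proportion in the standard error, while the test uses the hypothesized value Po = 0.3, which is the usual large-sample convention and is assumed here; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n = 166, 947   # patients reporting abuse, sample size
P_0 = 0.30        # hypothesized proportion under H0
ALPHA = 0.05

p_hat = x / n
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)   # tabulated value, about 1.96

# (a) 95% confidence interval, standard error based on p_hat
margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
ci_lower, ci_upper = p_hat - margin, p_hat + margin

# (b) calculated test statistic, standard error based on P_0
z_calc = (p_hat - P_0) / sqrt(P_0 * (1 - P_0) / n)
reject_h0 = abs(z_calc) > z_crit

print(f"p_hat = {p_hat:.3f}, 95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
print(f"z = {z_calc:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

The interval is roughly (0.151, 0.200), which excludes 0.30, and the calculated z of about −8.37 lies far beyond ±1.96, so both approaches lead to rejecting Ho.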

Example
The mean lifetime of a sample of 16 fluorescent light bulbs produced by a company is computed to be 1570 hours. The population standard deviation is 120 hours. Suppose the hypothesized value for the population mean is 1600 hours. Can we conclude that the lifetime of the light bulbs is decreasing?
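A possible way to work the light bulb example is a one-sided (lower-tail) z-test, since the population standard deviation is given. The example does not state a significance level, so α = 0.05 is an assumption in the sketch below.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu_0 = 1570, 1600   # sample mean and hypothesized mean (hours)
sigma, n = 120, 16         # population standard deviation and sample size
ALPHA = 0.05               # assumption: not stated in the example

# H0: mu = 1600 versus HA: mu < 1600 (lifetime is decreasing)
z_stat = (x_bar - mu_0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(ALPHA)     # lower-tail critical value, about -1.645
p_value = NormalDist().cdf(z_stat)       # probability of a z this low under H0
reject_h0 = z_stat < z_crit

print(f"z = {z_stat:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Here z = −1.0, which is not below the critical value, so under the assumed 5% level the sample does not support the conclusion that the mean lifetime has decreased.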

Statistical inference based on two samples
Comparing two population means:
  Independent samples: variances known
  Independent samples: variances unknown
  Paired difference experiments (paired/matched/repeated sampling)
Comparing two population proportions:
  Large, independent samples case

Case I: independent samples  

Cont…  

Cont…
A (1 – α)100% confidence interval for the difference in population means µ1 – µ2 is: (x̄1 − x̄2) ± z_{α/2} · √(σ1²/n1 + σ2²/n2)
In testing a hypothesis, the z value can then be calculated as: z = ((x̄1 − x̄2) − (µ1 − µ2)) / √(σ1²/n1 + σ2²/n2)

Cont…
The steps to test the hypothesis for a difference of means are the same as for a single mean.
Step 1: State the hypothesis. Ho: µ1 − µ2 = 0 vs HA: µ1 − µ2 ≠ 0, HA: µ1 − µ2 < 0, or HA: µ1 − µ2 > 0
Step 2: Significance level (α)
Step 3: Test statistic

Cont…

Example
Researchers wish to know if the data they have collected provide sufficient evidence to indicate a difference in mean serum uric acid levels between normal individuals and individuals with Down's syndrome. The data consist of serum uric acid readings on 12 individuals with Down's syndrome and 15 normal individuals. The means are 4.5 mg/100ml and 3.4 mg/100ml, with standard deviations of 2.9 and 3.5 mg/100ml respectively.
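The serum uric acid comparison can be sketched as a two-sample z-test. Two assumptions are made that the example does not state: the quoted standard deviations are treated as known population values (the known-variance case), and α = 0.05 is used for the decision.

```python
from math import sqrt
from statistics import NormalDist

# Down's syndrome group, then normal group
mean_1, sd_1, n_1 = 4.5, 2.9, 12
mean_2, sd_2, n_2 = 3.4, 3.5, 15
ALPHA = 0.05   # assumption: not stated in the example

# H0: mu1 - mu2 = 0 versus HA: mu1 - mu2 != 0
std_error = sqrt(sd_1**2 / n_1 + sd_2**2 / n_2)
z_stat = (mean_1 - mean_2) / std_error
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))   # two-sided p-value
reject_h0 = p_value < ALPHA

print(f"z = {z_stat:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The statistic is about 0.89, so under these assumptions the data do not provide sufficient evidence of a difference in mean serum uric acid levels.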

Cont…

Independent sample with unknown variance  

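Cont…
A. Assume that the unknown variances are equal: σ1² = σ2² = σ²
The pooled estimate of σ² is the weighted average of the two sample variances, s1² and s2².
The pooled estimate of σ² is denoted by s_p²: s_p² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)
The estimate of the population standard deviation of the sampling distribution is: s_p · √(1/n1 + 1/n2)

A (1 – α)100% CI for µ1 – µ2 is: (x̄1 − x̄2) ± t_{α/2, n1+n2−2} · s_p · √(1/n1 + 1/n2)
The calculated value of the test statistic, which follows a t-distribution with n1 + n2 − 2 degrees of freedom, will be: t = ((x̄1 − x̄2) − (µ1 − µ2)) / (s_p · √(1/n1 + 1/n2))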
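The pooled-variance calculation can be illustrated with a short sketch. The summary figures are reused from the serum uric acid example purely for illustration, now treating the standard deviations as sample estimates with equal unknown population variances; the function name is hypothetical.

```python
from math import sqrt


def pooled_variance(s_1, n_1, s_2, n_2):
    """Weighted average of two sample variances (equal-variance assumption)."""
    return ((n_1 - 1) * s_1**2 + (n_2 - 1) * s_2**2) / (n_1 + n_2 - 2)


# Illustrative inputs: group means, sample standard deviations, sample sizes
mean_1, s_1, n_1 = 4.5, 2.9, 12
mean_2, s_2, n_2 = 3.4, 3.5, 15

sp2 = pooled_variance(s_1, n_1, s_2, n_2)
std_error = sqrt(sp2) * sqrt(1 / n_1 + 1 / n_2)
t_stat = (mean_1 - mean_2) / std_error   # tests H0: mu1 - mu2 = 0
df = n_1 + n_2 - 2

print(f"pooled variance = {sp2:.4f}, SE = {std_error:.4f}, t = {t_stat:.3f}, df = {df}")
```

The resulting statistic would be compared with a t-table value on n1 + n2 − 2 = 25 degrees of freedom.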

Cont…  

Cont…  

Paired Sample
Arises from two different processes on the same study units (e.g. "before" and "after" treatments) or two different processes on paired/matched study units (e.g. pair-matched case-control studies). Use of the same or matched individuals eliminates any differences in the individuals themselves (confounding factors). Inference concerning the difference between two population means is similar to inference for one population mean, except that we work with the differences d_i here.

Cont…  

Cont…
If the population of differences is normally distributed with mean µ_d, a (1 – α)100% confidence interval for µ_d = µ1 − µ2 is: d̄ ± t_{α/2} · s_d/√n, where d̄ and s_d are the mean and standard deviation of the sample differences and, for a sample of size n, t_{α/2} is based on n – 1 degrees of freedom. A Z-test can be used instead if the sample size is large (n1 = n2 = n > 30).
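A paired analysis reduces to a one-sample analysis of the differences. The sketch below uses hypothetical before/after readings invented for illustration (they are not from the slides), and the critical value 2.571 (two-sided 95%, 5 degrees of freedom) is a t-table value entered by hand.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data for six study units, for illustration only
before = [120, 125, 130, 118, 140, 135]
after = [115, 122, 128, 119, 132, 130]
T_CRIT_5_DF = 2.571   # assumption: t-table value for alpha/2 = 0.025, 5 df

diffs = [b - a for b, a in zip(before, after)]   # the differences d_i
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)
std_error = s_d / sqrt(n)

t_stat = d_bar / std_error                        # tests H0: mu_d = 0
ci_lower = d_bar - T_CRIT_5_DF * std_error
ci_upper = d_bar + T_CRIT_5_DF * std_error

print(f"mean difference = {d_bar:.3f}, t = {t_stat:.3f}")
print(f"95% CI for mu_d: ({ci_lower:.3f}, {ci_upper:.3f})")
```

Only the six differences enter the calculation, which is why the degrees of freedom are n − 1 = 5 rather than n1 + n2 − 2.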

Example  

Cont…  

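Hypothesis testing for two proportions
Suppose that n1 and n2 are large enough so that n1·p1 ≥ 5, n1·(1 − p1) ≥ 5, n2·p2 ≥ 5, and n2·(1 − p2) ≥ 5. Then the population of all possible values of p̂1 − p̂2:
has approximately a normal distribution;
has mean µ_{p̂1 − p̂2} = p1 − p2;
has standard deviation σ_{p̂1 − p̂2} = √(p1(1 − p1)/n1 + p2(1 − p2)/n2).

Cont…
A (1 – α)100% confidence interval for p1 − p2 is: (p̂1 − p̂2) ± z_{α/2} · √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2)
The test statistic is: z = ((p̂1 − p̂2) − Do) / σ_{p̂1 − p̂2}, where Do = (P1 − P2)0 is the hypothesized difference.

Cont…
To test the hypothesis Ho: π1 − π2 = 0 vs HA: π1 − π2 ≠ 0, the test statistic is given by z = (p̂1 − p̂2) / √(p̄(1 − p̄)(1/n1 + 1/n2)), where p̄ = (x1 + x2)/(n1 + n2) is the pooled sample proportion.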

Example 10: A study was conducted to look at the effects of oral contraceptives (OC) on heart disease in women 40–44 years of age. It is found that among n1 = 500 current OC users, 13 develop a myocardial infarction (MI) over a three-year period, while among n2 = 1000 non-OC users, seven develop an MI over a three-year period. A. Construct a 95% confidence interval for the difference of MI rates between OC users and non-users. B. Can you conclude that the rate of MI is significantly greater among OC users? (Report the P-value for your test.)

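Solution
A. The confidence interval for the difference of population proportions is formed using the following formula (for a 95% confidence interval): (p̂1 − p̂2) ± 1.96 · √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2). Here p̂1 = 13/500 = 0.026 and p̂2 = 7/1000 = 0.007, so p̂1 − p̂2 = 0.019 and the standard error is ≈ 0.0076. The 95% CI for the difference is 0.019 ± 0.015 = (0.004, 0.034).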
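Both questions in the OC example can be recomputed directly from the counts stated in it. The sketch below uses the unpooled standard error for the interval and the pooled proportion for the one-sided test of Ho: π1 − π2 = 0, which is the usual large-sample convention and is assumed here; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_1, n_1 = 13, 500    # MI cases among current OC users
x_2, n_2 = 7, 1000    # MI cases among non-OC users

p_1, p_2 = x_1 / n_1, x_2 / n_2
diff = p_1 - p_2

# A. 95% confidence interval with the unpooled standard error
se_unpooled = sqrt(p_1 * (1 - p_1) / n_1 + p_2 * (1 - p_2) / n_2)
z_crit = NormalDist().inv_cdf(0.975)
ci_lower, ci_upper = diff - z_crit * se_unpooled, diff + z_crit * se_unpooled

# B. One-sided test (HA: pi1 > pi2) with the pooled proportion
p_pooled = (x_1 + x_2) / (n_1 + n_2)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n_1 + 1 / n_2))
z_stat = diff / se_pooled
p_value = 1 - NormalDist().cdf(z_stat)

print(f"difference = {diff:.3f}, 95% CI = ({ci_lower:.4f}, {ci_upper:.4f})")
print(f"z = {z_stat:.3f}, one-sided p-value = {p_value:.4f}")
```

Note that n1·p̂1 = 13 and n2·p̂2 = 7 both exceed 5, so the large-sample conditions stated earlier are met, though only narrowly for the non-user group.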

Test of Association
Suppose we have a population consisting of observations having two attributes or qualitative characteristics, say A and B. If the attributes are independent, then the probability of possessing both A and B is P_A × P_B, where P_A is the probability that a member has attribute A and P_B is the probability that a member has attribute B. Suppose A has mutually exclusive and exhaustive classes. B has mutually exclusive and exhaustive classes.

The chi-square test procedure is used to test the hypothesis of independence of two attributes.
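A chi-square test of independence compares each observed cell count O with the count E = (row total × column total)/grand total expected under independence, and sums (O − E)²/E over all cells. The sketch below uses a hypothetical 2×2 table invented for illustration (it is not data from the slides); 3.841 is the chi-square table value for 1 degree of freedom at the 5% level.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two attributes were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Hypothetical counts: rows = attribute A present/absent, columns = attribute B present/absent
table = [[30, 20],
         [10, 40]]
CHI2_CRIT_1_DF = 3.841   # assumption: table value for alpha = 0.05, 1 df

chi2, df = chi_square_statistic(table)
reject_independence = chi2 > CHI2_CRIT_1_DF
print(f"chi-square = {chi2:.3f}, df = {df}, reject independence: {reject_independence}")
```

For this made-up table every expected count is at least 20, and the statistic of about 16.67 exceeds 3.841, so independence would be rejected at the 5% level.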

Examples
a) Whether the presence or absence of hypertension is independent of smoking habit or not.
b) Whether the size of the family is independent of the level of education attained by the mothers.
c) Whether there is association between father and son regarding boldness.
d) Whether there is association between stability of marriage and period of acquaintanceship prior to marriage.

Example A geneticist took a random sample of 300 men to study whether there is association between father and son regarding boldness. He obtained the following results.

Conclusion: At 5% level of significance we have evidence to say there is association between father and son regarding boldness, based on this sample data.

Exercise

END Thank you!!!