Test of Independence

The goal is to determine whether two observed characteristics of a member of a population are independent.

Contingency Table
- a table for determining whether the distribution according to one variable is contingent on the distribution of the other
- is made up of R rows and C columns
- is designated as an R x C table
Suppose we pick a sample of size n and classify the data in a two-way table on the basis of the two variables.

           Column 1    Column 2    Column 3
Row 1      C11         C12         C13
Row 2      C21         C22         C23
The chi-square test can also be used to test the independence of two variables. The hypotheses are stated as follows:

$H_0$: The first variable is independent of the second variable.
$H_a$: The first variable is dependent on the second variable.
The degrees of freedom for any contingency table are (no. of rows - 1) x (no. of columns - 1), or $v = (R-1)(C-1)$.

If $\chi^2 > \chi^2_\alpha$ with $v = (R-1)(C-1)$ degrees of freedom, reject the null hypothesis of independence at the $\alpha$ level of significance; otherwise, do not reject the null hypothesis.
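The rule above can be sketched in Python. This is a minimal illustration rather than part of the original slides: the function names are hypothetical, and in general the critical value is read from a chi-square table and passed in. For the special case of 2 degrees of freedom only, the critical value has the closed form $-2\ln\alpha$, which the sketch uses as a check.

```python
import math

def degrees_of_freedom(rows, cols):
    """Degrees of freedom for an R x C contingency table: (R-1)(C-1)."""
    return (rows - 1) * (cols - 1)

def critical_value_df2(alpha):
    """Chi-square critical value, valid ONLY for 2 degrees of freedom.

    With v = 2 the right-tail area beyond x is exp(-x/2), so the value
    that leaves an area alpha in the right tail is -2 ln(alpha).
    """
    return -2.0 * math.log(alpha)

def reject_independence(chi_square, critical_value):
    """Decision rule: reject H0 when the statistic exceeds the critical value."""
    return chi_square > critical_value

v = degrees_of_freedom(2, 3)      # a 2 x 3 table
crit = critical_value_df2(0.05)   # allowed here because v == 2
print(v, round(crit, 3))          # 2 5.991
```

For any other number of degrees of freedom, the critical value should be taken from a chi-square table instead of `critical_value_df2`.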
Computation of Expected Frequencies

- Find the sum of each row and each column, and find the grand total.
- For each cell, multiply the corresponding row sum by the column sum and divide by the grand total to get the expected value.
- Place the expected values in the corresponding cells along with the observed values.
The general formula for obtaining the expected frequency of any cell is given by:

$$\text{Expected frequency} = \frac{(\text{column total}) \times (\text{row total})}{\text{grand total}}$$
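The computation of expected frequencies can be sketched as a short Python function. This is an illustrative sketch: the function name `expected_frequencies` is hypothetical, and the 2 x 2 counts are made up for the demonstration, not taken from the slides.

```python
def expected_frequencies(observed):
    """Return the expected count for every cell of an observed table.

    Each expected count is (row total) * (column total) / (grand total).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Made-up 2 x 2 counts, for illustration only.
observed = [[10, 20],
            [30, 40]]
expected = expected_frequencies(observed)
print(expected)  # [[12.0, 18.0], [28.0, 42.0]]
```

The expected counts in each row and column add up to the same marginal totals as the observed counts.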
Example

A study is being conducted to determine whether there is a relationship between jogging and blood pressure. A random sample of 210 subjects is selected, and they are classified as shown in the table. Use $\alpha = 0.05$.

                         Blood Pressure
Jogging status     Low    Moderate    High    Total
Joggers             34       57        21      112
Non-joggers         15       63        20       98
Total               49      120        41      210
Solution

Step 1 State the hypotheses.
$H_0$: The blood pressure of a person does not depend on whether he jogs or not.
$H_a$: The blood pressure of a person depends on whether he jogs or not.

Step 2 Determine the critical value.
$v = (R-1)(C-1) = (2-1)(3-1) = (1)(2) = 2$
Using the chi-square table with $\alpha = 0.05$ and $v = 2$, the critical value is 5.991.
Step 3 Decision Rule: Reject $H_0$ if $\chi^2 > 5.991$.
Step 4 Compute the test statistic.

First, compute the expected values:

$$
\begin{aligned}
E_{11} &= \frac{(112)(49)}{210} = 26.13 & E_{21} &= \frac{(98)(49)}{210} = 22.87 \\
E_{12} &= \frac{(112)(120)}{210} = 64 & E_{22} &= \frac{(98)(120)}{210} = 56 \\
E_{13} &= \frac{(112)(41)}{210} = 21.87 & E_{23} &= \frac{(98)(41)}{210} = 19.13
\end{aligned}
$$
The expected frequency for each cell is recorded in parentheses beside the actual observed value. Note that the expected frequencies in any row or column add up to the appropriate marginal total. In our example we may compute only the three expected frequencies in the top row of the table and then find the others by subtraction.
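The subtraction shortcut can be sketched in Python using the marginal totals from the jogging example. This is a minimal illustration with hypothetical variable names: only the top row is computed from the formula, and the bottom row is obtained by subtracting from the column totals.

```python
# Marginal totals from the jogging example.
row_totals = [112, 98]        # joggers, non-joggers
col_totals = [49, 120, 41]    # low, moderate, high
grand_total = 210

# Top row from the formula (row total)(column total) / (grand total).
top_row = [row_totals[0] * c / grand_total for c in col_totals]

# Bottom row by subtraction: each column of expected counts must add
# up to its column total.
bottom_row = [c - e for c, e in zip(col_totals, top_row)]

print([round(e, 2) for e in top_row])     # [26.13, 64.0, 21.87]
print([round(e, 2) for e in bottom_row])  # [22.87, 56.0, 19.13]
```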
Tabulate the summary of the results:

                              Blood Pressure
Jogging status     Low           Moderate     High          Total
Joggers            34 (26.13)    57 (64)      21 (21.87)     112
Non-joggers        15 (22.87)    63 (56)      20 (19.13)      98
Total              49            120          41             210
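The chi-square value is

$$\chi^2 = \sum_i \frac{(o_i - e_i)^2}{e_i}$$

$$
\begin{aligned}
\chi^2 &= \frac{(34-26.13)^2}{26.13} + \frac{(15-22.87)^2}{22.87} + \frac{(57-64)^2}{64} \\
&\quad + \frac{(63-56)^2}{56} + \frac{(21-21.87)^2}{21.87} + \frac{(20-19.13)^2}{19.13} \\
&= 6.79
\end{aligned}
$$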
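The same computation can be checked with a short Python sketch. The function name `chi_square_statistic` is hypothetical; it derives the expected counts from the observed table and sums the contribution of each cell.

```python
def chi_square_statistic(observed):
    """Sum of (o - e)^2 / e over all cells of the observed table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected count
            chi_square += (o - e) ** 2 / e
    return chi_square

# Observed counts from the jogging example (low, moderate, high).
observed = [[34, 57, 21],   # joggers
            [15, 63, 20]]   # non-joggers
chi_square = chi_square_statistic(observed)
print(round(chi_square, 2))   # 6.79
print(chi_square > 5.991)     # True
```

The sketch uses unrounded expected counts, so intermediate terms differ slightly from the hand computation, but the statistic still rounds to 6.79.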
Step 5 Make the decision.
Since 6.79 > 5.991, reject $H_0$. Hence, the blood pressure of a person depends on whether he jogs or not.
Hope You’ve Learned
Seatwork

A researcher wishes to see if the way people obtain information is independent of their educational background. A survey of 401 high school and college graduates yielded the following information. At $\alpha = 0.05$, can one conclude that the way people obtain information is independent of their educational background?

1. $H_0$ and $H_a$
2. Degrees of freedom and critical value
3-6. Fill in the numbered blanks in the table.
7-9. Compute $\chi^2$.
10. Decision

               Television      Newspapers     Other sources    Total
High School    159 (139.15)    90 (99.50)     51 ( 5. )        300
College         27 ( 4. )      43 ( 6. )      31 (20.65)       101
Total          186             133            82               3.
Answers

1. $H_0$: The way people obtain information is independent of their educational background.
   $H_a$: The way people obtain information is dependent on their educational background.
2. $v = 2$, $\chi^2_\alpha = 5.991$
3. 401
4. 46.85
5. 61.35
6. 33.50
7-9. $\chi^2 = 21.77$
10. Since $\chi^2 > \chi^2_\alpha$, reject $H_0$.
Quiz

A public opinion poll surveyed a simple random sample of 1000 voters. Respondents were classified by gender (male or female) and by voting preference (Republican, Democrat, or Independent). Results are shown in the contingency table below.

                            Voting Preferences
                Republican    Democrat    Independent    Row total
Male               200          150           50            400
Female             250          300           50            600
Column total       450          450          100           1000

Do the men's voting preferences differ significantly from the women's preferences? Use a 0.05 level of significance.