Chi-Square test for independence of attributes / Chi-Square test for checking association between two categorical variables, Chi-Square test for goodness of fit
Added: Jan 24, 2022
Slides: 29 pages
CHI-SQUARE TESTS
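Chi-Square Statistic
If $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$ and $Z^2 \sim \chi^2$ with 1 d.f.; for independent $Z_1, \dots, Z_n \sim N(0, 1)$, $\sum_{i=1}^{n} Z_i^2 \sim \chi^2$ with n d.f.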
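The definition of the chi-square statistic as a sum of squared standard normal variables can be checked with a small simulation. This is only an illustrative sketch: the degrees of freedom, the number of trials and the seed are arbitrary choices, not values from the slides. It relies on the standard facts that a chi-square variable with n d.f. has mean n and variance 2n.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable (arbitrary choice)

def chi_square_draw(n):
    """One draw of Z_1^2 + ... + Z_n^2 with independent standard normal Z_i."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))

n = 5           # degrees of freedom (illustrative)
trials = 20000  # number of simulated draws (illustrative)
draws = [chi_square_draw(n) for _ in range(trials)]

sample_mean = sum(draws) / trials
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (trials - 1)

# Theory: mean = n and variance = 2n for a chi-square variable with n d.f.
print(f"mean = {sample_mean:.2f} (theory {n})")
print(f"variance = {sample_var:.2f} (theory {2 * n})")
```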
Chi-Square test
There are three types of chi-square tests.
A chi-square goodness of fit test determines whether the distribution of the sample data matches a specified population distribution.
A chi-square test for independence tests whether two categorical variables are associated with each other.
A chi-square test for the variance of a single sample tests whether the sample variance differs significantly from a specified population variance.
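The variance test is not developed further in these slides. For reference, its usual statistic is sketched here, assuming a random sample of size $n$ from a normal population; the symbols $s^2$ (sample variance) and $\sigma_0^2$ (hypothesised population variance) are introduced for this note and do not appear in the slides.

```latex
\chi^2 = \frac{(n-1)\,s^2}{\sigma_0^2} \sim \chi^2_{n-1}
\quad \text{under } H_0 : \sigma^2 = \sigma_0^2
```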
Chi-Square test – Independence of Attributes
The Chi-Square test of independence is a statistical method to determine whether two categorical variables have a significant association between them.
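To test independence of attributes
Step 1: H0: The two attributes (categorical variables) are independent
Step 2: H1: The two attributes (categorical variables) are dependent
Step 3: Test statistic: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ with $(r - 1)(c - 1)$ d.f. for a table with $r$ rows and $c$ columns, where $O$ is an observed frequency and $E = \dfrac{\text{row total} \times \text{column total}}{\text{grand total}}$ is the corresponding expected frequency
Step 4: Conclusion
If p ≤ level of significance ($\alpha$), we reject the null hypothesis
If p > level of significance ($\alpha$), we fail to reject the null hypothesis
[Note: In a 2×2 contingency table, if an expected frequency is less than 5, use Yates's correction, i.e. the "Continuity Correction" row in the SPSS output]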
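The continuity correction mentioned in the note can be sketched by hand for a 2×2 table. The counts below are hypothetical, chosen only so that some expected frequencies fall below 5; the function names are illustrative and this is not the SPSS implementation. Yates's correction reduces each |O − E| by 0.5 before squaring, and the p value for 1 d.f. is computed from the standard normal tail via `math.erfc`.

```python
import math

def chi_square_2x2(table, yates=False):
    """Chi-square statistic for a 2x2 table, optionally with Yates's correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / n  # expected count under independence
            diff = abs(table[i][j] - e)
            if yates:
                # Yates's correction shrinks |O - E| by 0.5 (never below zero).
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / e
    return stat

def p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical counts (not from the slides).
table = [[8, 2], [1, 5]]

plain = chi_square_2x2(table)
corrected = chi_square_2x2(table, yates=True)
print(f"uncorrected: chi-square = {plain:.3f}, p = {p_value_1df(plain):.4f}")
print(f"Yates:       chi-square = {corrected:.3f}, p = {p_value_1df(corrected):.4f}")
```

The corrected statistic is always the smaller of the two, so the correction makes the test more conservative.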
Example
A public opinion poll surveyed a simple random sample of 1000 voters. Respondents were classified by gender (male or female) and by voting preference (Party A, Party B, or Party C). Results are shown in the contingency table below. Is voting preference affected by gender?

          Voting Preference
Gender    Party A   Party B   Party C
Male      200       150       50
Female    250       300       50
Null & Alternative Hypothesis Step 1: H0 : Gender and voting preferences are independent. Step 2: H1 : Gender and voting preferences are not independent.
Data in SPSS

          Voting Preference
Gender    Party A   Party B   Party C
Male      200       150       50
Female    250       300       50
Assigning weights
Chi-Square test
Output
The Chi-square value is 16.204 and the p value is reported as 0.000 (i.e. p < 0.001), which is less than 0.05. We reject the null hypothesis. Gender and voting preference are not independent, i.e. voting preference is associated with gender.
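The reported statistic can be reproduced by hand from the contingency table. The sketch below is not the SPSS procedure; it applies the expected-frequency rule (row total × column total / grand total) directly and uses the closed-form upper-tail probability that holds for 2 d.f. Variable names are illustrative.

```python
import math

# Observed counts from the example (rows: Male, Female; columns: Party A, B, C).
observed = [
    [200, 150, 50],
    [250, 300, 50],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# For df = 2 the upper-tail probability has the closed form exp(-x / 2).
assert df == 2
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.6f}")
```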
To test goodness of fit
Step 1: H0: There is no significant difference between observed and expected frequencies
Step 2: H1: There is a significant difference between observed and expected frequencies
Step 3: Test statistic: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ with $k - 1$ d.f. for $k$ categories, and p value
Step 4: Conclusion
If p ≤ level of significance ($\alpha$), we reject the null hypothesis
If p > level of significance ($\alpha$), we fail to reject the null hypothesis
Example 1
The number of road accidents on a particular highway during a week is given below. Can it be concluded that the proportion of accidents is equal for all days?

Day         Mon   Tue   Wed   Thu   Fri   Sat   Sun
Accidents   14    16    8     12    11    9     14
Hypothesis
Null Hypothesis: H0: The proportion of accidents is equal for all days
Alternative Hypothesis: H1: The proportion of accidents is not equal for all days
Day   Accidents
Mon   14
Tue   16
Wed   8
Thu   12
Fri   11
Sat   9
Sun   14
Chi-Square test
Output
Chi-Square statistic = 4.167 with 6 d.f.
p value = 0.654 > 0.05
We fail to reject H0.
So the data give no evidence that the proportion of accidents differs across days.
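The goodness-of-fit calculation for this example can be sketched by hand. Under H0 each of the 7 days has the same expected count, 84 / 7 = 12. The helper `chi2_sf_even_df` is a hypothetical name for a closed-form upper-tail probability that is valid only for even degrees of freedom; it is used here to avoid depending on any statistics package.

```python
import math

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
observed = [14, 16, 8, 12, 11, 9, 14]

total = sum(observed)
expected = total / len(observed)  # equal proportions: same expected count each day

chi_square = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of d.f.")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

p_value = chi2_sf_even_df(chi_square, df)
print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.3f}")
```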
Example 2
Suppose an unusual distribution of blood groups is suspected in patients undergoing one type of surgical procedure. It is known that the expected distribution for the population served by the hospital which performs this surgery is 44% group O, 45% group A, 8% group B and 3% group AB. A random sample of 187 routine pre-operative blood grouping results is given below. Does this sample match the expected distribution?

Results for 187 consecutive patients:

Blood Group   O    A    B    AB
Patients      67   83   29   8
Hypothesis
Null Hypothesis: H0: There is no significant difference between the observed and expected distribution of patients with respect to blood group [Sample follows the expected distribution]
Alternative Hypothesis: H1: There is a significant difference between the observed and expected distribution of patients with respect to blood group [Sample does not follow the expected distribution]
Case Study 1

Blood Group   Observed freq.
O             67
A             83
B             29
AB            8
Total         N = 187
Case Study 1
Expected distribution for the population served by the hospital which performs this surgery is 44% group O, 45% group A, 8% group B and 3% group AB.

Blood Group   Observed freq.   Probability   Expected freq. = N*Prob
O             67               0.44          187*0.44 = 82.28
A             83               0.45          187*0.45 = 84.15
B             29               0.08          187*0.08 = 14.96
AB            8                0.03          187*0.03 = 5.61
Total         N = 187          1             187
Chi-Square test
Chi-Square test

Blood Group   Observed freq.   Probability   Expected freq. = N*Prob
O             67               0.44          187*0.44 = 82.28
A             83               0.45          187*0.45 = 84.15
B             29               0.08          187*0.08 = 14.96
AB            8                0.03          187*0.03 = 5.61
Total         N = 187          1             187
Output
Chi-Square statistic = 17.048 with 3 d.f.
p value = 0.001 < 0.05
H0 is rejected.
So there is a significant difference between the observed and expected distribution of patients with respect to blood group.
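The same goodness-of-fit calculation with unequal expected proportions can be sketched as follows. Expected frequencies are N × probability, as in the table, and the p value uses the closed-form upper tail for 3 d.f., P(X > x) = erfc(√(x/2)) + √(2x/π)·exp(−x/2). This is a hand calculation under those assumptions rather than the SPSS routine, and the variable names are illustrative.

```python
import math

groups = ["O", "A", "B", "AB"]
observed = [67, 83, 29, 8]
proportions = [0.44, 0.45, 0.08, 0.03]  # expected population distribution

n = sum(observed)
expected = [n * p for p in proportions]  # expected freq. = N * probability

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Closed-form upper-tail probability, valid for df = 3 only.
assert df == 3
p_value = (
    math.erfc(math.sqrt(chi_square / 2))
    + math.sqrt(2 * chi_square / math.pi) * math.exp(-chi_square / 2)
)

for g, o, e in zip(groups, observed, expected):
    print(f"{g:>2}: observed = {o:3d}, expected = {e:6.2f}")
print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.4f}")
```

Most of the statistic comes from group B, where 29 patients were observed against about 15 expected.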
THANK YOU
Dr Parag Shah | M.Sc., M.Phil., Ph.D. (Statistics)
www.paragstatistics.wordpress.com