Lecture 11: Spearman Rank Correlation, Part 2 (with Tied Ranks)

1,719 views 28 slides Nov 12, 2021

About This Presentation

In this session, we discuss how to calculate Spearman's correlation when two or more ranks are the same.
We work through several situations and variations of the same data to clarify the concept.


Slide Content

Spearman’s Rank Order Correlation, Part-2: What happens when two or more ranks are the same?
Dr. Rajeev Kumar, M.S.W. (TISS, Mumbai), M.Phil. (CIP, Ranchi), UGC-JRF, Ph.D. (IIT Kharagpur)

Case-1: In a college, 10 students obtained marks in Bangla and English. In Bangla, three students obtained 94 marks and two students obtained 52 marks. In English, two students obtained 65 marks and two obtained 52 marks. Here we will learn how to calculate the Spearman correlation when marks, and therefore ranks, are tied. We also want to see whether proficiency in English and proficiency in Bangla are associated with each other.

| Name of contestants | Marks in Bangla (x) | Marks in English (y) |
|---|---|---|
| Ramesh | 52 | 50 |
| Pooja | 94 | 65 |
| Joseph | 72 | 90 |
| Amir | 94 | 52 |
| Sonu | 94 | 91 |
| Rahul | 29 | 52 |
| Manisha | 50 | 49 |
| Santosh | 52 | 65 |
| Pratibha | 60 | 58 |
| Manoj | 70 | 75 |

©Dr. Rajeev Kumar 2020, 28-10-2021

Making hypotheses

Alternative hypothesis Ha: There will be a significant correlation between the marks in Bangla and the marks in English.
Null hypothesis H0: There will be no significant correlation between the marks in Bangla and the marks in English.

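Step-1: Obtain the ranks of all marks

The highest mark gets rank 1. Tied marks are first given consecutive ranks; the revised ranks, D, and D² are filled in during the later steps.

| Name of contestants | Marks in Bangla (x) | Marks in English (y) | Ranks (x)=d1 | Ranks (y)=d2 |
|---|---|---|---|---|
| Ramesh | 52 | 50 | 8 | 9 |
| Pooja | 94 | 65 | 1 | 5 |
| Joseph | 72 | 90 | 4 | 2 |
| Amir | 94 | 52 | 2 | 8 |
| Sonu | 94 | 91 | 3 | 1 |
| Rahul | 29 | 52 | 10 | 7 |
| Manisha | 50 | 49 | 9 | 10 |
| Santosh | 52 | 65 | 7 | 4 |
| Pratibha | 60 | 58 | 6 | 6 |
| Manoj | 70 | 75 | 5 | 3 |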

Step-2: Calculate the means of the tied ranks

In Bangla, three students got 94. They occupy ranks 1, 2, and 3, so the mean rank is (1+2+3)/3 = 6/3 = 2. Each of the three 94s gets the rank 2.
In Bangla, two students got 52. They occupy ranks 7 and 8, so the mean rank is (7+8)/2 = 15/2 = 7.5. Each 52 in Bangla gets the rank 7.5.
In English, two students got 65 and two students got 52.
For 65, the mean rank is (4+5)/2 = 9/2 = 4.5.
For 52, the mean rank is (7+8)/2 = 15/2 = 7.5.
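The averaging in this step can be sketched in code. This is a minimal illustration, not part of the original slides: the function name `average_ranks` is hypothetical, and it assumes the highest mark gets rank 1, as in the tables.

```python
def average_ranks(marks):
    """Rank marks from highest (rank 1) to lowest; tied marks share the mean of their positions."""
    # Indices of the marks, ordered from the highest mark to the lowest.
    order = sorted(range(len(marks)), key=lambda i: -marks[i])
    ranks = [0.0] * len(marks)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next mark equals the current one.
        while end + 1 < len(order) and marks[order[end + 1]] == marks[order[pos]]:
            end += 1
        # Positions are 1-based: the group occupies ranks pos+1 .. end+1.
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Marks from Case-1, in the order Ramesh, Pooja, Joseph, Amir, Sonu,
# Rahul, Manisha, Santosh, Pratibha, Manoj.
bangla = [52, 94, 72, 94, 94, 29, 50, 52, 60, 70]
english = [50, 65, 90, 52, 91, 52, 49, 65, 58, 75]

rank_x = average_ranks(bangla)
rank_y = average_ranks(english)
print(rank_x)
print(rank_y)
```

The three 94s in Bangla come out with rank 2 and the two 52s with rank 7.5, matching the hand calculation.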

Step-3: Assign the revised ranks

| Name of contestants | Marks in Bangla (x) | Marks in English (y) | Ranks (x)=d1 | Revised Ranks (x)=d1 | Ranks (y)=d2 | Revised Ranks (y)=d2 |
|---|---|---|---|---|---|---|
| Ramesh | 52 | 50 | 8 | 7.5 | 9 | 9 |
| Pooja | 94 | 65 | 1 | 2 | 5 | 4.5 |
| Joseph | 72 | 90 | 4 | 4 | 2 | 2 |
| Amir | 94 | 52 | 2 | 2 | 8 | 7.5 |
| Sonu | 94 | 91 | 3 | 2 | 1 | 1 |
| Rahul | 29 | 52 | 10 | 10 | 7 | 7.5 |
| Manisha | 50 | 49 | 9 | 9 | 10 | 10 |
| Santosh | 52 | 65 | 7 | 7.5 | 4 | 4.5 |
| Pratibha | 60 | 58 | 6 | 6 | 6 | 6 |
| Manoj | 70 | 75 | 5 | 5 | 3 | 3 |

Step-4: Subtract the revised ranks and obtain ‘D’

| Name of contestants | Marks in Bangla (x) | Marks in English (y) | Revised Ranks (x)=d1 | Revised Ranks (y)=d2 | D = (revised) (d1 - d2) |
|---|---|---|---|---|---|
| Ramesh | 52 | 50 | 7.5 | 9 | -1.5 |
| Pooja | 94 | 65 | 2 | 4.5 | -2.5 |
| Joseph | 72 | 90 | 4 | 2 | 2 |
| Amir | 94 | 52 | 2 | 7.5 | -5.5 |
| Sonu | 94 | 91 | 2 | 1 | 1 |
| Rahul | 29 | 52 | 10 | 7.5 | 2.5 |
| Manisha | 50 | 49 | 9 | 10 | -1 |
| Santosh | 52 | 65 | 7.5 | 4.5 | 3 |
| Pratibha | 60 | 58 | 6 | 6 | 0 |
| Manoj | 70 | 75 | 5 | 3 | 2 |

Step-5: Square each D to obtain D², then sum the D² values to obtain ΣD²

| Name of contestants | Marks in Bangla (x) | Marks in English (y) | Revised Ranks (x)=d1 | Revised Ranks (y)=d2 | D = (revised) (d1 - d2) | D² |
|---|---|---|---|---|---|---|
| Ramesh | 52 | 50 | 7.5 | 9 | -1.5 | 2.25 |
| Pooja | 94 | 65 | 2 | 4.5 | -2.5 | 6.25 |
| Joseph | 72 | 90 | 4 | 2 | 2 | 4 |
| Amir | 94 | 52 | 2 | 7.5 | -5.5 | 30.25 |
| Sonu | 94 | 91 | 2 | 1 | 1 | 1 |
| Rahul | 29 | 52 | 10 | 7.5 | 2.5 | 6.25 |
| Manisha | 50 | 49 | 9 | 10 | -1 | 1 |
| Santosh | 52 | 65 | 7.5 | 4.5 | 3 | 9 |
| Pratibha | 60 | 58 | 6 | 6 | 0 | 0 |
| Manoj | 70 | 75 | 5 | 3 | 2 | 4 |

ΣD² = 64
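As a quick check of Steps 4 and 5, the differences and their squares can be recomputed from the revised ranks. This is only an illustrative sketch; the variable names are not from the slides.

```python
# Revised ranks from Step-3, in the order Ramesh, Pooja, Joseph, Amir, Sonu,
# Rahul, Manisha, Santosh, Pratibha, Manoj.
rank_x = [7.5, 2, 4, 2, 2, 10, 9, 7.5, 6, 5]     # Bangla
rank_y = [9, 4.5, 2, 7.5, 1, 7.5, 10, 4.5, 6, 3]  # English

# Step-4: difference of the revised ranks for each student.
d = [x - y for x, y in zip(rank_x, rank_y)]

# Step-5: square each difference and add them up.
d_squared = [value * value for value in d]
sum_d_squared = sum(d_squared)

print(d)              # [-1.5, -2.5, 2, -5.5, 1, 2.5, -1, 3.0, 0, 2]
print(sum_d_squared)  # 64.0
```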

Step-6: Obtain the values of m1, m2, m3, and m4

The mark 94 in Bangla is repeated 3 times, so m1 = 3.
The mark 52 in Bangla is repeated 2 times, so m2 = 2.
The mark 65 in English is repeated 2 times, so m3 = 2.
The mark 52 in English is repeated 2 times, so m4 = 2.
So here we have four groups of tied ranks.
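Counting the sizes of the tie groups can also be done mechanically. The sketch below is an illustration with hypothetical helper names (`tie_sizes`, `tie_correction`); it assumes the usual correction term of (m³ - m)/12 per tie group, which is consistent with the result of 0.59 reported later in the slides.

```python
from collections import Counter


def tie_sizes(marks):
    """Return the size m of every group of repeated marks, largest first."""
    return sorted((count for count in Counter(marks).values() if count > 1), reverse=True)


def tie_correction(sizes):
    """Sum of (m^3 - m) / 12 over all tie groups (assumed form of the correction)."""
    return sum((m**3 - m) / 12 for m in sizes)


bangla = [52, 94, 72, 94, 94, 29, 50, 52, 60, 70]
english = [50, 65, 90, 52, 91, 52, 49, 65, 58, 75]

sizes = tie_sizes(bangla) + tie_sizes(english)
print(sizes)                  # [3, 2, 2, 2], i.e. m1, m2, m3, m4
print(tie_correction(sizes))  # 3.5
```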

The formula for Spearman correlation with tied ranks
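The formula image from this slide is not reproduced in the text. A commonly used tie-corrected form, which is consistent with the values m1 to m4 defined in Step-6 and with the result of 0.59 reported in Step-11, is shown here as an assumption rather than as a transcription of the slide. Here k is the number of tie groups and m_i is the size of the i-th group.

```latex
r_s = 1 - \frac{6\left[\sum D^2 + \sum_{i=1}^{k} \frac{m_i^3 - m_i}{12}\right]}{n\,(n^2 - 1)}
```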

Step-7: According to the number of tied ranks, expand the equation

Step-8: Put in the values and calculate the Spearman correlation
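The worked calculation on the slide is not reproduced in the text. Assuming the common tie-corrected form in which each tie group of size m adds (m³ - m)/12 to ΣD², the expansion for four tie groups and the substitution of ΣD² = 64, n = 10, m1 = 3, and m2 = m3 = m4 = 2 would look like this:

```latex
\begin{aligned}
r_s &= 1 - \frac{6\left[\sum D^2 + \frac{m_1^3 - m_1}{12} + \frac{m_2^3 - m_2}{12} + \frac{m_3^3 - m_3}{12} + \frac{m_4^3 - m_4}{12}\right]}{n\,(n^2 - 1)} \\
    &= 1 - \frac{6\left[64 + \frac{27 - 3}{12} + \frac{8 - 2}{12} + \frac{8 - 2}{12} + \frac{8 - 2}{12}\right]}{10\,(100 - 1)} \\
    &= 1 - \frac{6\,(64 + 2 + 0.5 + 0.5 + 0.5)}{990} \\
    &= 1 - \frac{405}{990} \approx 0.59
\end{aligned}
```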

Step-9: Obtain the coefficient of Spearman correlation
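The coefficient can be reproduced with a few lines of code. This is a minimal sketch, assuming the tie correction of (m³ - m)/12 per tie group; the function name `spearman_with_ties` is hypothetical and not taken from the slides.

```python
def spearman_with_ties(sum_d_squared, n, tie_sizes):
    """Tie-corrected Spearman coefficient from the sum of squared rank differences."""
    # Each tie group of size m contributes (m^3 - m) / 12 to the numerator.
    correction = sum((m**3 - m) / 12 for m in tie_sizes)
    return 1 - 6 * (sum_d_squared + correction) / (n * (n**2 - 1))


# Case-1 values: sum of D^2 = 64, n = 10, tie groups m1..m4 = 3, 2, 2, 2.
r = spearman_with_ties(64, 10, [3, 2, 2, 2])
print(round(r, 2))  # 0.59
```

With an empty list of tie sizes the function reduces to the ordinary formula 1 - 6ΣD² / (n(n² - 1)).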

Step-10: See the table of critical values

Step-11: Interpret the result

The critical value of the Spearman correlation for n = 10 at alpha = 0.05 (p ≤ 0.05) is 0.648.
Our test value of the Spearman correlation is 0.59, which is less than the critical value.
Therefore the coefficient of Spearman correlation is not significant at the 0.05 level (p ≤ 0.05). Because it is not significant at 0.05, there is no need to check it at the 0.01 level.
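The decision rule in this step is a simple comparison of the test value with the tabulated critical value. The sketch below is illustrative only; the function name `is_significant` is hypothetical, and the critical value is the one quoted on the slide.

```python
def is_significant(r, critical_value):
    """Return True when the absolute test value reaches the tabulated critical value."""
    return abs(r) >= critical_value


r_case1 = 0.59        # test value obtained in Step-9
critical_n10 = 0.648  # critical value quoted on the slide for n = 10, alpha = 0.05

significant = is_significant(r_case1, critical_n10)
print(significant)  # False
```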

Step-12: The final conclusion

There is no significant correlation between the marks in Bangla and the marks in English. In other words, students who are good in Bangla are not necessarily good in English as well.
Therefore we fail to reject the null hypothesis, and the alternative hypothesis is not supported.
In the graph, too, the points lie far from the line of best fit, which shows that the correlation is not strong.

A story of dots that are away from the line of best fit

An alternative way of explaining the data and the graph: the X axis represents marks in Bangla and the Y axis represents marks in English. Joseph and Rahul scored more in English than in Bangla, so their points are displaced toward the Y axis. Pooja and Amir scored higher in Bangla than in English, so they are also away from the line of best fit.

[Figure: scatter plot of marks in Bangla (x) against marks in English (y) with the line of best fit; labelled points: Joseph, Rahul, Pooja, Amir]

Situation-2: The alternative way of calculating the correlation with tied ranks

Start from Step-3 of Case-1 and take the revised ranks as (x) and (y).

| Name of contestants | Revised Ranks (x)=d1 | Revised Ranks (y)=d2 |
|---|---|---|
| Ramesh | 7.5 | 9 |
| Pooja | 2 | 4.5 |
| Joseph | 4 | 2 |
| Amir | 2 | 7.5 |
| Sonu | 2 | 1 |
| Rahul | 10 | 7.5 |
| Manisha | 9 | 10 |
| Santosh | 7.5 | 4.5 |
| Pratibha | 6 | 6 |
| Manoj | 5 | 3 |

Step-3: Let's try the same correlation with the Pearson method.

| Students | X (revised ranks of marks in Bangla) | Y (revised ranks of marks in English) |
|---|---|---|
| Ramesh | 7.5 | 9 |
| Pooja | 2 | 4.5 |
| Joseph | 4 | 2 |
| Amir | 2 | 7.5 |
| Sonu | 2 | 1 |
| Rahul | 10 | 7.5 |
| Manisha | 9 | 10 |
| Santosh | 7.5 | 4.5 |
| Pratibha | 6 | 6 |
| Manoj | 5 | 3 |

X̄ = 5.5, Ȳ = 5.5
Σ(x - x̄)(y - ȳ) = 48.75
Σ(x - x̄)² = 80
Σ(y - ȳ)² = 81.5

Apply the formula of Pearson correlation
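The worked formula from this slide is not reproduced in the text. As an illustrative sketch, the standard Pearson formula r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² · Σ(y - ȳ)²) can be applied to the revised ranks as follows; the variable names are not from the slides.

```python
from math import sqrt

# Revised ranks, in the order Ramesh, Pooja, Joseph, Amir, Sonu,
# Rahul, Manisha, Santosh, Pratibha, Manoj.
rank_x = [7.5, 2, 4, 2, 2, 10, 9, 7.5, 6, 5]     # Bangla
rank_y = [9, 4.5, 2, 7.5, 1, 7.5, 10, 4.5, 6, 3]  # English

mean_x = sum(rank_x) / len(rank_x)  # 5.5
mean_y = sum(rank_y) / len(rank_y)  # 5.5

# Sums of cross-products and squared deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rank_x, rank_y))  # 48.75
s_xx = sum((x - mean_x) ** 2 for x in rank_x)                            # 80.0
s_yy = sum((y - mean_y) ** 2 for y in rank_y)                            # 81.5

r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 2))  # 0.6
```

The result, about 0.60, is close to the 0.59 obtained from the tie-corrected Spearman formula.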

Situation-3: Let's see what happens if we calculate Pearson’s correlation directly on the raw marks, using (x) = marks in Bangla and (y) = marks in English. The Pearson correlation coefficient (r = 0.48) is lower than the previous two correlation values.

| Name of contestants | Marks in Bangla (x) | Marks in English (y) |
|---|---|---|
| Ramesh | 52 | 50 |
| Pooja | 94 | 65 |
| Joseph | 72 | 90 |
| Amir | 94 | 52 |
| Sonu | 94 | 91 |
| Rahul | 29 | 52 |
| Manisha | 50 | 49 |
| Santosh | 52 | 65 |
| Pratibha | 60 | 58 |
| Manoj | 70 | 75 |
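The Pearson coefficient on the raw marks can be recomputed with a small helper. This is a sketch for checking the figure, with a hypothetical function name `pearson`; it gives roughly 0.486, in line with the value of about 0.48 quoted above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Raw marks from Case-1, in the order Ramesh, Pooja, Joseph, Amir, Sonu,
# Rahul, Manisha, Santosh, Pratibha, Manoj.
bangla = [52, 94, 72, 94, 94, 29, 50, 52, 60, 70]
english = [50, 65, 90, 52, 91, 52, 49, 65, 58, 75]

r = pearson(bangla, english)
print(round(r, 3))
```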

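Situation-4: Let's see what happens if the only tied rank is (52 and 52) in the marks of Bangla.

| Name of contestants | Marks in Bangla (x) | Marks in English (y) | Ranks (x)=d1 | Revised Ranks (x)=d1 | Ranks (y)=d2 | D = (revised) (d1 - d2) | D² |
|---|---|---|---|---|---|---|---|
| Ramesh | 52 | 50 | 8 | 7.5 | 9 | -1.5 | 2.25 |
| Pooja | 94 | 65 | 1 | 1 | 5 | -4 | 16 |
| Joseph | 72 | 90 | 4 | 4 | 2 | 2 | 4 |
| Amir | 95 | 52 | 2 | 2 | 8 | -6 | 36 |
| Sonu | 96 | 91 | 3 | 3 | 1 | 2 | 4 |
| Rahul | 29 | 53 | 10 | 10 | 7 | 3 | 9 |
| Manisha | 50 | 49 | 9 | 9 | 10 | -1 | 1 |
| Santosh | 52 | 66 | 7 | 7.5 | 4 | 3.5 | 12.25 |
| Pratibha | 60 | 58 | 6 | 6 | 6 | 0 | 0 |
| Manoj | 70 | 75 | 5 | 5 | 3 | 2 | 4 |

ΣD² = 88.5

Note: the Bangla ranks of Pooja, Amir, and Sonu are carried over from Case-1. Ranking the altered marks strictly (96 > 95 > 94) would make Sonu 1 and Pooja 3 and give ΣD² = 72.5; the coefficient would still be lower than in Case-1.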

What does the result say?

The correlation coefficient is lower than in the previous situations. Reducing the number of tied ranks did not strengthen the correlation.
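The slide does not state the coefficient for this situation. Assuming the same tie-corrected form as in Case-1, with ΣD² = 88.5 from the table, n = 10, and a single tie group of size m = 2, the value works out as follows:

```latex
r_s = 1 - \frac{6\left[88.5 + \frac{2^3 - 2}{12}\right]}{10\,(100 - 1)}
    = 1 - \frac{6 \times 89}{990}
    = 1 - \frac{534}{990} \approx 0.46
```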

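Situation-5: Let's see what happens if we remove all the repeated ranks. The calculated value (r ≈ 0.49, about 0.50) shows that removing all the tied ranks does not change the overall picture.

| Name of contestants | Marks in Bangla (x) | Marks in English (y) | Ranks (x)=d1 | Ranks (y)=d2 | D = (d1 - d2) | D² |
|---|---|---|---|---|---|---|
| Ramesh | 53 | 50 | 8 | 9 | -1 | 1 |
| Pooja | 94 | 65 | 1 | 5 | -4 | 16 |
| Joseph | 72 | 90 | 4 | 2 | 2 | 4 |
| Amir | 95 | 52 | 2 | 8 | -6 | 36 |
| Sonu | 96 | 91 | 3 | 1 | 2 | 4 |
| Rahul | 29 | 53 | 10 | 7 | 3 | 9 |
| Manisha | 50 | 49 | 9 | 10 | -1 | 1 |
| Santosh | 52 | 66 | 7 | 4 | 3 | 9 |
| Pratibha | 60 | 58 | 6 | 6 | 0 | 0 |
| Manoj | 70 | 75 | 5 | 3 | 2 | 4 |

ΣD² = 84

Note: the Bangla ranks are carried over from Case-1. Ranking the altered marks strictly (96 > 95 > 94, and 53 > 52) would make Sonu 1, Pooja 3, Ramesh 7, and Santosh 8, giving ΣD² = 78; the coefficient would still be below the Case-1 value.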
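With no tied ranks, the ordinary Spearman formula applies without any correction term. Using ΣD² = 84 from the table and n = 10, the calculation can be written out as:

```latex
r_s = 1 - \frac{6 \sum D^2}{n\,(n^2 - 1)}
    = 1 - \frac{6 \times 84}{10\,(100 - 1)}
    = 1 - \frac{504}{990} \approx 0.49
```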

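A real case study based on true data

There are two rankings of the SAARC countries: the HDI ranking conducted by UNDP and the corruption rating done by Transparency International in 2018 among 176 countries. Because the inputs are already rankings, we rank them again within the eight countries.

| Countries | HDI ranking (x) | TI ranking (y) | Ranking of ranking (x) d1 | Ranking of ranking (y) d2 | D = d1 - d2 | D² |
|---|---|---|---|---|---|---|
| India | 129 | 79 | 6 | 7 | -1 | 1 |
| Nepal | 147 | 131 | 3 | 3 | 0 | 0 |
| Bhutan | 134 | 27 | 5 | 8 | -3 | 9 |
| Bangladesh | 135 | 145 | 4 | 2 | 2 | 4 |
| Pakistan | 152 | 116 | 2 | 4 | -2 | 4 |
| Afghanistan | 170 | 169 | 1 | 1 | 0 | 0 |
| Maldives | 104 | 95 | 7 | 5 | 2 | 4 |
| Sri Lanka | 71 | 96 | 8 | 6 | 2 | 4 |

ΣD² = 26

Note: by the TI figures as listed, Sri Lanka (96) should be ranked 5 and Maldives (95) ranked 6. That would give ΣD² = 28 and r ≈ 0.67, which does not change the conclusion.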

What does the result say?

The calculated value (r ≈ 0.69, about 0.70) is less than the critical value of Spearman’s correlation at the 0.05 level (p ≤ 0.05) for n = 8, which is 0.73. The correlation is therefore not statistically significant, although it comes close to significance. There is no statistical significance, but there may be practical significance: in this sample, countries with better HDI rankings tend to have better corruption rankings. This suggests a positive association between honesty and development.
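The coefficient for the SAARC example follows from the ordinary Spearman formula, since there are no tied ranks. The sketch below uses ΣD² = 26 and n = 8 from the table; the function name `spearman_from_d2` is hypothetical.

```python
def spearman_from_d2(sum_d_squared, n):
    """Spearman coefficient without ties: 1 - 6 * sum(D^2) / (n * (n^2 - 1))."""
    return 1 - 6 * sum_d_squared / (n * (n**2 - 1))


r_saarc = spearman_from_d2(26, 8)
critical_n8 = 0.73  # critical value quoted on the slide for n = 8, alpha = 0.05

print(round(r_saarc, 2))           # 0.69
print(abs(r_saarc) >= critical_n8)  # False
```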

What does the graph say?

The data points are a little far from the line of best fit, but they show a rising trend in the positive direction. There is a chance that the points could come closer to the line.

Thanks for your kind attention. Keep learning. In the next session, we will learn about partial correlation.