correlation and regression

16,612 views 51 slides Oct 17, 2019


Slide Content

1
Correlation and
Regression
BY UNSA SHAKIR

2
Correlation and Regression
Correlation describes the strength of a linear relationship between two variables.
Regression tells us how to draw the straight line described by the correlation.

3
Correlation and Regression
• For example: a sociologist may be interested in the relationship between education and self-esteem, or between income and number of children in a family.
Independent Variables: Education, Family Income
Dependent Variables: Self-Esteem, Number of Children

4
Correlation and Regression
• For example:
• May expect: as education increases, self-esteem increases (positive relationship).
• May expect: as family income increases, the number of children in families declines (negative relationship).
Independent Variable -> Dependent Variable (expected sign)
Education -> Self-Esteem (+)
Family Income -> Number of Children (-)

5
Correlation

6
Correlation
• Correlation is a statistical technique used to determine the degree to which two variables are related.
• A correlation is a relationship between two variables. The data can be represented by the ordered pairs (x, y), where x is the independent (or explanatory) variable and y is the dependent (or response) variable.

7
Correlation
A scatter plot can be used to determine whether a linear (straight line) correlation exists between two variables.
Example:
x    1    2    3    4    5
y    –4   –2   –1   0    2
[Figure: scatter plot of the five (x, y) points]

8
Linear Correlation
[Figure: four scatter plots of y against x]
Negative Linear Correlation: as x increases, y tends to decrease.
Positive Linear Correlation: as x increases, y tends to increase.
No Correlation
Nonlinear Correlation

9
Correlation Coefficient
• It is also called Pearson's correlation or the product-moment correlation coefficient.
• The correlation coefficient is a measure of the strength and the direction of a linear relationship between two variables. The symbol r represents the sample correlation coefficient. The formula for r is
$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} $$

10
The sign of r denotes the nature of the association, while the magnitude of r denotes the strength of the association.

11
If the sign is positive, the relationship is direct: an increase in one variable is associated with an increase in the other variable, and a decrease in one variable is associated with a decrease in the other variable.
If the sign is negative, the relationship is inverse (indirect): an increase in one variable is associated with a decrease in the other.

12
The value of r ranges between –1 and +1.
The value of r denotes the strength of the association, as illustrated by the following diagram.
[Diagram: scale of r from –1 (indirect) to +1 (direct)]
r = –1: perfect correlation
–1 to –0.75: strong
–0.75 to –0.25: intermediate
–0.25 to 0: weak
r = 0: no relation
0 to 0.25: weak
0.25 to 0.75: intermediate
0.75 to 1: strong
r = +1: perfect correlation

13
If r = 0: no association or correlation between the two variables.
If 0 < r < 0.25: weak correlation.
If 0.25 ≤ r < 0.75: intermediate correlation.
If 0.75 ≤ r < 1: strong correlation.
If r = 1: perfect correlation.
For negative r, the same bands apply to the absolute value |r|.
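The bands above can be turned into a small lookup. This is only an illustrative sketch: the function name `describe_strength` is hypothetical, and applying the thresholds to |r| for negative values is an assumption based on the symmetric scale on the previous slide.

```python
def describe_strength(r: float) -> str:
    """Label a correlation coefficient using the thresholds from the slide."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # assumption: negative r uses the same bands as positive r
    if size == 0:
        return "no correlation"
    if size < 0.25:
        strength = "weak"
    elif size < 0.75:
        strength = "intermediate"
    elif size < 1:
        strength = "strong"
    else:
        strength = "perfect"
    direction = "direct" if r > 0 else "indirect"
    return f"{strength} {direction} correlation"


if __name__ == "__main__":
    for value in (0.0, 0.1, -0.5, 0.759, -0.94, 1.0):
        print(value, "->", describe_strength(value))
```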

14
Linear Correlation
[Figure: four scatter plots of y against x]
Strong negative correlation: r = –0.91
Strong positive correlation: r = 0.88
Weak positive correlation: r = 0.42
Nonlinear Correlation: r = 0.07

15
Calculating a Correlation Coefficient
$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} $$
In Words (In Symbols)
1. Find the sum of the x-values. ($\sum x$)
2. Find the sum of the y-values. ($\sum y$)
3. Multiply each x-value by its corresponding y-value and find the sum. ($\sum xy$)

16
Calculating a Correlation Coefficient
In Words (In Symbols)
4. Square each x-value and find the sum. ($\sum x^2$)
5. Square each y-value and find the sum. ($\sum y^2$)
6. Use these five sums to calculate the correlation coefficient.
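The six steps can be sketched in a few lines of Python. The function names `five_sums` and `pearson_r` are hypothetical, chosen for illustration; the sample data is the five-point data set from the worked example that follows in the deck.

```python
from math import sqrt


def five_sums(xs, ys):
    """Steps 1-5: return (sum x, sum y, sum xy, sum x^2, sum y^2)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return sum_x, sum_y, sum_xy, sum_x2, sum_y2


def pearson_r(xs, ys):
    """Step 6: combine the five sums into the correlation coefficient r."""
    n = len(xs)
    sum_x, sum_y, sum_xy, sum_x2, sum_y2 = five_sums(xs, ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    if denominator == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5]
    ys = [-3, -1, 0, 1, 2]
    print(five_sums(xs, ys))            # (15, -1, 9, 55, 15)
    print(round(pearson_r(xs, ys), 3))  # 0.986
```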

17
Correlation Coefficient
Example:
Calculate the correlation coefficient r for the following data.
x    y     xy    x²    y²
1    –3    –3    1     9
2    –1    –2    4     1
3    0     0     9     0
4    1     4     16    1
5    2     10    25    4
$\sum x = 15$, $\sum y = -1$, $\sum xy = 9$, $\sum x^2 = 55$, $\sum y^2 = 15$

18
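Correlation Coefficient
Example:
Calculate the correlation coefficient r for the following data.
$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} = \frac{5(9) - (15)(-1)}{\sqrt{5(55) - 15^2}\,\sqrt{5(15) - (-1)^2}} = \frac{60}{\sqrt{50}\,\sqrt{74}} \approx 0.986 $$
There is a strong positive linear correlation between x and y.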

19
Correlation Coefficient
Example:
The following data represent the number of hours 12 different students watched television during the weekend and the scores of each student who took a test the following Monday.
a.) Display the scatter plot.
b.) Calculate the correlation coefficient r.
Hours, x         0    1    2    3    3    5    5    5    6    7    7    10
Test score, y    96   85   82   74   95   68   76   84   58   65   75   50

20
Correlation Coefficient
[Figure: scatter plot of test score, y (axis 20 to 100), against hours watching TV, x (axis 2 to 10), for the 12 students]

21
Correlation Coefficient
Example continued:
Hours, x         0      1      2      3      3      5      5      5      6      7      7      10
Test score, y    96     85     82     74     95     68     76     84     58     65     75     50
xy               0      85     164    222    285    340    380    420    348    455    525    500
x²               0      1      4      9      9      25     25     25     36     49     49     100
y²               9216   7225   6724   5476   9025   4624   5776   7056   3364   4225   5625   2500
$\sum x = 54$, $\sum y = 908$, $\sum xy = 3724$, $\sum x^2 = 332$, $\sum y^2 = 70836$

22
Correlation Coefficient
Example continued:
$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} = \frac{12(3724) - (54)(908)}{\sqrt{12(332) - 54^2}\,\sqrt{12(70836) - 908^2}} \approx -0.831 $$
• There is a strong negative linear correlation.
• As the number of hours spent watching TV increases, the test scores tend to decrease.
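The substitution above skips the intermediate arithmetic. The following sketch recomputes it from the raw data so each term can be checked; the variable names are illustrative and not taken from the slides.

```python
from math import sqrt

hours = [0, 1, 2, 3, 3, 5, 5, 5, 6, 7, 7, 10]
scores = [96, 85, 82, 74, 95, 68, 76, 84, 58, 65, 75, 50]

n = len(hours)
sum_x = sum(hours)                                  # 54
sum_y = sum(scores)                                 # 908
sum_xy = sum(x * y for x, y in zip(hours, scores))  # 3724
sum_x2 = sum(x * x for x in hours)                  # 332
sum_y2 = sum(y * y for y in scores)                 # 70836

numerator = n * sum_xy - sum_x * sum_y              # -4344
x_term = n * sum_x2 - sum_x ** 2                    # 1068
y_term = n * sum_y2 - sum_y ** 2                    # 25568
r = numerator / (sqrt(x_term) * sqrt(y_term))

print(numerator, x_term, y_term)
print(round(r, 3))                                  # -0.831
```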

23
Example:
A sample of 6 children was selected, and data about their age in years and weight in kilograms was recorded as shown in the following table. It is required to find the correlation between age and weight.
Weight (kg)    Age (years)    Serial no.
12             7              1
8              6              2
12             8              3
10             5              4
11             6              5
13             9              6

24
y²     x²     xy     Weight (kg) (y)    Age (years) (x)    Serial no.
144    49     84     12                 7                  1
64     36     48     8                  6                  2
144    64     96     12                 8                  3
100    25     50     10                 5                  4
121    36     66     11                 6                  5
169    81     117    13                 9                  6
742    291    461    66                 41                 Total
$\sum y^2 = 742$, $\sum x^2 = 291$, $\sum xy = 461$, $\sum y = 66$, $\sum x = 41$

25
$$ r = \frac{461 - \frac{41 \times 66}{6}}{\sqrt{\left(291 - \frac{(41)^2}{6}\right)\left(742 - \frac{(66)^2}{6}\right)}} $$
r = 0.759
Strong direct correlation
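This example divides each product of sums by n instead of multiplying the other terms by n, which is algebraically the same formula. A minimal sketch of that form, with illustrative variable names, is given here. It evaluates to roughly 0.7596, which the slide reports as 0.759.

```python
from math import sqrt

ages = [7, 6, 8, 5, 6, 9]            # x, years
weights = [12, 8, 12, 10, 11, 13]    # y, kg
n = len(ages)

# The three terms of the formula: each raw sum minus its correction term.
s_xy = sum(x * y for x, y in zip(ages, weights)) - sum(ages) * sum(weights) / n
s_xx = sum(x * x for x in ages) - sum(ages) ** 2 / n
s_yy = sum(y * y for y in weights) - sum(weights) ** 2 / n

r = s_xy / sqrt(s_xx * s_yy)

print(round(s_xy, 3), round(s_xx, 3), round(s_yy, 3))  # 10.0 10.833 16.0
print(round(r, 2))                                     # 0.76
```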

26
EXAMPLE: Relationship between Anxiety and Test Scores
Anxiety (X)    Test score (Y)    X²     Y²     XY
10             2                 100    4      20
8              3                 64     9      24
2              9                 4      81     18
1              7                 1      49     7
5              6                 25     36     30
6              5                 36     25     30
$\sum X = 32$, $\sum Y = 32$, $\sum X^2 = 230$, $\sum Y^2 = 204$, $\sum XY = 129$

27
Calculating Correlation Coefficient
$$ r = \frac{(6)(129) - (32)(32)}{\sqrt{\left(6(230) - 32^2\right)\left(6(204) - 32^2\right)}} = \frac{774 - 1024}{\sqrt{(356)(200)}} = -0.94 $$
r = –0.94
Indirect strong correlation

28
Example
Tree Height (y)    Trunk Diameter (x)    xy      y²       x²
35                 8                     280     1225     64
49                 9                     441     2401     81
27                 7                     189     729      49
33                 6                     198     1089     36
60                 13                    780     3600     169
21                 7                     147     441      49
45                 11                    495     2025     121
51                 12                    612     2601     144
Σ = 321            Σ = 73                Σ = 3142    Σ = 14111    Σ = 713

29
Example
[Figure: scatter plot of tree height, y (axis 0 to 70), against trunk diameter, x (axis 0 to 14)]
• r = 0.886 → relatively strong positive linear association between x and y

30

31
Regression

32
Regression Analyses
• Regression technique is concerned with predicting some variables by knowing others.
• It is the process of predicting variable Y using variable X.

33
Types of Regression Models
Positive Linear Relationship
Negative Linear Relationship
Relationship NOT Linear
No Relationship

34
Regression
Uses a variable (x) to predict some outcome variable (y).
Tells you how values in y change as a function of changes in values of x.

35
The regression line makes the sum of the squares of the residuals smaller than for any other line.
Regression minimizes residuals.
[Figure: scatter plot of SBP (mmHg), axis 80 to 220, against Wt (kg), axis 60 to 120]

36
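By using the least squares method (a procedure that minimizes the sum of the squared vertical deviations of the plotted points from a straight line) we are able to construct a best-fitting straight line through the scatter diagram points and then formulate a regression equation in the form of:
$$ \hat{y} = a + bX $$
$$ \hat{y} = \bar{y} + b\,(x - \bar{x}) $$
$$ b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{\left(\sum x\right)^2}{n}} $$
The regression equation describes the regression line mathematically by showing the intercept and the slope.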
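A minimal sketch of the least squares fit described above. The helper names `fit_line` and `sum_squared_residuals` are hypothetical. The data is borrowed from the earlier tree height and trunk diameter example purely for illustration; the slides themselves do not fit a line to it. The intercept uses a = ȳ - b·x̄, which follows from equating the two forms of the regression equation.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    if s_xx == 0:
        raise ValueError("slope is undefined when all x-values are equal")
    b = s_xy / s_xx                  # slope
    a = sum_y / n - b * sum_x / n    # intercept: a = y_bar - b * x_bar
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


if __name__ == "__main__":
    diameters = [8, 9, 7, 6, 13, 7, 11, 12]      # x, trunk diameter
    heights = [35, 49, 27, 33, 60, 21, 45, 51]   # y, tree height
    a, b = fit_line(diameters, heights)
    best = sum_squared_residuals(diameters, heights, a, b)
    print(round(a, 3), round(b, 3))
    # Nudging the intercept or the slope should never shrink the residual sum.
    print(best <= sum_squared_residuals(diameters, heights, a + 0.1, b))  # True
    print(best <= sum_squared_residuals(diameters, heights, a, b - 0.1))  # True
```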

37
Correlation and Regression
• The statistics equation for a line:
$\hat{Y} = a + bX$
Where:
$\hat{Y}$ = the line's position on the vertical axis at any point (estimated value of the dependent variable)
X = the line's position on the horizontal axis at any point (value of the independent variable for which you want an estimate of Y)
b = the slope of the line (called the coefficient)
a = the intercept with the Y axis, where X equals zero

38
Linear Equations
Y = bX + a
a = Y-intercept
b = Slope = (Change in Y) / (Change in X)
[Figure: straight line on X and Y axes illustrating the intercept and the slope]

39
Exercise
A sample of 6 persons was selected; their age (x variable) and their weight are shown in the following table. Find the regression equation, and find the predicted weight when age is 8.5 years.
Weight (y)    Age (x)    Serial no.
12            7          1
8             6          2
12            8          3
10            5          4
11            6          5
13            9          6

40
Answer
y²     x²     xy     Weight (y)    Age (x)    Serial no.
144    49     84     12            7          1
64     36     48     8             6          2
144    64     96     12            8          3
100    25     50     10            5          4
121    36     66     11            6          5
169    81     117    13            9          6
742    291    461    66            41         Total

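41
$$ \bar{x} = \frac{41}{6} = 6.83 \qquad \bar{y} = \frac{66}{6} = 11 $$
$$ b = \frac{461 - \frac{41 \times 66}{6}}{291 - \frac{(41)^2}{6}} = 0.92 $$
Regression equation:
$$ \hat{y}_{(x)} = 11 + 0.92\,(x - 6.83) $$

42
$$ \hat{y}_{(x)} = 4.675 + 0.92x $$
$$ \hat{y}_{(8.5)} = 4.675 + 0.92 \times 8.5 = 12.50 \text{ kg} $$
$$ \hat{y}_{(7.5)} = 4.675 + 0.92 \times 7.5 = 11.58 \text{ kg} $$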
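The same calculation can be sketched in code without rounding the intermediate values. Names such as `predict_weight` are illustrative. Carried at full precision, the intercept comes out near 4.69 rather than the slide's 4.675, so the predictions differ from the slide's figures in the second decimal place.

```python
ages = [7, 6, 8, 5, 6, 9]            # x, years
weights = [12, 8, 12, 10, 11, 13]    # y, kg
n = len(ages)

x_bar = sum(ages) / n        # 41 / 6
y_bar = sum(weights) / n     # 66 / 6

s_xy = sum(x * y for x, y in zip(ages, weights)) - sum(ages) * sum(weights) / n
s_xx = sum(x * x for x in ages) - sum(ages) ** 2 / n

b = s_xy / s_xx              # slope
a = y_bar - b * x_bar        # intercept


def predict_weight(age):
    """Predicted weight in kg from the fitted line y_hat = a + b * age."""
    return a + b * age


print(round(b, 3), round(a, 3))
for age in (8.5, 7.5):
    print(age, round(predict_weight(age), 2))
```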

43
We create a regression line by plotting two estimated values for y against their x components, then extending the line right and left.

44
Regression Line
Example:
a.) Find the equation of the regression line.
b.) Use the equation to find the expected value when the value of x is 2.3.
x    y     xy    x²    y²
1    –3    –3    1     9
2    –1    –2    4     1
3    0     0     9     0
4    1     4     16    1
5    2     10    25    4
$\sum x = 15$, $\sum y = -1$, $\sum xy = 9$, $\sum x^2 = 55$, $\sum y^2 = 15$
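One way to work both parts from the table is sketched here. The intercept formula (mean of y minus slope times mean of x) is the standard least squares result and is an addition for illustration, as are the variable names; the value computed for x = 2.3 is not stated in the slides.

```python
xs = [1, 2, 3, 4, 5]
ys = [-3, -1, 0, 1, 2]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Part a: slope and intercept of the regression line.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = sum_y / n - slope * sum_x / n


def expected_y(x):
    """Part b: value on the regression line at x."""
    return slope * x + intercept


print(round(slope, 2), round(intercept, 2))   # 1.2 -3.8
print(round(expected_y(2.3), 2))              # -1.04
```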

45
Regression Line
[Figure: scatter plot of the five (x, y) points]
$$ m = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} = \frac{5(9) - (15)(-1)}{5(55) - 15^2} = \frac{60}{50} = 1.2 $$

46
Regression Line
Example:
The following data represent the number of hours 12 different students watched television during the weekend and the scores of each student who took a test the following Monday.
a.) Find the equation of the regression line.
b.) Use the equation to find the expected test score for a student who watches 9 hours of TV.

47
Regression Line
Hours, x         0      1      2      3      3      5      5      5      6      7      7      10
Test score, y    96     85     82     74     95     68     76     84     58     65     75     50
xy               0      85     164    222    285    340    380    420    348    455    525    500
x²               0      1      4      9      9      25     25     25     36     49     49     100
y²               9216   7225   6724   5476   9025   4624   5776   7056   3364   4225   5625   2500
$\sum x = 54$, $\sum y = 908$, $\sum xy = 3724$, $\sum x^2 = 332$, $\sum y^2 = 70836$
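The slides stop at the table of sums for this example. A sketch of how parts a and b could be finished from those sums follows; the variable names are illustrative, and the resulting numbers are computed here rather than quoted from the deck.

```python
n = 12
sum_x, sum_y = 54, 908          # totals from the table
sum_xy, sum_x2 = 3724, 332

# Part a: slope and intercept of the regression line.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = sum_y / n - slope * sum_x / n

# Part b: expected test score after 9 hours of TV.
predicted = intercept + slope * 9

print(round(slope, 3), round(intercept, 3))   # -4.067 93.97
print(round(predicted, 1))                    # 57.4
```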

48
Exercise
• Find the correlation between age and blood pressure using simple and Spearman's correlation coefficients, and comment.
• Find the regression equation.
• What is the predicted blood pressure for a man aged 25 years?

49
x²      xy       y      x     Serial
400     2400     120    20    1
1849    5504     128    43    2
3969    8883     141    63    3
676     3276     126    26    4
2809    7102     134    53    5
961     3968     128    31    6
3364    7888     136    58    7
2116    6072     132    46    8
3364    8120     140    58    9
4900    10080    144    70    10

50
x²       xy        y       x      Serial
2116     5888      128     46     11
2809     7208      136     53     12
3600     8760      146     60     13
400      2480      124     20     14
3969     9009      143     63     15
1849     5590      130     43     16
676      3224      124     26     17
361      2299      121     19     18
961      3906      126     31     19
529      2829      123     23     20
41678    114486    2630    852    Total
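The first part of the exercise can be sketched as follows, taking x as age and y as blood pressure. The helper names are hypothetical. Spearman's coefficient is computed here as Pearson's r on the ranks, with tied values given the average of their ranks; this is one common convention and an assumption, since the slides do not say how ties should be handled. A hand calculation with the shortcut formula based on squared rank differences may give a slightly different value when ties are present.

```python
from math import sqrt

ages = [20, 43, 63, 26, 53, 31, 58, 46, 58, 70,
        46, 53, 60, 20, 63, 43, 26, 19, 31, 23]
pressures = [120, 128, 141, 126, 134, 128, 136, 132, 140, 144,
             128, 136, 146, 124, 143, 130, 124, 121, 126, 123]


def pearson_r(xs, ys):
    """Simple (Pearson) correlation coefficient."""
    n = len(xs)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return s_xy / sqrt(s_xx * s_yy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1   # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's coefficient as Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


print(round(pearson_r(ages, pressures), 3))
print(round(spearman_rho(ages, pressures), 3))
```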

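51
$$ b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{\left(\sum x\right)^2}{n}} = \frac{114486 - \frac{852 \times 2630}{20}}{41678 - \frac{(852)^2}{20}} = 0.4547 $$
$$ \hat{y} = 112.13 + 0.4547x $$
For age 25:
B.P. = 112.13 + 0.4547 × 25 = 123.49 ≈ 123.5 mmHg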
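The slope, intercept, and prediction can be checked from the table totals with a short sketch. The names are illustrative. Note that at full precision the slope rounds to 0.4548, where the slide shows 0.4547.

```python
n = 20
sum_x, sum_y = 852, 2630           # totals of age and blood pressure
sum_xy, sum_x2 = 114486, 41678     # totals of xy and x squared

b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)   # slope
a = sum_y / n - b * sum_x / n                                   # intercept


def predicted_bp(age):
    """Predicted blood pressure (mmHg) from y_hat = a + b * age."""
    return a + b * age


print(round(b, 4), round(a, 2))     # 0.4548 112.13
print(round(predicted_bp(25), 1))   # 123.5
```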