Correlation and Regression
Correlation describes the strength of a linear relationship between two variables.
Regression tells us how to draw the straight line described by the correlation.
• For example:
A sociologist may be interested in the relationship between education and self-esteem, or between income and number of children in a family.
Independent Variables
Education
Family Income
Dependent Variables
Self-Esteem
Number of Children
• May expect: As education increases, self-esteem increases (positive relationship).
• May expect: As family income increases, the number of children in families declines (negative relationship).
Education → Self-Esteem (+)
Family Income → Number of Children (−)
Correlation
A scatter plot can be used to determine whether a linear (straight line) correlation exists between two variables.
Example:
x     1    2    3    4    5
y    −4   −2   −1    0    2
[Figure: scatter plot of the (x, y) pairs above]
Linear Correlation
[Figure: four scatter plots of y against x]
Negative Linear Correlation: as x increases, y tends to decrease.
Positive Linear Correlation: as x increases, y tends to increase.
No Correlation
Nonlinear Correlation
Correlation Coefficient
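• It is also called Pearson's correlation or the product-moment correlation coefficient.
• The correlation coefficient is a measure of the strength and the direction of a linear relationship between two variables. The symbol r represents the sample correlation coefficient. The formula for r is

$$
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}
$$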
The sign of r denotes the nature of the association, while the magnitude of r denotes the strength of the association.
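The value of r ranges between −1 and +1.
The value of r denotes the strength of the association, as illustrated by the following scale.

Value of r          Strength of association
−1                  perfect correlation (indirect)
−1 to −0.75         strong
−0.75 to −0.25      intermediate
−0.25 to 0          weak
0                   no relation
0 to 0.25           weak
0.25 to 0.75        intermediate
0.75 to 1           strong
+1                  perfect correlation (direct)

Negative values of r indicate an indirect relationship; positive values indicate a direct relationship.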
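The scale can be turned into a small lookup, sketched here in Python. The function name `describe_r` is hypothetical, and the treatment of the exact boundary values is an assumption: the diagram does not say which band 0.25 and 0.75 belong to, so this sketch assigns them to the stronger band.

```python
def describe_r(r: float) -> str:
    """Label r using the strength scale from the slide (boundary handling is assumed)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)
    if size == 0:
        return "no relation"

    # The sign gives the nature of the association.
    direction = "direct" if r > 0 else "indirect"

    # The magnitude gives the strength of the association.
    if size == 1:
        strength = "perfect"
    elif size >= 0.75:      # assumption: 0.75 itself counts as strong
        strength = "strong"
    elif size >= 0.25:      # assumption: 0.25 itself counts as intermediate
        strength = "intermediate"
    else:
        strength = "weak"
    return f"{strength} {direction}"


print(describe_r(-0.9))   # strong indirect
print(describe_r(0.1))    # weak direct
```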
Linear Correlation
[Figure: four scatter plots of y against x]
Strong negative correlation: r = −0.91
Strong positive correlation: r = 0.88
Weak positive correlation: r = 0.42
Nonlinear correlation: r = 0.07
Calculating a Correlation Coefficient
$$
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}
$$

In Words                                                                   In Symbols
1. Find the sum of the x-values.                                           ∑x
2. Find the sum of the y-values.                                           ∑y
3. Multiply each x-value by its corresponding y-value and find the sum.    ∑xy
4. Square each x-value and find the sum.                                   ∑x²
5. Square each y-value and find the sum.                                   ∑y²
6. Use these five sums to calculate the correlation coefficient.           r
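The six steps can be sketched in Python as follows. The function name `pearson_r` and the parameter names `xs` and `ys` are illustrative choices, not part of the slides; the sample call reuses the five-point data set from the scatter plot example earlier in the deck.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient r, computed from the five sums."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)

    sum_x = sum(xs)                                # step 1: sum of x-values
    sum_y = sum(ys)                                # step 2: sum of y-values
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # step 3: sum of products
    sum_x2 = sum(x * x for x in xs)                # step 4: sum of squared x
    sum_y2 = sum(y * y for y in ys)                # step 5: sum of squared y

    # Step 6: combine the five sums.
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    if denominator == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator


r = pearson_r([1, 2, 3, 4, 5], [-4, -2, -1, 0, 2])
print(round(r, 3))  # prints 0.99
```

The guard on a zero denominator is an added safety check: when every x-value (or every y-value) is the same, the formula divides by zero and r has no meaning.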
Correlation Coefficient
Example:
Calculate the correlation coefficient r for the following data.

x     y     xy    x²    y²
1    −3    −3     1     9
2    −1    −2     4     1
3     0     0     9     0
4     1     4    16     1
5     2    10    25     4

∑x = 15,  ∑y = −1,  ∑xy = 9,  ∑x² = 55,  ∑y² = 15
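$$
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}
  = \frac{5(9) - (15)(-1)}{\sqrt{5(55) - 15^2}\,\sqrt{5(15) - (-1)^2}}
  = \frac{60}{\sqrt{50}\,\sqrt{74}}
  \approx 0.986
$$

There is a strong positive linear correlation between x and y.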
Correlation Coefficient
Example:
The following data represent the number of hours 12 different students watched television during the weekend and the scores of each student who took a test the following Monday.
a.) Display the scatter plot.
b.) Calculate the correlation coefficient r.

Hours, x         0    1    2    3    3    5    5    5    6    7    7   10
Test score, y   96   85   82   74   95   68   76   84   58   65   75   50
[Figure: scatter plot of test score (y-axis, 0 to 100) against hours watching TV (x-axis, 0 to 10) for the 12 students]
Example continued:
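$$
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}
  = \frac{12(3724) - (54)(908)}{\sqrt{12(332) - 54^2}\,\sqrt{12(70836) - 908^2}}
  \approx -0.831
$$

• There is a strong negative linear correlation.
• As the number of hours spent watching TV increases, the test scores tend to decrease.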
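As a check on the arithmetic, the sums that appear in the substitution can be recomputed from the raw data. This is a minimal Python sketch; the variable names (`hours`, `scores`, `sums`) are illustrative and not taken from the slides.

```python
from math import sqrt

# Data from the TV example: hours watched (x) and test score (y) for 12 students.
hours = [0, 1, 2, 3, 3, 5, 5, 5, 6, 7, 7, 10]
scores = [96, 85, 82, 74, 95, 68, 76, 84, 58, 65, 75, 50]

n = len(hours)
sums = {
    "x": sum(hours),
    "y": sum(scores),
    "xy": sum(x * y for x, y in zip(hours, scores)),
    "x2": sum(x * x for x in hours),
    "y2": sum(y * y for y in scores),
}

# Substitute the five sums into the formula for r.
numerator = n * sums["xy"] - sums["x"] * sums["y"]
denominator = sqrt(n * sums["x2"] - sums["x"] ** 2) * sqrt(n * sums["y2"] - sums["y"] ** 2)
r = numerator / denominator

print(sums)         # {'x': 54, 'y': 908, 'xy': 3724, 'x2': 332, 'y2': 70836}
print(round(r, 3))  # prints -0.831
```

The numerator works out to 44688 − 49032 = −4344, which is where the negative sign of r comes from.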
Example:
A sample of 6 children was selected, and data about their age in years and weight in kilograms were recorded as shown in the following table. It is required to find the correlation between age and weight.

Serial no.   Age (years)   Weight (kg)
1            7             12
2            6             8
3            8             12
4            5             10
5            6             11
6            9             13
Serial no.   Age (years) (x)   Weight (kg) (y)   xy          x²          y²
1            7                 12                84          49          144
2            6                 8                 48          36          64
3            8                 12                96          64          144
4            5                 10                50          25          100
5            6                 11                66          36          121
6            9                 13                117         81          169
Total        ∑x = 41           ∑y = 66           ∑xy = 461   ∑x² = 291   ∑y² = 742
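Substituting these totals into the formula for r, with n = 6, gives the following worked computation. It is a sketch that uses only the sums listed in the table.

$$
r = \frac{6(461) - (41)(66)}{\sqrt{6(291) - 41^2}\,\sqrt{6(742) - 66^2}}
  = \frac{2766 - 2706}{\sqrt{1746 - 1681}\,\sqrt{4452 - 4356}}
  = \frac{60}{\sqrt{65}\,\sqrt{96}}
  \approx 0.76
$$

On the strength scale given earlier, this value suggests a strong direct correlation between age and weight, although it sits only just above the 0.75 boundary.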