Statistics and Probability Correlation and Regression

MathewBuera 4 views 41 slides Feb 27, 2025

About This Presentation

Statistics Correlation


Slide Content

Larson & Farber, Elementary Statistics: Picturing the World, 3e 1
Correlation and Linear Regression
Quarter 4
Week 5

Describe the relationship between the two pictures.
[Five picture pairs, one per slide, each shown with this prompt]

Correlation

Correlation
A correlation is a relationship between two variables.
The data can be represented by the ordered pairs (x, y)
where x is the independent variable, and y is the
dependent variable.
A scatter plot can be used to
determine whether a linear
(straight line) correlation exists
between two variables.
Example:
x    1    2    3    4    5
y    –4   –2   –1   0    2
[Scatter plot of the five (x, y) points above, with x on the horizontal axis and y on the vertical axis]

Linear Correlation
Negative Linear Correlation: As x increases, y tends to decrease.
No Correlation
Positive Linear Correlation: As x increases, y tends to increase.
Nonlinear Correlation
[Four scatter plots of y against x, one for each type listed above]

Correlation Coefficient
The correlation coefficient is a measure of the strength
and the direction of a linear relationship between two
variables. The symbol r represents the sample
correlation coefficient. The formula for r is
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
The range of the correlation coefficient is –1 to 1. If x and
y have a strong positive linear correlation, r is close to 1.
If x and y have a strong negative linear correlation, r is
close to –1. If there is no linear correlation or a weak
linear correlation, r is close to 0.
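The behavior of r at the ends and the middle of its range can be sketched in a few lines of Python. The function name `pearson_r` and the three small data sets below are assumptions made for illustration; they do not come from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r, computed from the five sums in the formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Hypothetical data sets chosen only to show the sign and size of r.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # y increases exactly linearly with x
falling = [10, 8, 6, 4, 2]    # y decreases exactly linearly with x
symmetric = [4, 1, 0, 1, 4]   # y = (x - 3)^2, a nonlinear pattern

print(round(pearson_r(xs, rising), 3))     # 1.0
print(round(pearson_r(xs, falling), 3))    # -1.0
print(round(pearson_r(xs, symmetric), 3))  # 0.0
```

The third data set has a clear pattern but r is 0, which is a reminder that r measures only linear association.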

Interpretation Guideline

Linear Correlation
Very strong negative correlation: r = –0.91
Very strong positive correlation: r = 0.88
Moderately positive correlation: r = 0.42
Very weak/negligible correlation: r = 0.07
[Four scatter plots of y against x, one for each r-value listed above]

Calculating a Correlation Coefficient
In Words: In Symbols
1. Find the sum of the x-values: ∑x
2. Find the sum of the y-values: ∑y
3. Multiply each x-value by its corresponding y-value and find the sum: ∑xy
4. Square each x-value and find the sum: ∑x²
5. Square each y-value and find the sum: ∑y²
6. Use these five sums to calculate the correlation coefficient:
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$

Correlation Coefficient
Example:
Calculate the correlation coefficient r for the following data.

x    y    xy   x²   y²
1    –3   –3   1    9
2    –1   –2   4    1
3    0    0    9    0
4    1    4    16   1
5    2    10   25   4

∑x = 15   ∑y = –1   ∑xy = 9   ∑x² = 55   ∑y² = 15

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} = \frac{5(9) - (15)(-1)}{\sqrt{5(55) - 15^2}\,\sqrt{5(15) - (-1)^2}} = \frac{60}{\sqrt{50}\,\sqrt{74}} \approx 0.986$$

There is a very strong positive linear correlation between x and y.
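The six calculation steps can be checked against this example with a short Python sketch. The variable names are illustrative choices; the data are the five (x, y) pairs from the table.

```python
from math import sqrt

# Data from the example table.
xs = [1, 2, 3, 4, 5]
ys = [-3, -1, 0, 1, 2]
n = len(xs)

sum_x = sum(xs)                              # step 1: sum of the x-values
sum_y = sum(ys)                              # step 2: sum of the y-values
sum_xy = sum(x * y for x, y in zip(xs, ys))  # step 3: sum of the products xy
sum_x2 = sum(x ** 2 for x in xs)             # step 4: sum of the squared x-values
sum_y2 = sum(y ** 2 for y in ys)             # step 5: sum of the squared y-values

# Step 6: combine the five sums into r.
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
)

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 15 -1 9 55 15
print(round(r, 3))                           # 0.986
```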

Correlation Coefficient
Example:
The following data represents the number of hours 12
different students watched television during the
weekend and the scores of each student who took a test
the following Monday.

Hours, x        0    1    2    3    3    5    5    5    6    7    7    10
Test score, y   96   85   82   74   95   68   76   84   58   65   75   50

a.) Display the scatter plot.
b.) Calculate the correlation coefficient r.

Correlation Coefficient
Example continued:
[Scatter plot of the data: hours watching TV (x, 0 to 10) on the horizontal axis and test score (y, 0 to 100) on the vertical axis]

Correlation Coefficient
Example continued:

Hours, x        0     1     2     3     3     5     5     5     6     7     7     10
Test score, y   96    85    82    74    95    68    76    84    58    65    75    50
xy              0     85    164   222   285   340   380   420   348   455   525   500
x²              0     1     4     9     9     25    25    25    36    49    49    100
y²              9216  7225  6724  5476  9025  4624  5776  7056  3364  4225  5625  2500

∑x = 54   ∑y = 908   ∑xy = 3724   ∑x² = 332   ∑y² = 70836

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} = \frac{12(3724) - (54)(908)}{\sqrt{12(332) - 54^2}\,\sqrt{12(70836) - 908^2}} \approx -0.831$$

There is a very strong negative linear correlation.
As the number of hours spent watching TV increases,
the test scores tend to decrease.
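As a cross-check on this result, r can also be computed from deviations about the means. This is an algebraically equivalent form of the same coefficient, not the five-sum formula used on the slide. The function name `pearson_r_deviation` is an assumption made for this sketch.

```python
from math import sqrt

# Data from the example: hours of TV and test scores for 12 students.
hours = [0, 1, 2, 3, 3, 5, 5, 5, 6, 7, 7, 10]
scores = [96, 85, 82, 74, 95, 68, 76, 84, 58, 65, 75, 50]

def pearson_r_deviation(xs, ys):
    """r from deviations about the means (equivalent to the five-sum formula)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r_deviation(hours, scores)
print(round(r, 3))  # -0.831
```

Both forms agree to three decimal places for this data, which is a useful sanity check when the sums are computed by hand.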

Linear Regression

Regression Line
A regression line, also called a line of best fit, is the line for
which the sum of the squares of the residuals is a minimum.
The Equation of a Regression Line
The equation of a regression line for an independent variable
x and a dependent variable y is
ŷ = mx + b
where ŷ is the predicted y-value for a given x-value. The
slope m and y-intercept b are given by
$$m = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} \quad\text{and}\quad b = \bar{y} - m\bar{x} = \frac{\sum y}{n} - m\,\frac{\sum x}{n},$$
where ȳ is the mean of the y-values and x̄ is the mean of the
x-values. The regression line always passes through (x̄, ȳ).

Regression Line
Example:
Find the equation of the regression line.

x    y    xy   x²   y²
1    –3   –3   1    9
2    –1   –2   4    1
3    0    0    9    0
4    1    4    16   1
5    2    10   25   4

∑x = 15   ∑y = –1   ∑xy = 9   ∑x² = 55   ∑y² = 15

$$m = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} = \frac{5(9) - (15)(-1)}{5(55) - 15^2} = \frac{60}{50} = 1.2$$

Regression Line
Example continued:
$$b = \bar{y} - m\bar{x} = \frac{-1}{5} - (1.2)\,\frac{15}{5} = -3.8$$
The equation of the regression line is
ŷ = 1.2x – 3.8.
[Graph of the five data points and the regression line, which passes through (x̄, ȳ) = (3, –1/5)]
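The slope and intercept formulas used in this example can be sketched as a small Python function. The names `regression_line` and `sum_squared_residuals` are assumptions made for illustration. The sketch also checks two properties stated earlier: the line passes through (x̄, ȳ), and nudging the slope or intercept does not reduce the sum of squared residuals.

```python
def regression_line(xs, ys):
    """Return (m, b) for the least-squares line y-hat = m*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - m * sum_x / n
    return m, b

def sum_squared_residuals(xs, ys, m, b):
    """Sum of (observed y - predicted y)^2 for the line y-hat = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Data from the example table.
xs = [1, 2, 3, 4, 5]
ys = [-3, -1, 0, 1, 2]

m, b = regression_line(xs, ys)
print(round(m, 2), round(b, 2))  # 1.2 -3.8

# The fitted line passes through the point of means.
mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
print(abs((m * mean_x + b) - mean_y) < 1e-9)  # True

# Nearby lines fit no better than the least-squares line.
best = sum_squared_residuals(xs, ys, m, b)
print(best <= sum_squared_residuals(xs, ys, m + 0.1, b))  # True
print(best <= sum_squared_residuals(xs, ys, m, b - 0.1))  # True
```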

Regression Line
Example:
The following data represents the number of hours 12
different students watched television during the
weekend and the scores of each student who took a test
the following Monday.

Hours, x        0     1     2     3     3     5     5     5     6     7     7     10
Test score, y   96    85    82    74    95    68    76    84    58    65    75    50
xy              0     85    164   222   285   340   380   420   348   455   525   500
x²              0     1     4     9     9     25    25    25    36    49    49    100
y²              9216  7225  6724  5476  9025  4624  5776  7056  3364  4225  5625  2500

∑x = 54   ∑y = 908   ∑xy = 3724   ∑x² = 332   ∑y² = 70836

a.) Find the equation of the regression line.
b.) Use the equation to find the expected test score
for a student who watches 9 hours of TV.

Regression Line
Example continued:
$$m = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} = \frac{12(3724) - (54)(908)}{12(332) - 54^2} \approx -4.067$$
$$b = \bar{y} - m\bar{x} = \frac{908}{12} - (-4.067)\,\frac{54}{12} \approx 93.97$$
ŷ = –4.07x + 93.97
[Scatter plot of test score (y) against hours watching TV (x) with the regression line, which passes through (x̄, ȳ) = (54/12, 908/12) ≈ (4.5, 75.7)]
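The slope and intercept for the TV data can be recomputed from the raw values as a check on the arithmetic. This is a minimal sketch; the variable names are illustrative.

```python
# Data from the example: hours of TV and test scores for 12 students.
hours = [0, 1, 2, 3, 3, 5, 5, 5, 6, 7, 7, 10]
scores = [96, 85, 82, 74, 95, 68, 76, 84, 58, 65, 75, 50]

n = len(hours)
sum_x = sum(hours)
sum_y = sum(scores)
sum_xy = sum(x * y for x, y in zip(hours, scores))
sum_x2 = sum(x * x for x in hours)

# Slope and y-intercept of the regression line.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = sum_y / n - m * sum_x / n

print(sum_x, sum_y, sum_xy, sum_x2)  # 54 908 3724 332
print(round(m, 3), round(b, 2))      # -4.067 93.97
```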

Regression Line
Example continued:
Using the equation ŷ = –4.07x + 93.97, we can predict
the test score for a student who watches 9 hours of TV.
ŷ = –4.07x + 93.97
  = –4.07(9) + 93.97
  = 57.34
A student who watches 9 hours of TV over the weekend
can expect to receive about a 57.34 on Monday’s test.
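The prediction step can be sketched as a function that also refuses x-values outside the observed range of 0 to 10 hours, since the line is only supported by data in that range. The names `predict_score`, `X_MIN`, and `X_MAX` are hypothetical and chosen for this illustration.

```python
# Smallest and largest number of hours in the sample data.
X_MIN, X_MAX = 0, 10

def predict_score(hours_tv, m=-4.07, b=93.97):
    """Predicted test score from the rounded regression equation y-hat = -4.07x + 93.97."""
    if not X_MIN <= hours_tv <= X_MAX:
        raise ValueError("x is outside the range of the data used to fit the line")
    return m * hours_tv + b

print(round(predict_score(9), 2))  # 57.34
```

Calling `predict_score(12)` raises `ValueError`, because 12 hours lies beyond the largest value in the sample.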

Predicting y-Values
After finding the equation of the multiple regression line, you
can use the equation to predict y-values over the range of the data.
Example:
The following multiple regression equation can be used to predict
the annual U.S. rice yield (in pounds).
ŷ = 859 + 5.76x₁ + 3.82x₂
where x₁ is the number of acres planted (in thousands), and x₂ is
the number of acres harvested (in thousands).
(Source: U.S. National Agricultural Statistics Service)
a.) Predict the annual rice yield when x₁ = 2758 and x₂ = 2714.
b.) Predict the annual rice yield when x₁ = 3581 and x₂ = 3021.

Predicting y-Values
Example continued:
a.) ŷ = 859 + 5.76x₁ + 3.82x₂
      = 859 + 5.76(2758) + 3.82(2714)
      = 27,112.56
The predicted annual rice yield is 27,112.56 pounds.
b.) ŷ = 859 + 5.76x₁ + 3.82x₂
      = 859 + 5.76(3581) + 3.82(3021)
      = 33,025.78
The predicted annual rice yield is 33,025.78 pounds.
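Both predictions can be reproduced with a short Python sketch of the rice-yield equation. The function name `predict_rice_yield` and its parameter names are assumptions made for illustration; the coefficients are the ones given in the example.

```python
def predict_rice_yield(acres_planted, acres_harvested):
    """Predicted annual rice yield in pounds; both inputs are in thousands of acres."""
    return 859 + 5.76 * acres_planted + 3.82 * acres_harvested

print(round(predict_rice_yield(2758, 2714), 2))  # 27112.56
print(round(predict_rice_yield(3581, 3021), 2))  # 33025.78
```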

Assessment

Assessment
Direction:
a. Calculate the correlation (r) between the two
variables.
b. Write a brief interpretation of this correlation,
including the strength, direction, and an
explanation of the effect.
c. Find the equation of the regression line.

Assessment
1.
Age (x)             43   21   25   42   57   33   28
Glucose Level (y)   99   65   79   75   87   82   70
2.
Age (x)      20    21    24    45    46    54    60
Weight (y)   123   132   145   155   160   162   150

Problem Solving

Problem Solving
1. Alice and Leo did a study on feelings of stress
and life satisfaction during quarantine.
Participants completed a measure of how stressed
they were feeling (on a 1 to 30 scale) and a measure
of how satisfied they felt with their lives (measured
on a 1 to 10 scale). The table below indicates the
participants’ scores.

Problem Solving
Participant (#)   Stress Score (X)   Life Satisfaction (Y)
1                 11                 7
2                 25                 1
3                 19                 4
4                 7                  9
5                 23                 2
6                 6                  8
7                 11                 8
8                 22                 3
9                 25                 3
10                10                 6

Problem Solving
a. Calculate the correlation (r) between stress and
life satisfaction.
b. Write a brief interpretation of this correlation,
including the strength, direction, and an
explanation of the effect.
c. Can you say that being more stressed causes a
lower level of life satisfaction? Why or why
not?

Problem Solving
2. In a biology experiment a number of cultures
from Brgy. Aplaya Lake were grown in the
laboratory of ANHS. The numbers of bacteria, in
millions, and their ages, in days, are given below.
Age (x)               1    2     3     4     5     6     7     8
No. of bacteria (y)   34   106   135   181   192   231   268   300

Problem Solving
a. Calculate the correlation (r) and write a brief
interpretation of this correlation, including the
strength, direction, and an explanation of the effect.
b. Some late readings were taken and are given below.
x   13    14    15
y   400   403   405
Add these points to your graph and describe what
they show.

Problem Solving
3. A metal rod was gradually heated and its length,
L, was measured at various temperatures, T.
Temperature (°C)   15    20      25      30    35      40
Length (cm)        100   103.8   106.1   112   116.1   119.9

Problem Solving
a. Calculate the correlation (r) and write a brief
interpretation of this correlation, including the
strength, direction, and an explanation of the effect.
b. Do you suspect a major inaccuracy in any of the
recorded values? If so, discard any you consider
untrustworthy and find the new value of r.

What I Can Do
Answer the following questions:
1. What are the three types of correlation?
2. How will we know if we have a perfect
correlation?
3. Can we consider a correlation of 0.02 significant?

What I Have Learned
I have learned that …
I understand that …
I realized that …

Reflection
Answer the following questions about your personal
insights on the lesson using the prompts below:
Compare your current situation with your situation
last year, before the pandemic.
Is vaccination enough to stop the spread of
COVID-19? Why or why not?