multiple regression in statistics114.ppt

ssuser3c3f88 7 views 21 slides Jul 05, 2024

About This Presentation

statistics


Slide Content

Business Statistics,4e, by Ken Black. © 2003 John Wiley & Sons.
14-1
Business Statistics, 4e
by Ken Black
Chapter 14
Multiple Regression Analysis

14-2
Learning Objectives
• Develop a multiple regression model.
• Understand and apply significance tests of the regression model and its coefficients.
• Compute and interpret residuals, the standard error of the estimate, and the coefficient of determination.
• Interpret multiple regression computer output.

14-3
Regression Models
Probabilistic Multiple Regression Model:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k + \epsilon$$
where:
$Y$ = the value of the dependent (response) variable
$\beta_0$ = the regression constant
$\beta_1$ = the partial regression coefficient of independent variable 1
$\beta_2$ = the partial regression coefficient of independent variable 2
$\beta_k$ = the partial regression coefficient of independent variable k
$k$ = the number of independent variables
$\epsilon$ = the error of prediction

14-4
Estimated Regression Model
$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_k X_k$$
where:
$\hat{Y}$ = predicted value of Y
$b_0$ = estimate of regression constant
$b_1$ = estimate of regression coefficient 1
$b_2$ = estimate of regression coefficient 2
$b_3$ = estimate of regression coefficient 3
$b_k$ = estimate of regression coefficient k
$k$ = number of independent variables

14-5
Multiple Regression Model with Two Independent Variables (First-Order)

Population Model
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon$$
where:
$\beta_0$ = the regression constant
$\beta_1$ = the partial regression coefficient for independent variable 1
$\beta_2$ = the partial regression coefficient for independent variable 2
$\epsilon$ = the error of prediction

Estimated Model
$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$$
where:
$\hat{Y}$ = predicted value of Y
$b_0$ = estimate of regression constant
$b_1$ = estimate of regression coefficient 1
$b_2$ = estimate of regression coefficient 2

14-6
Response Plane for First-Order Two-Predictor Multiple Regression Model
[Figure: three-dimensional plot with axes X1, X2, and Y. Labels on the plot: Response Plane, Vertical Intercept, Y1.]

14-7
Least Squares Equations for k = 2
$$b_0 n + b_1 \sum X_1 + b_2 \sum X_2 = \sum Y$$
$$b_0 \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2 = \sum X_1 Y$$
$$b_0 \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2 = \sum X_2 Y$$
For multiple regression models with two independent variables, the result is three simultaneous equations with three unknowns ($b_0$, $b_1$, and $b_2$).
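As a rough illustration of how the three simultaneous equations can be solved, the sketch below builds the sums that appear in the normal equations and solves the 3x3 system by Gauss-Jordan elimination. The function name `fit_two_predictor` and the toy data are hypothetical (they are not from the slides); the toy Y values are generated exactly from Y = 1 + 2·X1 + 3·X2 so the recovered coefficients are easy to check.

```python
def fit_two_predictor(x1, x2, y):
    """Solve the least squares normal equations for b0, b1, b2 (k = 2)."""
    n = len(y)
    s_x1x2 = sum(a * b for a, b in zip(x1, x2))
    # Augmented matrix [coefficients | right-hand side], one row per equation.
    m = [
        [n, sum(x1), sum(x2), sum(y)],
        [sum(x1), sum(a * a for a in x1), s_x1x2,
         sum(a * c for a, c in zip(x1, y))],
        [sum(x2), s_x1x2, sum(b * b for b in x2),
         sum(b * c for b, c in zip(x2, y))],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


# Hypothetical toy data (an assumption for illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_two_predictor(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

In practice, statistical software (such as the Excel and MINITAB output shown in this chapter) performs this solution step.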

Real Estate example
14-8
• A real estate study was conducted in a small Louisiana city to determine what variables, if any, are related to the market price of a home.
• Several variables were explored, including the number of bedrooms, the number of bathrooms, the age of the house, the number of square feet of living space, the total number of square feet of space, and the number of garages.
• Suppose the researcher wants to develop a regression model to predict the market price of a home by using only two variables, “total number of square feet in the house” and “the age of the house.”

14-9
Real Estate Data

Observation   Market Price ($1,000)   Square Feet   Age (Years)
                       Y                  X1            X2
     1               63.0               1,605           35
     2               65.1               2,489           45
     3               69.9               1,553           20
     4               76.8               2,404           32
     5               73.9               1,884           25
     6               77.9               1,558           14
     7               74.9               1,748            8
     8               78.0               3,105           10
     9               79.0               1,682           28
    10               63.4               2,470           30
    11               79.5               1,820            2
    12               83.9               2,143            6
    13               79.7               2,121           14
    14               84.5               2,485            9
    15               96.0               2,300           19
    16              109.5               2,714            4
    17              102.5               2,463            5
    18              121.0               3,076            7
    19              104.9               3,048            3
    20              128.0               3,267            6
    21              129.0               3,069           10
    22              117.9               4,765           11
    23              140.0               4,540            8

Using Excel
14-10

14-11
Excel Output

Regression Statistics
Multiple R          0.86
R Square            0.74
Adjusted R Square   0.72
Standard Error     11.96
Observations       23

ANOVA
             df         SS        MS       F   Significance F
Regression    2    8189.72   4094.86   28.63             0.00
Residual     20    2861.02    143.05
Total        22   11050.74

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept            57.35            10.01     5.73      0.00       36.48       78.23         36.48         78.23
Square Feet           0.02             0.00     5.63      0.00        0.01        0.02          0.01          0.02
Age                  -0.67             0.23    -2.92      0.01       -1.14       -0.19         -1.14         -0.19

Estimated Regression Model for
Real Estate Example
14-12
$$\hat{Y} = 57.4 + 0.0177 X_1 - 0.667 X_2$$

• The coefficient of X1 (total number of square feet in the house) is 0.0177, which means that, with all other variables held constant, the addition of 1 square foot of space in the house results in a predicted increase of 0.0177 × ($1,000) = $17.70 in the price of the home.
• The coefficient of X2 (age) is -0.667, which means that, with all other variables held constant, a one-unit increase in the age of the house (1 year) results in a predicted change of -0.667 × ($1,000) = -$667, that is, a predicted $667 drop in the price.

14-13
Predicting the Price of Home
$$\hat{Y} = 57.4 + 0.0177 X_1 - 0.666 X_2$$
For $X_1 = 2500$ and $X_2 = 12$:
$$\hat{Y} = 57.4 + 0.0177(2500) - 0.666(12) = 93.658 \text{ thousand dollars}$$
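The prediction on this slide can be reproduced with a few lines of Python. This is a minimal sketch that plugs the slide's rounded coefficients (57.4, 0.0177, -0.666) into the estimated model; the function name `predict_price` is hypothetical and chosen only for illustration.

```python
def predict_price(square_feet, age_years, b0=57.4, b1=0.0177, b2=-0.666):
    """Predicted market price in $1,000 from the estimated two-predictor model."""
    return b0 + b1 * square_feet + b2 * age_years


# House with 2,500 square feet that is 12 years old, as on the slide.
y_hat = predict_price(2500, 12)
print(f"{y_hat:.3f} thousand dollars")  # prints 93.658 thousand dollars
```

Because the coefficients are rounded, predictions from software that keeps full precision may differ slightly in the last digits.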

14-14
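Evaluating the Multiple Regression Model

Testing the Overall Model
$$H_0: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_k = 0$$
$$H_a: \text{At least one of the regression coefficients is} \neq 0$$

Significance Tests for Individual Regression Coefficients
$$H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0$$
$$H_0: \beta_2 = 0 \qquad H_a: \beta_2 \neq 0$$
$$H_0: \beta_3 = 0 \qquad H_a: \beta_3 \neq 0$$
$$\vdots$$
$$H_0: \beta_k = 0 \qquad H_a: \beta_k \neq 0$$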

14-15
Testing the Overall Model for the Real Estate Example
$$H_0: \beta_1 = \beta_2 = 0$$
$$H_a: \text{At least one of the regression coefficients is} \neq 0$$

$$MSR = \frac{SSR}{k} \qquad MSE = \frac{SSE}{n - k - 1} \qquad F = \frac{MSR}{MSE}$$

ANOVA
                   df         SS        MS       F   p-value
Regression          2   8189.723   4094.86   28.63      .000
Residual (Error)   20   2861.017     143.1
Total              22   11050.74

$$F_{.01,2,20} = 5.85$$
$$F_{Cal} = 28.63 > 5.85, \text{ reject } H_0.$$
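To make the arithmetic behind the overall F test explicit, the following sketch recomputes MSR, MSE, and F from the sums of squares in the ANOVA table. The critical value 5.85 is copied from the slide rather than computed, and the variable names are illustrative assumptions.

```python
# Sums of squares from the ANOVA table for the real estate example.
ssr = 8189.723   # regression sum of squares
sse = 2861.017   # error (residual) sum of squares
n, k = 23, 2     # observations and independent variables

msr = ssr / k               # mean square regression
mse = sse / (n - k - 1)     # mean square error
f_cal = msr / mse           # observed F statistic

f_critical = 5.85           # F(.01, 2, 20), taken from the slide
reject_h0 = f_cal > f_critical

print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_cal:.2f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```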

14-16
Significance Test of the Regression Coefficients for the Real Estate Example
$$H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0$$
$$H_0: \beta_2 = 0 \qquad H_a: \beta_2 \neq 0$$

                Coefficients    Std Dev   t Stat      p
x1 (Sq. Feet)         0.0177   0.003146     5.63   .000
x2 (Age)              -0.666     0.2280    -2.92   .008

$$t_{.025,20} = 2.086$$
$$t_{Cal} = 5.63 > 2.086, \text{ reject } H_0.$$
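Each t statistic in the table is the coefficient estimate divided by its standard deviation. The sketch below reproduces both values and applies the two-tailed decision rule |t| > 2.086; the critical value is taken from the slide, and the dictionary layout is just an assumption made for illustration.

```python
# (coefficient estimate, standard deviation) pairs from the slide's table.
coefficients = {
    "x1 (Sq. Feet)": (0.0177, 0.003146),
    "x2 (Age)": (-0.666, 0.2280),
}
t_critical = 2.086  # t(.025, 20), taken from the slide

t_stats = {name: b / sd for name, (b, sd) in coefficients.items()}
# Two-tailed test: reject H0 when the absolute t value exceeds the critical value.
decisions = {name: abs(t) > t_critical for name, t in t_stats.items()}

for name, t in t_stats.items():
    verdict = "reject H0" if decisions[name] else "fail to reject H0"
    print(f"{name}: t = {t:.2f}, {verdict}")
```

Under this rule both coefficients are significant at the .05 level, which agrees with the p-values of .000 and .008 in the table.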

14-17
Residuals and Sum of Squares Error for the Real Estate Example

Here $\hat{Y}$ is the predicted value, $Y - \hat{Y}$ is the residual, and $(Y - \hat{Y})^2$ is the squared residual.

Observation       Y     Y-hat    Y - Y-hat   (Y - Y-hat)^2
     1         43.0    42.466        0.534           0.285
     2         45.1    51.465       -6.365          40.517
     3         49.9    51.540       -1.640           2.689
     4         56.8    58.622       -1.822           3.319
     5         53.9    54.073       -0.173           0.030
     6         57.9    55.627        2.273           5.168
     7         54.9    62.991       -8.091          65.466
     8         58.0    85.702      -27.702         767.388
     9         59.0    48.495       10.505         110.360
    10         63.4    61.124        2.276           5.181
    11         59.5    68.265       -8.765          76.823
    12         63.9    71.322       -7.422          55.092
    13         59.7    65.602       -5.902          34.832
    14         64.5    75.383      -10.883         118.438
    15         76.0    65.442       10.558         111.479
    16         89.5    82.772        6.728          45.265
    17         82.5    77.659        4.841          23.440
    18        101.0    87.187       13.813         190.799
    19         84.9    89.356       -4.456          19.858
    20        108.0    91.237       16.763         280.982
    21        109.0    85.064       23.936         572.936
    22         97.9   114.447      -16.547         273.815
    23        120.0   112.460        7.540          56.854
                                       SSE =      2861.017
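The SSE on this slide is simply the sum of the squared residuals. As a quick check, the sketch below recomputes the residuals from the Y and predicted-Y columns exactly as they appear in the slide's table and adds up their squares. Because the tabled values are rounded, the total is expected to agree with 2861.017 only approximately.

```python
# (Y, predicted Y) pairs copied from the residual table on this slide.
rows = [
    (43.0, 42.466), (45.1, 51.465), (49.9, 51.540), (56.8, 58.622),
    (53.9, 54.073), (57.9, 55.627), (54.9, 62.991), (58.0, 85.702),
    (59.0, 48.495), (63.4, 61.124), (59.5, 68.265), (63.9, 71.322),
    (59.7, 65.602), (64.5, 75.383), (76.0, 65.442), (89.5, 82.772),
    (82.5, 77.659), (101.0, 87.187), (84.9, 89.356), (108.0, 91.237),
    (109.0, 85.064), (97.9, 114.447), (120.0, 112.460),
]

# Residual = observed value minus predicted value.
residuals = [y - y_hat for y, y_hat in rows]

# Sum of squares error = sum of squared residuals.
sse = sum(e ** 2 for e in residuals)

print(f"n = {len(rows)}, SSE = {sse:.1f}")
```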

14-18
MINITAB Residual Diagnostics for the Real Estate Problem
[Figure: Residual Model Diagnostics, four panels. Normal Plot of Residuals (Residual vs. Normal Score); I Chart of Residuals (Residual vs. Observation Number, with X = -7.2E-14, 3.0SL = 31.26, -3.0SL = -31.26); Histogram of Residuals (Frequency vs. Residual); Residuals vs. Fits (Residual vs. Fit).]

14-19
SSE and Standard Error of the Estimate
$$S_e = \sqrt{\frac{SSE}{n - k - 1}} = \sqrt{\frac{2861}{23 - 2 - 1}} = 11.96$$
where:
n = number of observations
k = number of independent variables

ANOVA
                   df        SS       MS       F      P
Regression          2    8189.7   4094.9   28.63   .000
Residual (Error)   20    2861.0    143.1                  (SS = SSE)
Total              22   11050.7
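The standard error of the estimate follows directly from SSE and the error degrees of freedom. A minimal sketch of that calculation, using the slide's values and illustrative variable names:

```python
import math

sse = 2861.0   # sum of squares error from the ANOVA table
n, k = 23, 2   # observations and independent variables

# Standard error of the estimate: square root of SSE over its degrees of freedom.
se = math.sqrt(sse / (n - k - 1))

print(f"Se = {se:.2f}")  # prints Se = 11.96
```

This matches the "Standard Error" line of the Excel regression statistics.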

14-20
Coefficient of Multiple Determination ($R^2$)
$$R^2 = \frac{SSR}{SS_{YY}} = \frac{8189.723}{11050.74} = .741$$
$$R^2 = 1 - \frac{SSE}{SS_{YY}} = 1 - \frac{2861.017}{11050.74} = .741$$

ANOVA
                   df        SS        MS       F      p
Regression          2    8189.7   4094.89   28.63   .000   (SS = SSR)
Residual (Error)   20    2861.0     143.1                  (SS = SSE)
Total              22   11050.7                            (SS = SS_YY)

Reports the proportion of total variation in y explained by all x variables taken together.
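Both forms of the R² formula give the same answer because SSR + SSE = SS_YY. The short sketch below checks this with the sums of squares from the ANOVA table; the variable names are illustrative assumptions.

```python
ssr = 8189.723     # regression sum of squares
sse = 2861.017     # error sum of squares
ss_yy = 11050.74   # total sum of squares

r2_from_ssr = ssr / ss_yy        # explained share of total variation
r2_from_sse = 1 - sse / ss_yy    # one minus the unexplained share

print(f"R^2 = {r2_from_ssr:.3f} (from SSR), {r2_from_sse:.3f} (from SSE)")
```

An R² of about .741 means that roughly 74% of the variation in market price is accounted for by square feet and age together.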

14-21
Adjusted $R^2$
$$R^2_{adj} = 1 - \frac{SSE / (n - k - 1)}{SS_{YY} / (n - 1)} = 1 - \frac{2861.017 / (23 - 2 - 1)}{11050.74 / (23 - 1)} = 1 - .285 = .715$$
Equivalently:
$$R^2_A = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}$$

ANOVA
                   df        SS       MS       F      p
Regression          2    8189.7   4094.9   28.63   .000
Residual (Error)   20    2861.0    143.1                  (df = n-k-1, SS = SSE)
Total              22   11050.7                           (df = n-1, SS = SS_YY)

Shows the proportion of variation in y explained by all x variables, adjusted for the number of x variables used.
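The two expressions for adjusted R² are algebraically the same. The following sketch evaluates both with the slide's values so the equivalence can be seen numerically; the variable names are assumptions made for illustration.

```python
sse = 2861.017     # error sum of squares
ss_yy = 11050.74   # total sum of squares
n, k = 23, 2       # observations and independent variables

# Form 1: ratio of mean squares (each sum of squares over its degrees of freedom).
r2_adj = 1 - (sse / (n - k - 1)) / (ss_yy / (n - 1))

# Form 2: shrink the ordinary R^2 by the degrees-of-freedom ratio.
r2 = 1 - sse / ss_yy
r2_adj_alt = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(f"Adjusted R^2 = {r2_adj:.3f} (form 1), {r2_adj_alt:.3f} (form 2)")
```

The adjusted value (about .715) is slightly below the unadjusted .741 because the adjustment penalizes each additional x variable.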