Regression analysis in excel

ChaudharyAwaisSalman 789 views 14 slides Nov 17, 2020
Slide 1
Slide 1 of 14
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14

About This Presentation

Lecture given in course Applied Statistics in Engineering
OAH305 in Mälardalen University Sweden


Slide Content

Lecture:
Simple Linear Regression
in MICROSOFT EXCEL
Chaudhary Awais Salman
Doctoral Researcher in Future Energy
Course instructor
School of Business, Society and Engineering
FuutureEnergy –Centre of Excellence
Email:
[email protected]

Regression analysisin Excel
●Three methods are described to perform the regression analysis in EXCEL
1.By formulas
2.By graph
3.By built-in data analysis tool
2
Temperature(deg F)Icecreamsales, USD
52 185
58 215
60 332
62 325
64 408
66 406
68 412
72 522
74 445
74 545
74 640
76 522
77 544
80 614
82 614
84 620
85 627
88 627
90 632
Data set used for regression analysis

Formulas (1)
●Microsoft Excel has built-in functions such as LINEST, SLOPE, INTERCPET, and
CORREL that can help to do the linear regression.
3
The LINESTFunction uses the least
squares method and determine a best
fitted straight line between studied
variables and returns an array
describing that line.
LINESTfunction returns an array of values (aand b), so it
must be entered as anarray formula. Select two adjacent
cells in the same row, D3:F3 in this case, type the formula,
and pressCtrl + Shift + Enterto complete it.

Formulas (2)
●Microsoft Excel has built-in functions such as LINEST, SLOPE, INTERCPET, and
CORREL that can help to do the linear regression.
4
Otherwise, we can find the value of slope (a) and intercept
(b) directly by using the SLOPEand INTERCEPT
formulas

Formulas (3)
●Microsoft Excel has built-in functions such as LINEST, SLOPE, INTERCPET, and
CORREL that can help to do the linear regression.
5
Similarly correlationsand R-squaredbetween two variables can be determined by using the CORRELand
RSQ function of excel. (R-squared can also be determined by squaring the correlation)

By graph (1)
●Select the two columns with your data, including headers.
●On theInsettab, in theChartsgroup, click theScatter charticon, and select
theScatterthumbnail (the first one):
6

7
●Now, a least squares regression line needs to be drawn. Right click on any point in scatter
graph and chooseAdd Trendline
By graph (2)

8
●From Trendline options select linearas option and check the display equationand R-
squaredon chart options
By graph (3)

On theDatatab, in theAnalysisgroup, click theData Analysisbutton.
9
SelectRegressionand clickOK

●Select theInput Y Range, which is yourdependent variable. In our case, it’s (B1:B20).
●Select theInput X Range, i.e. yourindependent variable. In this example, it's the (A1:A20).
●Select your preffered output range, it can be new worksheet also
10
●Optionally select the
residual option

Summaryofresults
11

ExplanationofRegression statisticsterms
●Multiple R. It is theCorrelation Coefficient
●R Square. It is the Coefficient of Determination, which is used as an indicator of the goodness
of fit. In our example, R
2
is 0.87 (rounded to 2 digits), which is fairy good. It means that 87% of
our values fit the regression analysis model. Generally, R Squared of 95% or more is
considered a good fit.
●Adjusted R Square.It is theR squareadjusted for the number of independent variable in the
model. You will want to use this value instead ofR squarefor multiple regression analysis.
●Standard Error. It is another goodness-of-fit measure that shows the precision of your
regression analysis -the smaller the number, the more certain you can be about your
regression equation. While R
2
represents the percentage of the dependent variables variance
that is explained by the model, Standard Error is an absolute measure that shows the average
distance that the data points fall from the regression line.
●Observations. It is simply the number of observations in yourmodel.
12
https://www.ablebits.com/office-addins-blog/2018/08/01/linear-regression-analysis-excel/

ExplanationofRegression output terms
●The most useful component in this section isCoefficients. It helps us to make a model.
●X variable 1 =a = slope = 12,72
●Intercept = b = -442,387
●Equation
●Y = aX+ b
●Y = 12,72 X -442,387
13
https://www.ablebits.com/office-addins-blog/2018/08/01/linear-regression-analysis-excel/

Regression analysis output: residuals
●If you compare the estimated and actual number of sold
ice-creams corresponding to the temperature , you will
see that these numbers are slightly different:
●Estimated: 219,44 (calculated above)
●Actual: 185 (row 2 of the source data)
●Why's the difference? Because independent variables
are never perfect predictors of the dependent variables.
And the residuals can help you understand how far
away the actual values are from the predicted values:
14
https://www.ablebits.com/office-addins-blog/2018/08/01/linear-regression-analysis-excel/