Regression Analysis
Business Analytics: MBMG-7104 / ITHS-2202 / IMAS-3101 / IMHS-3101
Ravindra Nath Shukla (PhD Scholar), ABV-IIITM

Regression Analysis

Regression analysis is a set of statistical processes for estimating the relationship between a dependent variable (the response) and one or more independent variables (also called explanatory variables). For example, when a series of Y values (such as the monthly sales of cameras over a period of years) is causally connected with a series of X values (the monthly advertising budget), it is useful to establish the relationship between X and Y in order to forecast Y.

Types of Regression

There are various types of regression. Each type has its own importance in different scenarios, but at the core, all regression methods analyze the effect of the independent variables on the dependent variable. Several important types of regression models exist; the following slides focus on linear regression.

Linear Regression

Linear regression attempts to model the relationship between a dependent variable (Y) and independent variables (X) by fitting a linear equation to observed data. The case of one independent variable is called Simple Linear Regression; with more than one independent variable, the process is called Multiple Linear Regression.

A. Simple Linear Regression (SLR)

Linear regression models describe the relationship between variables by fitting a line to the observed data. Linear regression uses a straight line, while logistic and non-linear regression use curved lines. SLR assumes that, at least to some extent, the behavior of one variable (Y) is the result of a functional relationship between the two variables (X and Y).

Objectives of Regression

- Establish whether there is a relationship between two variables (X and Y).
- Forecast new observations.

Classical Assumptions of Linear Regression

- Linear relationship: the relationship between X and Y is linear.
- No autocorrelation: the error terms are not correlated with one another.
- No multicollinearity: the independent variables are not correlated with each other.
- Homoscedasticity: the variance of the error term is constant across all values of the independent variables.
- Multivariate normality: the residuals are normally distributed.

Mathematical Expression of Linear Regression

Assume Y is a dependent variable that depends on the realization of the independent variable X. The equation of a simple straight line is:

Y = mX + c    (1)

where
m = gradient or slope,
c = y-intercept, the height at which the line crosses the y-axis.

[Figure: a straight line on X-Y axes, with intercept c and slope m marked.]

For the SLR model, Equation (1) can be written as:

Y = β + β1X    (2)

where
β = y-intercept, the value at which the regression line crosses the y-axis,
β1 = coefficient or slope of X.

[Figure: the regression line on X-Y axes, with intercept β and slope β1 marked.]

The slope or coefficient of X denotes the relationship between X and Y: for a unit change in X, Y changes by β1. In other words, β1 represents the sensitivity of Y to changes in X. In the equation Y = β + β1X, β gives the value of Y when X = 0.

Example

Consider the demand data given in the table below.

Month (X):   1   2   3   4   5   6   7   8   9   10
Demand (Y):  9  15  32  48  52  60  39  65  90  93

Here, the slope or coefficient of X is β1 = 8.6485, which means that for a one-unit change in X, Y changes by 8.6485.
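As a quick check, this fit can be reproduced in a few lines of Python. This is a minimal sketch using NumPy with the closed-form OLS estimators (derived on later slides); the variable names are mine:

```python
import numpy as np

# Demand data from the slide: months 1..10 and observed demand.
X = np.arange(1, 11)
Y = np.array([9, 15, 32, 48, 52, 60, 39, 65, 90, 93])

# Closed-form OLS estimates for simple linear regression:
#   beta1 = sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)
#   beta0 = mean(Y) - beta1 * mean(X)
x_dev, y_dev = X - X.mean(), Y - Y.mean()
beta1 = (x_dev * y_dev).sum() / (x_dev ** 2).sum()
beta0 = Y.mean() - beta1 * X.mean()

print(round(beta1, 4), round(beta0, 3))  # 8.6485 2.733, matching the slide
```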

In general, the data points do not all lie on the regression line. Rather, there is some error or random component, which can be measured as the distance between the true value and the predicted value. The regression model must include this error term:

Y = β + β1X + ε    (3)

where β + β1X is the non-random (systematic) component and ε is the random component. For sample data, Equation (3) can be written as:

Yi = β + β1Xi + εi,  i = 1, …, n    (4)

ORDINARY LEAST SQUARES (OLS)

To minimize the error or random component, SLR uses the OLS method and calculates the values of β1 and β as:

β1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²    (slope, the coefficient of X)
β = Ȳ − β1X̄    (intercept)

Standard Error of Slope and Intercept

With s = √(SSE / (n − 2)) as the standard error of the estimate, the standard errors of the slope and intercept are:

SE(β1) = s / √Σ(Xi − X̄)²
SE(β) = s · √(1/n + X̄² / Σ(Xi − X̄)²)

Explanation of the OLS Method

To minimize the error or random component, SLR uses the OLS method. The method of least squares gives the best equation under the assumptions stated below:

- The regression model is linear in the regression parameters.
- The explanatory variable, X, is assumed to be non-random or non-stochastic (i.e., X is deterministic).
- The conditional expected value of the error terms or residuals, E(εi | Xi), is zero.
- In the case of time series data, the error terms are uncorrelated; that is, Cov(εi, εj) = 0 for all i ≠ j.
- The variance of the errors, Var(εi | Xi), is constant for all values of Xi (homoscedasticity).
- The error terms, εi, follow a normal distribution.

In ordinary least squares, the objective is to find the optimal values of β and β1 that minimize the Sum of Squared Errors (SSE) given in Eq. (6):

SSE = Σ (Yi − β − β1Xi)²    (6)

To find the optimal values of β and β1 that minimize SSE, we equate the partial derivatives of SSE with respect to β and β1 to zero [Eqs. (7) and (9)]:

∂SSE/∂β = −2 Σ (Yi − β − β1Xi) = 0    (7)

Solving Eq. (7) for β, the estimated value of β is given by

β = Ȳ − β1X̄    (8)

Differentiating SSE with respect to β1, we get

∂SSE/∂β1 = −2 Σ Xi (Yi − β − β1Xi) = 0    (9)

Substituting the value of β from Eq. (8) into Eq. (9), we obtain the estimator of β1.

Thus, the value of β1 is given by

β1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

Solution (Manual Method):

X    Y    (X−X̄)   (Y−Ȳ)   (X−X̄)²   (Y−Ȳ)²    (X−X̄)(Y−Ȳ)
1    9    −4.5    −41.3    20.25    1705.69    185.85
2    15   −3.5    −35.3    12.25    1246.09    123.55
3    32   −2.5    −18.3     6.25     334.89     45.75
4    48   −1.5     −2.3     2.25       5.29      3.45
5    52   −0.5      1.7     0.25       2.89     −0.85
6    60    0.5      9.7     0.25      94.09      4.85
7    39    1.5    −11.3     2.25     127.69    −16.95
8    65    2.5     14.7     6.25     216.09     36.75
9    90    3.5     39.7    12.25    1576.09    138.95
10   93    4.5     42.7    20.25    1823.29    192.15

Means: X̄ = 5.5, Ȳ = 50.3. Sums: Σ(X−X̄)² = 82.5, Σ(Y−Ȳ)² = 7132.1, Σ(X−X̄)(Y−Ȳ) = 713.5.

β1 = 713.5 / 82.5 = 8.6485
β = Ȳ − β1X̄ = 50.3 − 8.6485 × 5.5 = 2.733

Estimated regression line: Ŷ = 2.733 + 8.6485X

Solution (Excel): For the same demand data, Excel's worksheet functions (e.g., SLOPE and INTERCEPT) give slope b = 8.65 and intercept a = 2.73.

Use these estimates in the equation Ŷ = a + bX to make a forecast. For example, for period 11 (X = 11), using the unrounded estimates, Forecast = 2.733 + 11 × 8.6485 ≈ 97.87. Similarly, for period 12, Forecast = 2.733 + 12 × 8.6485 ≈ 106.52.
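The same forecasts in Python (a small sketch; the `forecast` helper is illustrative, not from the slides):

```python
def forecast(period, a=2.733, b=8.6485):
    """Point forecast from the fitted line Y-hat = a + b*X."""
    return a + b * period

print(forecast(11))  # ≈ 97.87
print(forecast(12))  # ≈ 106.52
```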

Coefficient of Determination

The coefficient of determination (R²), where R is the value of the coefficient of correlation, measures the proportion of the variability in the dependent variable that is accounted for by the regression line. Calculation:

R² = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)² = 1 − Σ(Yi − Ŷi)² / Σ(Yi − Ȳ)²

where
Yi = actual value of Y,
Ŷi = estimated value of Y,
Ȳ = mean value of Y.

The coefficient of determination always falls between 0 and 1. For example, if r = 0.8, the coefficient of determination is r² = 0.64, meaning that 64% of the variation in Y is due to variation in X; the remaining 36% of the variation in Y is due to other variables. If the coefficient of determination is low, multiple regression analysis may be used to account for the other variables affecting the dependent variable Y.
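In code, R² follows directly from the definition above (a sketch reusing the fitted line from the demand example):

```python
import numpy as np

X = np.arange(1, 11)
Y = np.array([9, 15, 32, 48, 52, 60, 39, 65, 90, 93])
y_hat = 2.733 + 8.6485 * X  # fitted values from the estimated line

ss_res = ((Y - y_hat) ** 2).sum()     # unexplained variation (SSE)
ss_tot = ((Y - Y.mean()) ** 2).sum()  # total variation (SST)
r_squared = 1 - ss_res / ss_tot
print(r_squared)  # ≈ 0.8652, as computed on the next slide
```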

Solution: Coefficient of Determination and Standard Errors

X    Y    Ŷ         (Y−Ŷ)²    (Y−Ȳ)²    (Ŷ−Ȳ)²
1    9    11.3815     5.67    1705.69   1514.65
2    15   20.0300    25.30    1246.09    916.27
3    32   28.6785    11.03     334.89    467.49
4    48   37.3270   113.91       5.29    168.30
5    52   45.9755    36.29       2.89     18.70
6    60   54.6240    28.90      94.09     18.70
7    39   63.2725   589.15     127.69    168.29
8    65   71.9210    47.90     216.09    467.47
9    90   80.5695    88.93    1576.09    916.24
10   93   89.2180    14.30    1823.29   1514.61

Sums: Σ(Y−Ŷ)² = 961.41, Σ(Y−Ȳ)² = 7132.1, Σ(Ŷ−Ȳ)² = 6170.72.

R² = 6170.72 / 7132.1 = 0.8652 (equivalently, 1 − 961.41/7132.1 = 0.8652)
s = √(961.41 / 8) = 10.96 (standard error of the estimate)
SE(intercept) = 10.96 × 0.683 = 7.488
SE(slope) = 10.96 / √82.5 = 1.207
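A sketch that reproduces these quantities with NumPy, following the standard-error formulas stated earlier:

```python
import numpy as np

X = np.arange(1, 11)
Y = np.array([9, 15, 32, 48, 52, 60, 39, 65, 90, 93])
y_hat = 2.733 + 8.6485 * X
n = len(X)

sse = ((Y - y_hat) ** 2).sum()                    # ≈ 961.4
s = np.sqrt(sse / (n - 2))                        # ≈ 10.96
ssx = ((X - X.mean()) ** 2).sum()                 # 82.5
se_slope = s / np.sqrt(ssx)                       # ≈ 1.207
se_intercept = s * np.sqrt(1 / n + X.mean() ** 2 / ssx)  # ≈ 10.96 * 0.683 = 7.488
print(s, se_slope, se_intercept)
```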

Exercise

From the following data for a book store, ABC, derive the regression equation for the effect of purchases on sales of books. Also calculate the standard errors and the coefficient of determination.

Purchases (X): 91  97  108  121  67  124  51  73  111  57
Sales (Y):     71  75   69   97  70   91  39  61   80  47

Solution

β1 = 0.6132
β = Ȳ − β1X̄ = 70 − 0.6132 × 90 = 14.812

Estimated regression line: Ŷ = 14.812 + 0.6132X

Coefficient of determination and standard errors:

X     Y    Ŷ       (Ŷ−Ȳ)²    (Y−Ȳ)²   (Y−Ŷ)²
91    71   70.61     0.38       1       0.15
97    75   74.29    18.42      25       0.50
108   69   81.04   121.83       1     144.90
121   97   89.01   361.35     729      63.85
67    70   55.90   198.92       0     198.92
124   91   90.85   434.67     441       0.02
51    39   46.08   571.95     961      50.19
73    61   59.58   108.68      81       2.03
111   80   82.88   165.82     100       8.28
57    47   49.76   409.50     529       7.64

Means: X̄ = 90, Ȳ = 70. Sums: Σ(Ŷ−Ȳ)² = 2391.51, Σ(Y−Ȳ)² = 2868.00, Σ(Y−Ŷ)² = 476.49.

R² = 2391.51 / 2868.00 = 0.8339
s = √(476.49 / 8) = 7.718
SE(intercept) = 7.718 × 1.172 = 9.045
SE(slope) = 7.718 / √6360 = 0.097
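The exercise can also be checked with SciPy (a sketch; `linregress` reports r directly, so R² = r². The `intercept_stderr` attribute assumes a reasonably recent SciPy):

```python
from scipy.stats import linregress

purchases = [91, 97, 108, 121, 67, 124, 51, 73, 111, 57]  # X
sales     = [71, 75,  69,  97, 70,  91, 39, 61,  80, 47]  # Y

res = linregress(purchases, sales)
print(res.slope, res.intercept)          # ≈ 0.6132, 14.812
print(res.rvalue ** 2)                   # ≈ 0.8339
print(res.stderr, res.intercept_stderr)  # ≈ 0.097 (slope), 9.045 (intercept)
```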

B. Multiple Linear Regression (MLR)

MLR predicts an outcome (the dependent variable) based on several independent variables simultaneously.

Why is this important? Behavior is rarely a function of just one variable; it is instead influenced by many variables. The idea is that we should be able to obtain a more accurate predicted score by using multiple variables to predict the outcome.

[Figure: multiple regression is a many-to-one design: several independent variables (e.g., km travelled x1, number of deliveries x2) jointly predict one dependent variable (travel time y). With four IVs and one DV there are 10 pairwise relationships to consider, including potential multicollinearity among the IVs.]

The functional form of MLR is given by

Y = β + β1X1 + β2X2 + … + βkXk + εi

where Y is the dependent variable (response or outcome variable); X1, X2, …, Xk are independent variables (predictor or explanatory variables); β is a constant; β1, β2, …, βk are the partial regression coefficients corresponding to the explanatory variables X1, X2, …, Xk, respectively; and εi is the error term (or residual).

If εi = 0, the estimated value of Y is

Ŷ = β + β1X1 + β2X2 + … + βkXk

In MLR, each coefficient is interpreted as the estimated change in Y corresponding to a unit change in one independent variable (e.g., X1) when all other variables (X2, X3, …, Xk) are held constant.
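In matrix form, the OLS estimate is b = (XᵀX)⁻¹XᵀY. A minimal sketch (the `fit_mlr` helper is illustrative, not from the slides):

```python
import numpy as np

def fit_mlr(X, y):
    """OLS estimates for Y-hat = b0 + b1*X1 + ... + bk*Xk."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend a 1s column for the intercept b0
    # lstsq solves min ||A b - y||^2, numerically safer than inverting A'A directly
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # [b0, b1, ..., bk]

# Sanity check on exact data: y = 1 + 2*x1 + 3*x2 should be recovered exactly.
X = np.array([[1, 1], [2, 1], [3, 2], [4, 5], [5, 3]], dtype=float)
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]
print(fit_mlr(X, y))  # ≈ [1. 2. 3.]
```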

Simple vs. Multiple Regression

[Comparison table from this slide not reproduced in the transcript.]

New Considerations

Adding more independent variables to a multiple regression procedure does not mean the regression will be "better" or offer better predictions; in fact, it can make things worse. This is called OVERFITTING. The addition of more independent variables also creates more relationships among them: not only are the independent variables potentially related to the dependent variable, they are also potentially related to each other. When this happens, it is called MULTICOLLINEARITY.

The ideal is for all of the independent variables to be correlated with the dependent variable but NOT with each other. Because of multicollinearity and overfitting, a fair amount of preparatory work is required before conducting a multiple regression analysis:

- Correlations
- Scatter plots
- Simple regressions

Steps in MLR

1. Generate a list of potential variables, independent(s) and dependent.
2. Collect data on the variables.
3. Check the relationship between each independent variable and the dependent variable using scatterplots and correlations.
4. Check the relationships among the independent variables using scatterplots and correlations.
5. Conduct simple linear regressions for each IV/DV pair (optional).
6. Use the non-redundant independent variables in the analysis to find the best-fitting model.
7. Use the best-fitting model to make predictions about the dependent variable.

Example

Aditya Delivery Service (ADS) offers same-day delivery for letters, packages, and other small courier parcels. They use Google Maps to group individual deliveries into one trip to reduce time and fuel costs. Some trips include more than one delivery. ADS wants to estimate how long a delivery will take based on three factors:

- the total distance of the trip in kilometers (km),
- the number of deliveries that must be made during the trip, and
- the daily price of petrol.

In this case, we can predict the total travel time using the distance travelled, the number of deliveries on each trip, and the daily petrol price.

Step 1: Generate a list of potential variables, independent(s) and dependent.
Step 2: Collect data on the variables. To conduct this analysis, a random sample of 10 past trips was taken, recording four pieces of information for each trip:

Distance (km), X1   Deliveries, X2   Petrol price ($), X3   Travel time (hrs), Y
89                  4                3.84                   7.0
66                  1                3.19                   5.4
78                  3                3.78                   6.6
111                 6                3.89                   7.4
44                  1                3.57                   4.8
77                  3                3.57                   6.4
80                  3                3.03                   7.0
66                  2                3.51                   5.6
109                 5                3.54                   7.3
76                  3                3.25                   6.4

Step 3: Scatterplot each IV against the DV.
Step 4: Scatterplot each IV against the other IVs.

[Figures: scatterplots of X1, X2, and X3 against travel time Y (Step 3), and of X1 vs X2, X1 vs X3, and X2 vs X3 (Step 4).]
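Steps 3 and 4 can be summarized numerically with a correlation matrix (a sketch with pandas; the slide uses scatterplots to convey the same information):

```python
import pandas as pd

ads = pd.DataFrame({
    "X1_km":     [89, 66, 78, 111, 44, 77, 80, 66, 109, 76],
    "X2_deliv":  [4, 1, 3, 6, 1, 3, 3, 2, 5, 3],
    "X3_petrol": [3.84, 3.19, 3.78, 3.89, 3.57, 3.57, 3.03, 3.51, 3.54, 3.25],
    "Y_hours":   [7, 5.4, 6.6, 7.4, 4.8, 6.4, 7, 5.6, 7.3, 6.4],
})

# IV-DV correlations (Step 3) and IV-IV correlations (Step 4) in one view.
# Expect a strong X1-X2 correlation (multicollinearity) and a weak X3-Y
# relationship, as the slides conclude.
print(ads.corr().round(3))
```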

Step 5: Conduct simple linear regressions for each IV/DV pair (optional).
Step 6: Use the non-redundant independent variables in the analysis to find the best-fitting model. In our example, the Step 3 plots suggest that petrol price (X3) adds little explanatory power, so keeping it risks overfitting; and the Step 4 plots show that X1 and X2 are strongly correlated (multicollinearity), so one of them may be redundant.
Step 7: Use the best-fitting model to make predictions about the dependent variable.

Excel: Multiple Regression

For practice, we first solve the example above treating Y as the response variable and all three predictors X1, X2, and X3 as explanatory variables (using the same data table as above).

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.946
R Square            0.895
Adjusted R Square   0.842
Standard Error      0.345
Observations        10

ANOVA
             df   SS      MS      F        Significance F
Regression    3   6.056   2.019   16.991   0.002
Residual      6   0.713   0.119
Total         9   6.769

                          Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept                  6.211        2.321            2.677   0.037     0.533     11.890
Distance travelled (X1)    0.014        0.022            0.636   0.548    −0.040      0.068
No. of deliveries (X2)     0.383        0.300            1.277   0.249    −0.351      1.117
Petrol price (X3)         −0.607        0.527           −1.152   0.293    −1.895      0.682

Regression equation: Ŷ = 6.211 + 0.014X1 + 0.383X2 − 0.607X3
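The Excel coefficients can be reproduced outside Excel (a sketch with NumPy's least-squares solver; a package such as statsmodels would also reproduce the full ANOVA table):

```python
import numpy as np

X1 = [89, 66, 78, 111, 44, 77, 80, 66, 109, 76]
X2 = [4, 1, 3, 6, 1, 3, 3, 2, 5, 3]
X3 = [3.84, 3.19, 3.78, 3.89, 3.57, 3.57, 3.03, 3.51, 3.54, 3.25]
Y  = [7, 5.4, 6.6, 7.4, 4.8, 6.4, 7, 5.6, 7.3, 6.4]

# Design matrix with an intercept column, then OLS via least squares.
A = np.column_stack([np.ones(10), X1, X2, X3])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(coef.round(3))  # ≈ [6.211, 0.014, 0.383, -0.607], matching the Excel output
```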

Manual Solution: Multiple Regression

For practice, suppose we use Y as the dependent variable and only X1 and X2 as independent variables (the same travel-time data as above, with X3 dropped).

Step 1: Calculate X1², X2², X1Y, X2Y, and X1X2.

Y     X1    X2   X1²     X2²   X1Y     X2Y    X1X2
7.0   89    4    7921    16    623.0   28.0   356
5.4   66    1    4356    1     356.4    5.4    66
6.6   78    3    6084    9     514.8   19.8   234
7.4   111   6    12321   36    821.4   44.4   666
4.8   44    1    1936    1     211.2    4.8    44
6.4   77    3    5929    9     492.8   19.2   231
7.0   80    3    6400    9     560.0   21.0   240
5.6   66    2    4356    4     369.6   11.2   132
7.3   109   5    11881   25    795.7   36.5   545
6.4   76    3    5776    9     486.4   19.2   228

Sum:  63.9  796  31   66960  119  5231.3  209.5  2742
Mean: 6.39  79.6 3.1

Step 2: Calculate the regression sums (with n = 10, ΣX1 = 796, ΣX2 = 31, ΣY = 63.9):

Σx1² = ΣX1² − (ΣX1)²/n = 66960 − 63361.6 = 3598.4
Σx2² = ΣX2² − (ΣX2)²/n = 119 − 96.1 = 22.9
Σx1y = ΣX1Y − (ΣX1 · ΣY)/n = 5231.3 − 5086.44 = 144.86
Σx2y = ΣX2Y − (ΣX2 · ΣY)/n = 209.5 − 198.09 = 11.41
Σx1x2 = ΣX1X2 − (ΣX1 · ΣX2)/n = 2742 − 2467.6 = 274.4

Calculating b, b1, and b2

The formula to calculate b1 is:
b1 = [(Σx2²)(Σx1y) − (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) − (Σx1x2)²]

The formula to calculate b2 is:
b2 = [(Σx1²)(Σx2y) − (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) − (Σx1x2)²]

The formula to calculate b is:
b = Ȳ − b1X̄1 − b2X̄2

Step 3: Calculate b, b1, and b2 (with Ȳ = 6.39, X̄1 = 79.6, X̄2 = 3.1):

b1 = (22.9 × 144.86 − 274.4 × 11.41) / (3598.4 × 22.9 − 274.4²) = 186.39 / 7108 ≈ 0.0262
b2 = (3598.4 × 11.41 − 274.4 × 144.86) / 7108 = 1308.16 / 7108 ≈ 0.1840
b = 6.39 − (0.02622)(79.6) − (0.18404)(3.1) = 3.7322

Step 4: Place b, b1, and b2 in the estimated linear regression equation:

Ŷ = 3.7322 + 0.0262X1 + 0.1840X2
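A sketch that mirrors the manual steps in code (regression sums first, then the b formulas):

```python
import numpy as np

X1 = np.array([89, 66, 78, 111, 44, 77, 80, 66, 109, 76], dtype=float)
X2 = np.array([4, 1, 3, 6, 1, 3, 3, 2, 5, 3], dtype=float)
Y  = np.array([7, 5.4, 6.6, 7.4, 4.8, 6.4, 7, 5.6, 7.3, 6.4])

# Step 2: regression (deviation) sums
s_x1x1 = ((X1 - X1.mean()) ** 2).sum()                # 3598.4
s_x2x2 = ((X2 - X2.mean()) ** 2).sum()                # 22.9
s_x1y  = ((X1 - X1.mean()) * (Y - Y.mean())).sum()    # 144.86
s_x2y  = ((X2 - X2.mean()) * (Y - Y.mean())).sum()    # 11.41
s_x1x2 = ((X1 - X1.mean()) * (X2 - X2.mean())).sum()  # 274.4

# Step 3: b1, b2, b from the two-predictor normal equations
den = s_x1x1 * s_x2x2 - s_x1x2 ** 2
b1 = (s_x2x2 * s_x1y - s_x1x2 * s_x2y) / den          # ≈ 0.0262
b2 = (s_x1x1 * s_x2y - s_x1x2 * s_x1y) / den          # ≈ 0.1840
b0 = Y.mean() - b1 * X1.mean() - b2 * X2.mean()       # ≈ 3.7322
print(b0, b1, b2)
```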

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.9335
R Square            0.8714
Adjusted R Square   0.8347
Standard Error      0.3526
Observations        10

ANOVA
             df   SS       MS       F         Significance F
Regression    2   5.8985   2.9493   23.7161   0.0008
Residual      7   0.8705   0.1244
Total         9   6.769

                Coefficients  Standard Error  t Stat   P-value
Intercept       3.7322        0.8870          4.2077   0.0040
X Variable 1    0.0262        0.0200          1.3101   0.2315
X Variable 2    0.1840        0.2509          0.7335   0.4871

Steps for a Manual Solution: Multiple Regression

Suppose we have a second dataset with one response variable Y and two predictor variables X1 and X2. [The raw data table from the slide is not reproduced in the transcript; only the column sums below survive.]

Step 1: Calculate X1², X2², X1Y, X2Y, and X1X2.

Step 2: Calculate the regression sums (with n = 8, ΣX1 = 555, ΣX2 = 145, ΣY = 1452):

ΣX1² = 38767, ΣX2² = 2823, ΣX1Y = 101895, ΣX2Y = 25364, ΣX1X2 = 9859

Step 3: Calculate b, b1, and b2 from the regression sums:

Σx1² = 263.875, Σx2² = 194.875, Σx1y = 1162.5, Σx2y = −953.5, Σx1x2 = −200.375

b1 = (194.875 × 1162.5 − (−200.375)(−953.5)) / (263.875 × 194.875 − (−200.375)²) = 35484.63 / 11272.51 ≈ 3.148
b2 = (263.875 × (−953.5) − (−200.375)(1162.5)) / 11272.51 = −18668.87 / 11272.51 ≈ −1.656
b = Ȳ − b1X̄1 − b2X̄2 = 181.5 − 3.148 × 69.375 + 1.656 × 18.125 ≈ −6.867 (using the unrounded b1 and b2)

Step 4: Place b, b1, and b2 in the estimated linear regression equation:

Ŷ = −6.867 + 3.148X1 − 1.656X2
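Even without the raw data table, the coefficients follow from the sums alone (a sketch reproducing the arithmetic above):

```python
# Regression sums from Step 2 (the raw data table was not captured in the transcript).
n, sum_y, sum_x1, sum_x2 = 8, 1452, 555, 145
s_x1x1, s_x2x2 = 263.875, 194.875
s_x1y, s_x2y, s_x1x2 = 1162.5, -953.5, -200.375

den = s_x1x1 * s_x2x2 - s_x1x2 ** 2
b1 = (s_x2x2 * s_x1y - s_x1x2 * s_x2y) / den        # ≈ 3.148
b2 = (s_x1x1 * s_x2y - s_x1x2 * s_x1y) / den        # ≈ -1.656
b0 = sum_y / n - b1 * sum_x1 / n - b2 * sum_x2 / n  # ≈ -6.867
print(b0, b1, b2)
```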

Thank you!