Chapter 4 Econometrics PPT for Management and Accounting Students: Guide and Lecture
About This Presentation
Econometrics is a foundational course for business and accounting students that builds a sound understanding of economic operations and practices. This document covers the course in detail and may help academicians, students, professionals, and practitioners develop an effective understanding of economic practices.
Slide Content
CHAPTER FOUR: NONLINEAR REGRESSION ANALYSIS / BINARY CHOICE MODELS
Nonlinear regression analysis, in this chapter, means regression analysis where the dependent variable Y is a qualitative or categorical variable. Such models are called limited dependent variable models, or qualitative/categorical variable models. We concentrate on the binary case, where $Y_i$ can take only two values. One example would be a model of women's labor force participation (LFP).
Cont'd …
The dependent variable in this case is labor force participation (LFP), which takes the value one (1) if the woman participates in the labor force and zero (0) if she does not. Various explanatory variables could be included: both continuous variables such as age and dichotomous variables such as gender or educational achievement. Whenever the response variable (Y) is a dummy, categorical, limited, binary, or qualitative variable, models of this kind apply. Such models are commonly used in social science and medical research, and they pose interesting estimation and interpretation challenges.
Cont'd …
The following are the most commonly used binary dependent variable models: the Linear Probability Model, the Logit Model, and the Probit Model.
4.1. Linear Probability Models
The linear probability model simply applies the ordinary least squares (OLS) method to a dichotomous dependent variable, using the linear form
$$P_i = \Pr(Y_i = 1 \mid X) = X\beta$$
Cont'd …
where $P_i$ is the probability that the i-th observation scores 1, $X$ is the matrix containing the values of the explanatory variables for all observations, and $\beta$ is a vector containing all the coefficients. Assume that we want to study the determinants of labor force participation (LFP) of adult men in a particular town. The mathematical model is given as
$$\Pr(\text{employed}_i = 1) = \beta_0 + \beta_1\,\text{married}_i + \beta_2\,\text{age}_i + \beta_3\,\text{school}_i + u_i$$
Suppose that we have data for 30 observations on labor force participation (employment), marital status, age, and years of schooling. The data are given in the table below.
Cont'd …
[Table: employment status, marital status, age, and years of schooling for the 30 observations (not reproduced)]
Cont'd …
Applying the ordinary least squares method to the linear probability model gives the regression results displayed in Figure 1.
[Figure 1: OLS regression output for the linear probability model (not reproduced)]
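A minimal Stata sketch of this estimation, assuming a dataset with an employment dummy and the regressors above (the variable names employed, married, age, and school are illustrative, not taken from the slides):

    * Linear probability model: OLS applied to a binary dependent variable
    regress employed married age school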
Cont'd …
The interpretation of LPM estimates is as straightforward as that of OLS estimates. For example, the coefficient of married is negative and significant at the 5% level of significance: keeping other things constant, the probability of being employed decreases by 0.38 as one moves from an unmarried to a married individual. Likewise, the coefficient of age means that, keeping other things constant, as the age of the individual increases by one year the probability of being employed decreases by 0.097 percentage points.
Limitations of the LPM
First, the LPM assumes that the probability of labor force participation moves linearly with the value of the explanatory variable, no matter how small or large that value is. Second, by logic, a probability value must lie between 0 and 1, but there is no guarantee that the estimated probability values from the LPM will lie within these limits (the sketch below illustrates the check). Third, the error term in the LPM is heteroscedastic, making the traditional significance tests suspect. For all these reasons, the LPM is not the preferred choice for modeling dichotomous variables. The alternatives discussed in the literature are the logit and probit models.
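A quick Stata check of the second limitation, assuming the LPM above has just been estimated (a sketch, same illustrative variable names):

    * Fitted values from the LPM can fall outside the [0, 1] interval
    quietly regress employed married age school
    predict phat, xb
    summarize phat
    count if phat < 0 | phat > 1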
4.3 Logit Regression Model (non-linear Probability Model)
The main objective of the Logit model is to ensure/guarantee that the predicted probability of the event occurring, given the value of the explanatory variable, remains within the [0, 1] bounds. That means 0 ≤ Pr(Y = 1|X) ≤ 1 for all X. This requires a nonlinear functional form for the probability, which is possible if we assume that the error term ($U_i$) follows some sort of cumulative distribution function (CDF). The two important nonlinear functions proposed for this are the logistic CDF and the normal CDF.
Cont'd …
The logistic CDF is given as follows:
$$P_i = \Pr(Y_i = 1 \mid X_i) = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}}, \qquad \text{where } Z_i = X_i\beta$$
It is easy to verify that as $Z_i$ ranges from $-\infty$ to $+\infty$, $P_i$ ranges between 0 and 1, and that $P_i$ is non-linearly related to $Z_i$ (and hence to $X_i$). That means $0 \le P_i \le 1$ for all real $Z_i$, which ensures that the predicted probability ($P_i$) lies strictly between 0 and 1. Thus, the Logit model satisfies the two conditions: 1) $0 \le P_i \le 1$, and 2) $P_i$ is non-linearly related to $X_i$.
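The "easy to verify" step amounts to two limits of the logistic CDF, written out here for completeness:
$$\lim_{Z_i \to -\infty} \frac{1}{1 + e^{-Z_i}} = 0, \qquad \lim_{Z_i \to +\infty} \frac{1}{1 + e^{-Z_i}} = 1$$
so $P_i$ stays inside (0, 1) for every finite $Z_i$.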
The odds ratio
When we use the Logit model, we limit the estimated probabilities to the 0-1 range. But while ensuring that the predicted probability of an event occurring ($P_i$) lies in the natural interval [0, 1], we have created an estimation problem: $P_i$ is nonlinear in the parameters and in the explanatory variables, so we cannot apply OLS. However, we can linearize the Logit model as follows. Take the ratio of the probability of the event occurring, in our case being employed ($P_i$), to the probability of the event not happening ($1 - P_i$); the resulting ratio is called the odds ratio.
Cont'd …
$$\frac{P_i}{1 - P_i} = \frac{e^{Z_i}/(1 + e^{Z_i})}{1/(1 + e^{Z_i})} = e^{Z_i}$$
To linearize the odds ratio, take the natural log of both sides. The resulting equation is called the log of the odds ratio (Logit):
$$L_i = \ln\!\left(\frac{P_i}{1 - P_i}\right) = Z_i = X_i\beta$$
where $L_i$ is the Logit (which is linearly related to $X_i$), $X$ is a matrix including all values of the explanatory variables, and $\beta$ is a vector including all the coefficients (the $\beta$'s).
4.3.1 Characteristics of the Logit Model
In the Logit model, the predicted probability ($P_i$) lies within the natural limits, $0 \le P_i \le 1$. Even though the Logit ($L_i$) is linear in X, the probabilities themselves are not. This property contrasts with the LPM, where the probabilities increase linearly with X. If a Logit coefficient is positive, then as the value of the regressor increases, the odds that the regressand equals 1 (meaning some event of interest happens) increase. If it is negative, the odds that the regressand equals 1 decrease as the value of X increases. Whereas the LPM assumes that $P_i$ is linearly related to $X_i$, the Logit model assumes that the log of the odds ratio is linearly related to $X_i$.
4.3.2 Estimation and interpretation of the Logit model
The most common way to estimate binary response models is the method of maximum likelihood (ML). ML estimation maximizes the likelihood function with respect to the parameters; a parameter vector at which the likelihood takes on its maximum value is called a maximum likelihood estimate (MLE) of the parameters. Let us first construct the likelihood function, i.e., the joint distribution. The likelihood contribution of observation i with $Y_i = 1$ is $P_i$, viewed as a function of the unknown parameter vector $\beta$, and similarly the contribution is $1 - P_i$ for an observation with $Y_i = 0$. Assuming the observations are independent, the likelihood function for the entire sample is given by the joint probability: the joint density of the entire sample is just the product of the densities of the individual observations.
4.3.2 Estimation and interpretation of the Logit model
Suppose we have a random sample of n observations. Letting $P_i$ denote the probability that $Y_i = 1$, the joint probability of observing the n values $Y_1, \ldots, Y_n$ is given as:
$$L(\beta) = \prod_{i=1}^{n} P_i^{Y_i} (1 - P_i)^{1 - Y_i}$$
NB: the likelihood of a Bernoulli variable is the probability of success to the power of $Y_i$ times the probability of failure to the power of $1 - Y_i$. The joint probability given in the above equation is known as the likelihood function (LF). Taking the natural logarithm of the equation, we obtain what is called the log likelihood function (LLF):
$$\ln L(\beta) = \sum_{i=1}^{n} \left[ Y_i \ln P_i + (1 - Y_i) \ln(1 - P_i) \right]$$
4.3.2 Estimation and interpretation of the Logit model
In ML our objective is to maximize the LF (or LLF), that is, to obtain the values of the unknown parameters in such a manner that the probability of observing the given Y's is as high (maximum) as possible. For this purpose, we differentiate the LLF partially with respect to each parameter, set the resulting expressions to zero, and solve. But the resulting expressions are highly nonlinear in the parameters and no explicit solutions can be obtained. However, the estimates of the parameters can easily be computed with the aid of software packages such as EViews, Stata, or any other software package. Thus, below we shall estimate Logit and Probit models using Stata.
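For the Logit model, the first-order conditions that the software solves numerically take a compact form (a standard result, written out here since the slide omits the algebra):
$$\frac{\partial \ln L}{\partial \beta} = \sum_{i=1}^{n} (Y_i - P_i) X_i = 0$$
Because $P_i = 1/(1 + e^{-X_i\beta})$ enters non-linearly, these equations have no closed-form solution and are solved iteratively (e.g., by Newton-Raphson).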
4.3.2 Estimation and interpretation of the Logit model
Equally important is the interpretation of the results. Therefore, below we discuss how to interpret the regression results of binary Logit models with the aid of numerical examples (estimated by maximum likelihood using software packages). There are three ways of interpreting the regression results of a Logit model:
A. Logit interpretation
B. Odds ratio interpretation
C. Probability interpretation (Marginal Effect Interpretation)
For example, assume that we want to study the determinants of labor force participation (LFP) of adult men in a particular town, and suppose that we have data for 30 observations on labor force participation (employment), marital status, age, and years of schooling.
4.3.2 Estimation and interpretation of the Logit model
There are three sets of regression results.
A) Logit (log of the odds ratio) output and interpretation: this is the direct interpretation of the estimated log-odds coefficients.
[Figure: Stata logit output (not reproduced)]
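A minimal sketch of the estimation in Stata, under the same illustrative variable names used earlier:

    * Binary logit estimated by maximum likelihood
    logit employed married age school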
Cont'd …
The coefficient of married implies that as one moves from unmarried to married individuals, the log of the odds in favor of being employed decreases by 2.61. The coefficient of age means that a unit change (a one-year increase) in the explanatory variable leads to a 0.0098 decrease in the log-odds in favor of success (being employed), keeping other things constant. The Pseudo R-squared replaces the R² used in linear regression: it compares the unrestricted log likelihood of the model we are estimating with the restricted log likelihood of a model with only an intercept.
Cont'd …
Next to the Pseudo R², the Likelihood Ratio (LR) test can be used to judge whether the model as a whole significantly explains the variation in the regressand (Y).
B) The odds ratio interpretation: the odds ratio is the ratio of the probability of success to the probability of failure.
Cont'd …
An odds ratio greater than 1 means the probability of success is greater than the probability of failure, and vice versa; if the odds ratio is 1, the probability of success equals the probability of failure. Example: age's odds ratio is 0.99, which is less than one, implying that the probability of success is less than the probability of failure. This indicates that as age increases by one year, the odds in favor of being employed are multiplied by 0.99 (a decrease of about 1%); see the sketch below.
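The same logit can be reported directly in odds-ratio form (a sketch, same illustrative variables):

    * Report exponentiated coefficients (odds ratios) instead of log-odds
    logit employed married age school, or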
C) Probability interpretation (Marginal Effect Method)
This shows how the probability of success changes as an independent variable changes. For the Logit model specified above, the marginal effect of regressor $X_j$ is
$$\frac{\partial P_i}{\partial X_{ij}} = P_i (1 - P_i)\, \beta_j$$
Cont'd …
The marginal effects after logit show the effect of each explanatory variable on the probability of being employed. dy/dx of married = -0.4849 indicates that, keeping other variables constant at their average level, as one moves from unmarried to married the probability of being employed decreases by 48.49%. Similarly, dy/dx of age = -0.0021 indicates that as age increases by one year, the probability of being employed decreases by 0.21% (with all other explanatory variables held at their averages).
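A sketch of obtaining these marginal effects in Stata (the slide's wording suggests the older mfx command; margins is the current equivalent):

    * Marginal effects at the means of the regressors, after logit
    quietly logit employed married age school
    margins, dydx(*) atmeans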
4.4 The Probit Regression (non-linear model)
The probit model is similar to logit modeling, but to explain the behavior of a dichotomous dependent variable (Y) it uses the normal cumulative distribution function:
$$P_i = \Pr(Y_i = 1 \mid X_i) = \Phi(Z_i) = \int_{-\infty}^{Z_i} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt$$
Such a model is therefore sometimes called the normit model.
Cont'd …
[Figure: the logistic distribution function $P_i$ plotted against the cumulative normal distribution function $G(Z) = P_i = \Phi(Z)$ (not reproduced)]
Cont'd …
As can be seen from the graph above, the logistic distribution has flatter tails than the normal distribution. This is because the variance of the logistic distribution ($\pi^2/3 \approx 3.29$) is greater than the variance of the standard normal distribution (1). The difference between the coefficients of the Logit model and those of the Probit model is accounted for by this difference in the variances of the two distributions.
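A matching probit sketch in Stata (same illustrative variables; the coefficients will differ from the logit ones roughly by the scale factor implied by the variance difference above):

    * Binary probit: normal CDF in place of the logistic CDF
    probit employed married age school
    margins, dydx(*) atmeans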
4.5 Measuring Goodness-of-Fit in Logit and Probit Models: Pseudo R²
A goodness-of-fit measure is a summary statistic indicating the accuracy with which the model approximates the observed data. However, the conventional measure of goodness of fit, R², is not particularly meaningful in binary regressand models. When the dependent variable is qualitative, accuracy can be judged either in terms of the fit between the calculated probabilities and the observed response frequencies, or in terms of the model's ability to forecast the observed responses. Measures similar to R², called pseudo R², are available. Note, however, that contrary to the linear regression model, there is no single measure of goodness of fit in binary choice models, and a variety of measures exists.
4.5 Measuring Goodness-of-Fit in Logit and Probit Models
The most common goodness-of-fit measure is the one proposed by McFadden (1974), defined as:
$$\text{Pseudo } R^2 = 1 - \frac{\ln L_{ur}}{\ln L_0}$$
where, as before, $\ln L_{ur}$ is the log likelihood of the unrestricted model, and $\ln L_0$ is that of a regression (either Probit or Logit; note that this measure is also reported by statistical packages for many regressions involving MLE) with only the intercept. The Pseudo R² measures fit using the likelihood function: it measures the improvement in the log likelihood relative to having no explanatory variables ($\ln L_0$). For instance, in our estimated Logit and Probit models above, the Pseudo R² is about 0.37, suggesting that the log-likelihood value improves by about 37% with the introduction of the set of regressors. Similar to the R² of the linear regression model, it holds that $0 \le \text{Pseudo } R^2 \le 1$. An increasing Pseudo R² may indicate a better fit of the model, but no simple interpretation like that of the linear regression R² is possible.
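The McFadden measure can also be recovered by hand from Stata's stored results (a sketch; e(ll) and e(ll_0) are the log likelihoods Stata saves after logit or probit):

    * McFadden pseudo R-squared from stored log likelihoods
    quietly logit employed married age school
    display "McFadden pseudo R2 = " 1 - e(ll)/e(ll_0)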
4.6. Hypothesis Testing: Joint Significance in Qualitative Response Regression Models
As the equivalent of the F test in the linear regression model, there are various ways of testing multiple restrictions in Probit and Logit models. The most commonly used and most easily calculated test is the Likelihood Ratio (LR) test. It is used when we wish to test exclusion restrictions, i.e., whether we should or should not exclude a set of variables. The idea is a simple one: since what we are maximizing is the log likelihood function, excluding variables from the regression relationship makes the objective function fall. The question then is whether the fall in the log likelihood value is significant.
4.6. Hypothesis Testing: Joint Significance in Qualitative Response Regression Models
Like the F-test in linear regression models, the joint hypothesis tested by the LR test is that all the explanatory variables are simultaneously irrelevant, versus the alternative that at least one regressor is relevant. The likelihood ratio statistic is just twice the difference between the log likelihoods of the two models, the unrestricted (with all regressors included) and the restricted (with only the intercept term):
$$LR = 2(\ln L_{ur} - \ln L_r)$$
where $\ln L_{ur}$ and $\ln L_r$ are the log likelihoods of the unrestricted and restricted models. Under the null hypothesis, the LR statistic asymptotically follows the $\chi^2$ distribution with degrees of freedom equal to the number of explanatory variables.
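A sketch of the LR test in Stata using stored estimates (illustrative variable names as before):

    * LR test: full model versus intercept-only model
    quietly logit employed married age school
    estimates store full
    quietly logit employed
    estimates store null
    lrtest full null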