Probit, Tobit, and Logit Models in Analysis

RenaYunita2 1 views 50 slides Oct 20, 2025


Slide Content

LOGIT, PROBIT, AND TOBIT MODELS

What is regression analysis?
Regression analysis is the technique of estimating the unknown value of a dependent variable from the known values of one or more independent variables. Examples: the effect of a price increase on demand, or the effect of changes in the money supply on the inflation rate.

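Regression lines
A regression line is the line that best describes the linear relationship between two variables: y = a + bx.
[Figure: two regression lines with intercept a, Y = a + bX (upward sloping) and Y = a - bX (downward sloping), plotted with X on the horizontal axis and Y on the vertical axis.]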
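As a minimal sketch of how the line y = a + bx is obtained, the snippet below computes the least-squares intercept and slope for a single regressor. The function name `fit_line` and the price/demand figures are hypothetical, chosen only to echo the price-and-demand example.

```python
def fit_line(xs, ys):
    """Return least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: price (x) and quantity demanded (y).
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
demand = [9.0, 7.0, 5.0, 3.0, 1.0]
a, b = fit_line(prices, demand)
print(a, b)
```

A negative b corresponds to the downward-sloping case Y = a - bX.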

Assumptions for regression
- Measurement: all independent variables are interval, ratio, or dichotomous; the dependent variable is interval or ratio.
- Specification: the relationship between the dependent and independent variables is linear.
- Expected value of the error term: zero.

Assumptions for regression (contd.)
- Homoscedasticity: the variance of the error term is constant.
- Normality of errors: the errors are normally distributed for each set of values of the independent variables.
- Absence of multicollinearity.

Limitations of linear regression
Violation of the measurement assumptions:
- Dependent variable: it may be dichotomous, e.g., smoker/non-smoker, adopter/non-adopter, participating/non-participating.
- Independent variable: a dichotomous IV, e.g., male/female, has to be entered as a 0/1 dummy variable.

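Shall we use the linear probability model (LPM)? Yes, but it has problems:
- Non-normality of the errors $u_i$.
- Heteroscedastic variances of the errors.
- Non-fulfillment of $0 \le E(Y_i \mid X_i) \le 1$: fitted probabilities can fall outside the unit interval.
- Questionable value of $R^2$ as a measure of goodness of fit.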
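The range problem of the LPM can be illustrated with a small sketch: ordinary least squares is fitted to a 0/1 outcome and the fitted "probabilities" are inspected. The data and the helper `fit_line` are hypothetical and serve only as an illustration.

```python
def fit_line(xs, ys):
    """Return least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical data: x is a regressor, y = 1 for adopters and 0 for non-adopters.
x_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
adopted = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

a, b = fit_line(x_values, adopted)
fitted = [a + b * x for x in x_values]

# Fitted values at the extremes fall below 0 and above 1,
# so they cannot be read as probabilities.
print(min(fitted), max(fitted))
```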

What is the way out? The logit, probit, and tobit models.

Presentation on Logit, Probit, and Tobit Models
Rabeesh Kumar Verma, Roll no.: 10756, Division of Agricultural Extension, ICAR-IARI

What is logistic regression?
- It is used to analyze relationships between a dichotomous dependent variable and metric or dichotomous independent variables.
- It combines the independent variables to estimate the probability that a particular event will occur.
- Logistic regression (LR) is a nonlinear regression model that constrains the predicted probabilities to lie between 0 and 1; each case can then be classified as 0 or 1.
- In the terminology of economics, it is a qualitative response (discrete choice) model.

Assumptions of logistic regression
- A linear relationship between the dependent and independent variables is not required.
- Homoscedasticity is not required.
- The error terms need to be independent.
- Fairly large sample sizes are required.
- There must be no perfect multicollinearity.
- Normality, linearity, and homogeneity of variance of the independent variables are not required.

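[Figure: cumulative distribution function (CDF); the probability P rises from 0 to 1 as the horizontal axis runs from -∞ to +∞.]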

Features of the logit model
- As P goes from 0 to 1, the logit L goes from -∞ to +∞. That is, although probabilities lie between 0 and 1, logits are not so bounded.
- L is linear in X, but the probabilities themselves are not. This contrasts with the LPM, where probabilities increase linearly with X.
- If the logit L is positive, then as the value of the regressor(s) increases, the odds that the regressand equals 1 increase; if L is negative, those odds decrease.
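The relation between P and L can be checked numerically. The sketch below uses two helper names, `logit` and `inverse_logit`, that are introduced here for illustration; it shows that L is unbounded near P = 0 and P = 1, and that equal steps in L produce unequal steps in P.

```python
import math


def logit(p):
    """Log-odds L = ln(P / (1 - P)) for a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(l):
    """Probability P = 1 / (1 + exp(-L)) for any real logit L."""
    return 1.0 / (1.0 + math.exp(-l))


# L grows without bound as P approaches 0 or 1.
for p in (0.001, 0.01, 0.5, 0.99, 0.999):
    print(p, round(logit(p), 3))

# Equal steps in L give unequal steps in P (P is not linear in L).
steps = [inverse_logit(l) for l in (0.0, 1.0, 2.0, 3.0)]
print([round(s, 3) for s in steps])
```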

Level of measurement
- Logistic regression analysis requires the dependent variable to be dichotomous.
- It requires the independent variables to be metric or dichotomous.
- If an independent variable is nominal and not dichotomous, the logistic regression procedure in SPSS has an option to dummy code the variable.
- If an independent variable is ordinal, the usual caution about treating ordinal data as metric applies.

Variables in logistic regression
A typical logistic regression analysis has one dichotomous dependent variable and, usually, a set of independent variables that may be dichotomous, quantitative, or some combination of the two.

Sample size: logit and probit models
The minimum number of cases per independent variable is 10 (Hosmer and Lemeshow, Applied Logistic Regression). For example, with 8 independent variables the minimum sample size is 8 x 10 = 80.

Logistic regression equation
$\ln\left(\frac{P_i}{1 - P_i}\right) = a + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$
where
- $P_i$ = probability that the event happens, e.g., adoption of a technology
- $1 - P_i$ = probability that the event does not happen, e.g., non-adoption of the technology
- $X_1, X_2, \dots, X_n$ = independent variables
- $b_1, b_2, \dots, b_n$ = regression coefficients
- $a$ = constant (intercept)

Example
Dependent variable: adoption / non-adoption

| Independent variable | Description | Hypothesized relation |
| --- | --- | --- |
| Age | Age of the farmer in years | Negative (-) |
| Education | Number of years of formal schooling | Positive (+) |
| Land holding | Farm size in acres | Positive (+) |
| Access to training | Yes = 1 / no = 0 | Positive (+) |
| Distance to market | In kilometers | Negative (-) |
| Access to credit | Yes = 1 / no = 0 | Positive (+) |
| Extension services | Yes = 1 / no = 0 | Positive (+) |
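To make the logit equation concrete for this adoption example, the sketch below turns a set of coefficients into a predicted probability of adoption. All coefficient values and the intercept are hypothetical: only their signs follow the hypothesized relations, and the names `COEFFS`, `INTERCEPT`, and `adoption_probability` are introduced here for illustration.

```python
import math

# Hypothetical coefficients: signs follow the hypothesized relations,
# magnitudes are illustrative and not estimated from any data.
INTERCEPT = -0.5
COEFFS = {
    "age": -0.03,
    "education": 0.10,
    "land_holding": 0.05,
    "training": 0.80,
    "distance_to_market": -0.04,
    "credit": 0.60,
    "extension": 0.50,
}


def adoption_probability(farmer):
    """Compute L = a + sum(b_k * X_k), then P = 1 / (1 + exp(-L))."""
    l = INTERCEPT + sum(COEFFS[name] * farmer[name] for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-l))


farmer = {"age": 45, "education": 8, "land_holding": 5, "training": 0,
          "distance_to_market": 10, "credit": 1, "extension": 0}
trained = dict(farmer, training=1)

print(round(adoption_probability(farmer), 3))
print(round(adoption_probability(trained), 3))
```

Changing `training` from 0 to 1 raises L by the training coefficient, so the predicted probability rises, in line with the positive hypothesized relation.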

Logit in SPSS

Logit in SPSS (contd.)

Logit model in Stata

Logit: predicted probabilities

Logit: odds ratio

Case 1: A Logit Analysis of Bt Cotton Adoption and Assessment of Farmers' Training Need (Padaria et al., 2009)

Case 1 (contd.): reading the SPSS output (Padaria et al., 2009)
- B: the regression coefficient.
- S.E.: the standard errors associated with the coefficients.
- Wald and Sig.: the Wald chi-square value and the two-tailed p-value used in testing the null hypothesis that the coefficient (parameter) is 0. They indicate whether an independent variable is significant in the model.
- df: the degrees of freedom for the Wald chi-square test.
- Exp(B): the exponentiation of the B coefficient, which is an odds ratio. This value is given by default because odds ratios can be easier to interpret than the coefficient.
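The quantities in such an output table are linked by simple arithmetic, sketched below for a single coefficient. The function name `wald_summary` and the example values of B and its standard error are hypothetical; the p-value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math


def wald_summary(b, se):
    """Return (Wald chi-square, two-tailed p-value, odds ratio) for one coefficient."""
    wald = (b / se) ** 2
    # Upper-tail probability of a chi-square with 1 df: P(Z^2 > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    odds_ratio = math.exp(b)  # Exp(B)
    return wald, p_value, odds_ratio


# Hypothetical coefficient and standard error.
wald, p_value, odds_ratio = wald_summary(b=0.8, se=0.4)
print(round(wald, 3), round(p_value, 4), round(odds_ratio, 3))
```

An odds ratio above 1 means the odds of the event rise with the regressor; below 1, they fall.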

Advantages of the logit model
- It transforms a dichotomous dependent variable into a continuous variable (the logit).
- Results are easy to interpret, and the method is simple to apply.
- It gives parameter estimates that are asymptotically consistent, efficient, and normal, so the analogue of the regression t-test can be applied.

Limitations
- As in the linear probability model, the disturbance term in the logit model is heteroscedastic when the model is fitted by least squares on grouped data, so weighted least squares should be used.
- As in many other regressions, there may be a multicollinearity problem if the explanatory variables are related among themselves.

Applications of the logit model
1. It can be used to identify the factors that affect the adoption of a particular technology on the farm, such as new crop varieties, fertilizers, or pesticides.
2. It is used extensively to analyze growth phenomena such as population, GNP, and money supply.
3. In marketing, it can be used to study brand preference and brand loyalty.
4. Gender studies can use logit analysis to find the factors that affect the decision-making status of men and women in the family.

Probit regression model
- The probit model is a type of regression in which the dependent variable can take only two values, for example adoption or non-adoption, married or not married.
- The purpose of the model is to estimate the probability that the event occurs.
- The estimating model that emerges from the normal cumulative distribution function (CDF) is popularly known as the probit model.
- It is sometimes also called the normit model.
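A minimal sketch of the probit probability follows, assuming a single regressor and a linear index a + bX passed through the standard normal CDF. The coefficient values and the name `probit_probability` are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit_probability(a, b, x):
    """P(Y = 1 | X = x) = Phi(a + b*x), where Phi is the standard normal CDF."""
    return STANDARD_NORMAL.cdf(a + b * x)


# Hypothetical intercept and slope.
for x in (0, 2, 4, 6, 8):
    print(x, round(probit_probability(a=-2.0, b=0.5, x=x), 3))
```

When the index a + bX equals 0 the predicted probability is exactly 0.5, and it approaches 0 or 1 as the index moves far from 0.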

Probit: level of measurement requirements
- Dependent variable: dichotomous/categorical, e.g., adoption and non-adoption, participation and non-participation.
- Independent variables: metric or dichotomous, e.g., age (ratio-level data), gender (male/female, dichotomous).

Case 2: Factors Affecting Adoption of Improved Rice Varieties among Rural Farm Households in Central Nepal (Ghimire, 2015; published in Rice Science)

Probit result (contd.)

Difference between the logit and probit models

| Logit | Probit |
| --- | --- |
| Slightly fatter tails: the conditional probability $P_i$ approaches 0 or 1 at a slower rate | The conditional probability $P_i$ approaches 0 or 1 at a faster rate |
| Based on the standard logistic distribution | Based on the standard normal distribution |
| Variance = $\pi^2 / 3$ | Variance = 1 |
| Simple mathematics | More sophisticated mathematics |

Both give similar results. The choice of method depends on the researcher, but logit regression is mostly preferred.
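The tail behaviour and the variance of the two distributions can be compared numerically. In the sketch below the logistic CDF is rescaled by its standard deviation, $\pi / \sqrt{3}$, so that both curves have unit variance; the names `logistic_cdf`, `normal_cdf`, and `LOGISTIC_SD` are introduced here for illustration.

```python
import math
from statistics import NormalDist


def logistic_cdf(z):
    """CDF of the standard logistic distribution (variance pi^2 / 3)."""
    return 1.0 / (1.0 + math.exp(-z))


normal_cdf = NormalDist().cdf          # CDF of the standard normal (variance 1)
LOGISTIC_SD = math.pi / math.sqrt(3)   # standard deviation of the standard logistic

# Compare both CDFs at the same number of standard deviations from the mean.
for k in (-4, -3, -2, -1, 0):
    print(k, logistic_cdf(k * LOGISTIC_SD), normal_cdf(k))
```

Far from the mean the logistic value stays further from 0 than the normal value, which is the fatter-tail property: the probit probability reaches 0 or 1 faster.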

Significance of the Wald test
- It tests the statistical significance of the unique contribution of each coefficient in the model.
- It is similar to the t-test in multiple regression.

Ordinal logit and probit models
- Both are used when the outcome has more than two categories and the categories are ordinal.
- Dependent variable, example 1: a Likert-type scale (strongly agree, somewhat agree, strongly disagree).
- Dependent variable, example 2: less than high school (0), high school (1), college (2), postgraduate (3).
- The independent variables are the same as in the logit and probit models.
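One common way to write an ordinal logit is the cumulative (proportional-odds) form, in which P(Y <= j) is a logistic function of a cutpoint minus a linear index. The slides do not state which form is meant, so the sketch below is an assumption, and the cutpoints, the index value, and the function name are hypothetical.

```python
import math


def ordered_logit_probabilities(index, cutpoints):
    """Category probabilities for an ordinal outcome.

    Assumes the cumulative form P(Y <= j) = 1 / (1 + exp(-(cutpoint_j - index)))
    with cutpoints in increasing order; returns one probability per category.
    """
    cumulative = [1.0 / (1.0 + math.exp(-(c - index))) for c in cutpoints]
    bounds = [0.0] + cumulative + [1.0]
    return [bounds[j + 1] - bounds[j] for j in range(len(bounds) - 1)]


# Four education categories (0..3) need three cutpoints; values are hypothetical.
probs = ordered_logit_probabilities(index=0.7, cutpoints=[-1.0, 0.5, 2.0])
print([round(p, 3) for p in probs])
```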

Multinomial logit and multinomial probit
- Used when the dependent variable has more than two categories and the categories are not ordinal.
- Example 1: adoption of different adaptation strategies, or choice of transportation to work.
- Example 2: occupation classification (unskilled, semiskilled, highly skilled).

Multinomial logit model (Kassie et al., 2008)
Dependent variable: compost, conservation tillage, or both. There are three categories, i.e., more than two.
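In a multinomial logit, each category has its own linear index and the probabilities are obtained by normalising the exponentiated indices, with one base category whose index is fixed at 0. The sketch below uses the three categories named above; which category serves as the base and the index values are hypothetical, not taken from Kassie et al.

```python
import math


def multinomial_logit_probabilities(indices):
    """Map {category: linear index} to {category: probability}.

    The base category is the one whose index is set to 0.
    """
    # Subtract the largest index before exponentiating for numerical stability.
    largest = max(indices.values())
    exps = {name: math.exp(value - largest) for name, value in indices.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


# Hypothetical indices for one household; "compost" is taken as the base category.
indices = {"compost": 0.0, "conservation tillage": 0.4, "both": -0.3}
probs = multinomial_logit_probabilities(indices)
print({name: round(p, 3) for name, p in probs.items()})
```

The ratio of any two probabilities equals the exponential of the difference of their indices, which is why coefficients are read relative to the base category.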

Tobit model
- An extension of the probit model, developed by James Tobin (Nobel laureate economist).
- It is used when information on the regressand is available only for some observations in the sample. Such a sample is called a censored sample.
- The tobit model is therefore also known as the censored regression model.
- Such models are sometimes also called limited dependent variable regression models.

Tobit model (contd.): example
Suppose we have a set of consumers and we want to find out how much money a person or family spends on a house in relation to socioeconomic variables. Here we face a dilemma: if a consumer does not purchase a house, we obviously have no data on housing expenditure for that consumer. We have such data only for the consumers who actually purchase a house.

Thus, consumers are divided into two groups of, say, n1 and n2 consumers:
- n1: consumers for whom we have information on the regressors (say income, number of people in the household, mortgage interest rate) as well as the regressand (amount of expenditure on housing).
- n2: consumers for whom we have information only on the regressors, not on the regressand.
Now a question arises.

Can we estimate the regression using only the n1 observations and not worry about the remaining n2 observations? The answer is no: the OLS estimates of the parameters obtained from the subset of n1 observations will be biased as well as inconsistent.

Statistically, the tobit model can be expressed as
$Y_i = \beta_1 + \beta_2 X_i + u_i$ if RHS $> 0$
$Y_i = 0$ otherwise
where RHS = right-hand side.
Note: additional X variables can easily be added to the model.

Truncated sample
A truncated sample must be distinguished from a censored sample. In a truncated sample, information on the regressors (IVs) is available only if the regressand (DV) is observed.

If we estimate a regression line based on the n1 observations only, the resulting intercept and slope coefficients are bound to differ from those obtained if all (n1 + n2) observations were taken into account.
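A small simulation can make this bias visible. The sketch below generates data from a latent linear model, censors it at zero, and fits OLS to the n1 observations with a positive regressand only. The true parameter values, the sample size, the distributions, and the helper `fit_line` are all assumptions chosen for illustration.

```python
import random


def fit_line(xs, ys):
    """Return least-squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


random.seed(0)
TRUE_BETA1, TRUE_BETA2 = 0.0, 1.0   # assumed true intercept and slope

xs, ys = [], []
for _ in range(5000):
    x = random.uniform(-2.0, 2.0)
    latent = TRUE_BETA1 + TRUE_BETA2 * x + random.gauss(0.0, 1.0)
    xs.append(x)
    ys.append(max(latent, 0.0))      # regressand is censored at zero

# n1: observations with a positive regressand; the rest form n2.
n1 = [(x, y) for x, y in zip(xs, ys) if y > 0]
intercept_n1, slope_n1 = fit_line([x for x, _ in n1], [y for _, y in n1])

print(len(n1), round(intercept_n1, 3), round(slope_n1, 3))
```

In this setup the slope estimated from n1 alone falls well below the assumed true slope of 1, which is the kind of bias the tobit model is meant to address.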

Mechanics of estimating the tobit model
- Tobit models are estimated by the method of maximum likelihood (ML).
- James Heckman has proposed an alternative to ML that is comparatively easy.
- The Heckman procedure yields consistent estimates of the parameters, but they are not as efficient as the ML estimates.
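For reference, the log-likelihood that ML maximises in a tobit model censored at zero can be sketched as follows, assuming normally distributed errors with standard deviation sigma and a single regressor. Uncensored observations contribute a normal density term and censored ones contribute the probability of falling at or below zero. The function name and the test values are hypothetical; in practice an optimiser would search over the parameters.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def tobit_log_likelihood(beta1, beta2, sigma, xs, ys):
    """Log-likelihood of a tobit model censored at zero (assumes normal errors)."""
    total = 0.0
    for x, y in zip(xs, ys):
        mean = beta1 + beta2 * x
        if y > 0:
            # Uncensored: density of y under N(mean, sigma^2).
            z = (y - mean) / sigma
            total += math.log(STANDARD_NORMAL.pdf(z) / sigma)
        else:
            # Censored: probability that the latent value is at or below zero.
            total += math.log(STANDARD_NORMAL.cdf(-mean / sigma))
    return total


# Hypothetical data: two censored and two uncensored observations.
xs = [-1.0, 0.0, 1.0, 2.0]
ys = [0.0, 0.0, 1.2, 1.9]
print(round(tobit_log_likelihood(0.0, 1.0, 1.0, xs, ys), 4))
```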

Nested regression analysis
A nested model is one in which you incrementally add variables such that every subsequent model is a superset of the preceding one. For example, if y = a + bx is the first model, then the second model would be something like y = a + bx + cz + .... The advantage of this setup is that it allows you to compare different specifications and ultimately investigate the relative importance of specific variables.

Note that a model is nested if and only if the next model contains exactly the same terms as the preceding one plus at least one additional term. A two-stage model, on the other hand, is one in which two equations are estimated one after the other, with the second-stage equation including a predicted value (usually the predicted outcome or the residuals) from the first-stage equation.

Conclusion
- Be clear about the assumptions of the different regression models.
- Researchers should be well aware of the different models and use them according to the defined research problem.
- Logit and probit models are used extensively in health science and in the behavioral and social sciences.
- These models are used extensively in social research when the dependent variable is dichotomous.

References
- Meyers, L. S., Gamst, G., & Guarino, A. J. (2006). Applied Multivariate Research: Design and Interpretation.
- Padaria et al. (2009). A Logit Analysis of Bt Cotton Adoption and Assessment of Farmers' Training Need. Indian Res. J. Ext. Edu. 9(2).
- Gujarati, D. N., et al. (2012). Basic Econometrics. McGraw Hill Education, India.

Thank you