Econometrics, Third Edition, Instructor's Manual: CHAPTER 17

CHAPTER 17

TEACHING NOTES

I emphasize to the students that, first and foremost, the reason we use the probit and logit models is to obtain more reasonable functional forms for the response probability. Once we move to a nonlinear model with a fully specified conditional distribution, it makes sense to use the efficient estimation procedure, maximum likelihood. It is important to spend some time on interpreting probit and logit estimates. In particular, the students should know the rules-of-thumb for comparing probit, logit, and LPM estimates. Beginners sometimes mistakenly think that, because the probit and especially the logit estimates are much larger than the LPM estimates, the explanatory variables now have larger estimated effects on the response probabilities than in the LPM case. This may or may not be true.

I view the Tobit model, when properly applied, as improving functional form for corner solution outcomes. In most cases it is wrong to view a Tobit application as a data-censoring problem (unless there is true data censoring in collecting the data or because of institutional constraints). For example, in using survey data to estimate the demand for a new product, say a safer pesticide to be used in farming, some farmers will demand zero at the going price, while some will demand positive pounds per acre. There is no data censoring here; some farmers find it optimal to use none of the new pesticide. The Tobit model provides more realistic functional forms for E(y|x) and E(y|y > 0,x) than a linear model for y. With the Tobit model, students may be tempted to compare the Tobit estimates with those from the linear model and conclude that the Tobit estimates imply larger effects for the independent variables. But, as with probit and logit, the Tobit estimates must be scaled down to be comparable with OLS estimates in a linear model. [See Equation (17.27); for an example, see Computer Exercise C17.3.]
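The scaling point can be made concrete with a short Python sketch. It assumes that the adjustment factor referred to in Equation (17.27) is the standard normal cdf evaluated at xβ/σ, so that the partial effect of x_j on E(y|x) is β_j Φ(xβ/σ). The function names are illustrative only. The inputs are the index for a white male (940.9), the estimate of σ (677.74), and the tenure coefficient (36.02) reported in the solution to Computer Exercise C17.3.

```python
import math


def norm_cdf(z: float) -> float:
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_scale_factor(xb: float, sigma: float) -> float:
    """Assumed form of the Tobit adjustment factor: Phi(xb / sigma)."""
    return norm_cdf(xb / sigma)


def tobit_partial_effect(beta_j: float, xb: float, sigma: float) -> float:
    """Scaled Tobit coefficient, comparable to an OLS slope on E(y|x)."""
    return beta_j * tobit_scale_factor(xb, sigma)


# Values taken from the C17.3 solution (column (1) of the Tobit table).
scale = tobit_scale_factor(940.9, 677.74)
effect = tobit_partial_effect(36.02, 940.9, 677.74)
print(f"scale factor = {scale:.3f}, scaled tenure effect = {effect:.2f}")
```

Because the factor is a probability, it always lies strictly between zero and one, which is why the raw Tobit coefficient overstates the partial effect on E(y|x).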

Poisson regression with an exponential conditional mean is used primarily to improve over a linear functional form for E(y|x). The parameters are easy to interpret as semi-elasticities or elasticities. If the Poisson distributional assumption is correct, we can use the Poisson distribution to compute probabilities, too. But overdispersion is often present in count regression models, and standard errors and likelihood ratio statistics should be adjusted to reflect this. Some reviewers of the first edition complained about either the inclusion of this material or its location within the chapter. I think applications of count data models are on the rise: in microeconometric fields such as criminology, health economics, and industrial organization, many interesting response variables come in the form of counts. One suggestion was that Poisson regression should not come between the Tobit model in Section 17.2 and Section 17.4, on censored and truncated regression. In fact, I put the Poisson regression model between these two topics on purpose: I hope it helps emphasize that the material in Section 17.2 is purely about functional form, as is Poisson regression.

Sections 17.4 and 17.5 deal with underlying linear models, but where there is a data-observability problem. Censored regression, truncated regression, and incidental truncation are used for missing data problems. Censored and truncated data sets usually result from sample design, as in duration analysis. Incidental truncation often arises from self-selection into a certain state, such as employment or participating in a training program. It is important to emphasize to students that the underlying models are classical linear models; if not for the missing data or sample selection problem, OLS would be the efficient estimation procedure.

SOLUTIONS TO PROBLEMS

17.1 (i) Let m0 denote the number (not the percent) correctly predicted when yi = 0 (so the prediction is also zero) and let m1 be the number correctly predicted when yi = 1. Then the proportion correctly predicted is (m0 + m1)/n, where n is the sample size. By simple algebra, we can write this as

$(n_0/n)(m_0/n_0) + (n_1/n)(m_1/n_1) = (1 - \bar{y})(m_0/n_0) + \bar{y}(m_1/n_1)$,

where we have used the fact that $\bar{y} = n_1/n$ (the proportion of the sample with yi = 1) and $1 - \bar{y} = n_0/n$ (the proportion of the sample with yi = 0). But m0/n0 is the proportion correctly predicted when yi = 0, and m1/n1 is the proportion correctly predicted when yi = 1. Therefore, we have

$(m_0 + m_1)/n = (1 - \bar{y})(m_0/n_0) + \bar{y}(m_1/n_1)$.

If we multiply through by 100 we obtain

$\hat{p} = (1 - \bar{y})\hat{q}_0 + \bar{y}\hat{q}_1$,

where we use the fact that, by definition, $\hat{p} = 100[(m_0 + m_1)/n]$, $\hat{q}_0 = 100(m_0/n_0)$, and $\hat{q}_1 = 100(m_1/n_1)$.

(ii) We just use the formula from part (i): $\hat{p}$ = .30(80) + .70(40) = 52. Therefore, overall we correctly predict only 52% of the outcomes. This is because, while 80% of the time we correctly predict y = 0, yi = 0 accounts for only 30 percent of the outcomes. More weight (.70) is given to the predictions when yi = 1, and we do much less well predicting that outcome (getting it right only 40% of the time).

17.2 We need to compute the estimated probability first at hsGPA = 3.0, SAT = 1,200, and study = 10 and subtract from it the estimated probability with hsGPA = 3.0, SAT = 1,200, and study = 5. To obtain the first probability, we start by computing the linear function inside $\Lambda(\cdot)$: -1.77 + .24(3.0) + .00058(1,200) + .073(10) = .376. Next, we plug this into the logit function: exp(.376)/[1 + exp(.376)] ≈ .593. This is the estimated probability that a student-athlete with the given characteristics graduates in five years. For the student-athlete who attended study hall five hours a week, we compute -1.77 + .24(3.0) + .00058(1,200) + .073(5) = .011. Evaluating the logit function at this value gives exp(.011)/[1 + exp(.011)] ≈ .503.
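The two logit probabilities in 17.2 can be reproduced with a few lines of Python. This is a minimal sketch that hard-codes the coefficients quoted in the solution; the function names `grad_index` and `logistic` are illustrative and not part of the text.

```python
import math


def logistic(z: float) -> float:
    """Logit response function: exp(z) / [1 + exp(z)]."""
    return math.exp(z) / (1.0 + math.exp(z))


def grad_index(hsgpa: float, sat: float, study: float) -> float:
    """Linear index using the estimates quoted in the solution to 17.2."""
    return -1.77 + 0.24 * hsgpa + 0.00058 * sat + 0.073 * study


p_study10 = logistic(grad_index(3.0, 1200, 10))  # index = .376
p_study5 = logistic(grad_index(3.0, 1200, 5))    # index = .011
diff = p_study10 - p_study5

print(f"P(study=10) = {p_study10:.3f}")
print(f"P(study=5)  = {p_study5:.3f}")
print(f"difference  = {diff:.3f}")
```

The naive calculation .073(10 - 5) = .365 treats the logit coefficient as if it were a linear probability model slope, which is why it is so far from the difference in the two fitted probabilities.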
Therefore, the difference in estimated probabilities is .593 - .503 = .090, or just under .10. [Note how far off the calculation would be if we simply use the coefficient on study to conclude that the difference in probabilities is .073(10 - 5) = .365.]

17.3 (i) We use the chain rule and equation (17.23). In particular, let $x_1 \equiv \log(z_1)$. Then, by the chain rule,

$$\frac{\partial E(y|y>0,x)}{\partial z_1} = \frac{\partial E(y|y>0,x)}{\partial x_1}\cdot\frac{\partial x_1}{\partial z_1} = \frac{\partial E(y|y>0,x)}{\partial x_1}\cdot\frac{1}{z_1},$$

where we use the fact that the derivative of $\log(z_1)$ is $1/z_1$. When we plug in (17.23) for $\partial E(y|y>0,x)/\partial x_1$, we obtain the answer.

(ii) As in part (i), we use the chain rule, which is now more complicated:

$$\frac{\partial E(y|y>0,x)}{\partial z_1} = \frac{\partial E(y|y>0,x)}{\partial x_1}\cdot\frac{\partial x_1}{\partial z_1} + \frac{\partial E(y|y>0,x)}{\partial x_2}\cdot\frac{\partial x_2}{\partial z_1},$$

where $x_1 = z_1$ and $x_2 = z_1^2$. But $\partial E(y|y>0,x)/\partial x_1 = \beta_1\{1 - \lambda(x\beta/\sigma)[x\beta/\sigma + \lambda(x\beta/\sigma)]\}$, $\partial E(y|y>0,x)/\partial x_2 = \beta_2\{1 - \lambda(x\beta/\sigma)[x\beta/\sigma + \lambda(x\beta/\sigma)]\}$, $\partial x_1/\partial z_1 = 1$, and $\partial x_2/\partial z_1 = 2z_1$. Plugging these into the first formula and rearranging gives the answer.

17.4 Since $\log(\cdot)$ is an increasing function – that is, for positive w1 and w2, w1 > w2 if and only if log(w1) > log(w2) – it follows that, for each i, mvpi > minwagei if and only if log(mvpi) > log(minwagei). Therefore, log(wagei) = max[log(mvpi), log(minwagei)].

17.5 (i) patents is a count variable, and so the Poisson regression model is appropriate.

(ii) Because $\beta_1$ is the coefficient on log(sales), $\beta_1$ is the elasticity of patents with respect to sales. (More precisely, $\beta_1$ is the elasticity of E(patents|sales,RD) with respect to sales.)

(iii) We use the chain rule to obtain the partial derivative of $\exp[\beta_0 + \beta_1\log(sales) + \beta_2 RD + \beta_3 RD^2]$ with respect to RD:

$$\frac{\partial E(patents|sales,RD)}{\partial RD} = (\beta_2 + 2\beta_3 RD)\exp[\beta_0 + \beta_1\log(sales) + \beta_2 RD + \beta_3 RD^2].$$

A simpler way to interpret this model is to take the log and then differentiate with respect to RD: this gives $\beta_2 + 2\beta_3 RD$, which shows that the semi-elasticity of patents with respect to RD is $100(\beta_2 + 2\beta_3 RD)$.

17.6 (i) OLS will be unbiased, because we are choosing the sample on the basis of an exogenous explanatory variable. The population regression function for sav is the same as the regression function in the subpopulation with age > 25.
(ii) Assuming that marital status and number of children affect sav only through household size (hhsize), this is another example of exogenous sample selection. But, in the subpopulation of married people without children, hhsize = 2. Because there is no variation in hhsize in the subpopulation, we would not be able to estimate $\beta_2$; effectively, the intercept in the subpopulation becomes $\beta_0 + 2\beta_2$, and that is all we can estimate. But, assuming there is variation in inc, educ, and age among married people without children (and that we have a sufficiently varied sample from this subpopulation), we can still estimate $\beta_1$, $\beta_3$, and $\beta_4$.

(iii) This would be selecting the sample on the basis of the dependent variable, which causes OLS to be biased and inconsistent for estimating the $\beta_j$ in the population model. We should instead use a truncated regression model.

17.7 For the immediate purpose of determining the variables that explain whether accepted applicants choose to enroll, there is not a sample selection problem. The population of interest is applicants accepted by the particular university, and you have a random sample from this population. Therefore, it is perfectly appropriate to specify a model for this group, probably a linear probability model, a probit model, or a logit model, and estimate the model using the data at hand. OLS or maximum likelihood estimation will produce consistent, asymptotically normal estimators. This is a good example of where many data analysts' knee-jerk reaction might be to conclude that there is a sample selection problem, which is why it is important to be very precise about the purpose of the analysis, which requires one to clearly state the population of interest. If the university is hoping the applicant pool changes in the near future, then there is a potential sample selection problem: the current students that apply may be systematically different from students that may apply in the future.
As the nature of the pool of applicants is unlikely to change dramatically over one year, the sample selection problem can be mitigated, if not entirely eliminated, by updating the analysis after each first-year class has enrolled.

SOLUTIONS TO COMPUTER EXERCISES

C17.1 (i) If spread is zero, there is no favorite, and the team we (arbitrarily) label the favorite should have a 50% chance of winning.

(ii) The linear probability model estimated by OLS gives

$\widehat{favwin}$ = .577 + .0194 spread
                     (.028)  (.0023)
                     [.032]  [.0019]
n = 553, R2 = .111,

where the usual standard errors are in (·) and the heteroskedasticity-robust standard errors are in [·]. Using the usual standard error, the t statistic for H0: $\beta_0$ = .5 is (.577 - .5)/.028 = 2.75, which leads to rejecting H0 against a two-sided alternative at the 1% level (critical value ≈ 2.58). Using the robust standard error reduces the significance but nevertheless leads to strong rejection of H0 at the 2% level against a two-sided alternative: t = (.577 - .5)/.032 ≈ 2.41 (critical value ≈ 2.33).

(iii) As we expect, spread is very statistically significant using either standard error, with a t statistic greater than eight. If spread = 10 the estimated probability that the favored team wins is .577 + .0194(10) = .771.

(iv) The probit results are given in the following table:

Dependent Variable: favwin

Independent Variable      Coefficient (Standard Error)
spread                    .0925 (.0122)
constant                  -.0106 (.1037)
Number of Observations    553
Log Likelihood Value      -263.56
Pseudo R-Squared          .129

In the probit model P(favwin = 1|spread) = $\Phi(\beta_0 + \beta_1 spread)$, where $\Phi(\cdot)$ denotes the standard normal cdf, if $\beta_0$ = 0 then P(favwin = 1|spread) = $\Phi(\beta_1 spread)$ and, in particular, P(favwin = 1|spread = 0) = $\Phi(0)$ = .5. This is the analog of testing whether the intercept is .5 in the LPM. From the table, the t statistic for testing H0: $\beta_0$ = 0 is only about -.102, so we do not reject H0.
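The test statistics and the LPM prediction in parts (ii) through (iv) of C17.1 are simple ratios, and a short Python sketch makes the arithmetic explicit. The helper name `t_stat` is illustrative; all numbers are the estimates and standard errors quoted in the solution.

```python
def t_stat(estimate: float, hypothesized: float, se: float) -> float:
    """t statistic for H0: parameter = hypothesized value."""
    return (estimate - hypothesized) / se


# LPM intercept: H0 is that the intercept equals .5.
t_usual = t_stat(0.577, 0.5, 0.028)    # usual OLS standard error
t_robust = t_stat(0.577, 0.5, 0.032)   # heteroskedasticity-robust standard error

# Probit intercept: H0 is that the intercept equals 0.
t_probit = t_stat(-0.0106, 0.0, 0.1037)

# LPM fitted probability that the favorite wins when spread = 10.
lpm_prob_spread10 = 0.577 + 0.0194 * 10

print(f"LPM t (usual)  = {t_usual:.2f}")
print(f"LPM t (robust) = {t_robust:.2f}")
print(f"probit t       = {t_probit:.3f}")
print(f"LPM P(favwin=1 | spread=10) = {lpm_prob_spread10:.3f}")
```

The LPM and probit tests address the same question, whether the win probability is .5 at a zero spread, but the null values differ (.5 versus 0) because the probit intercept sits inside the normal cdf.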
(v) When spread = 10 the predicted response probability from the estimated probit model is $\Phi$[-.0106 + .0925(10)] = $\Phi$(.9144) ≈ .820. This is somewhat above the estimate for the LPM.

(vi) When favhome, fav25, and und25 are added to the probit model, the value of the log-likelihood becomes -262.64. Therefore, the likelihood ratio statistic is 2[-262.64 - (-263.56)] = 2(263.56 - 262.64) = 1.84. The p-value from the $\chi^2_3$ distribution is about .61, so favhome, fav25, and und25 are jointly very insignificant. Once spread is controlled for, these other factors have no additional power for predicting the outcome.

C17.2 (i) The probit estimates from approve on white are given in the following table:

Dependent Variable: approve

Independent Variable      Coefficient (Standard Error)
white                     .784 (.087)
constant                  .547 (.075)
Number of Observations    1,989
Log Likelihood Value      -700.88
Pseudo R-Squared          .053

As there is only one explanatory variable that takes on just two values, there are only two different predicted values: the estimated probabilities of loan approval for white and nonwhite applicants. Rounded to three decimal places these are .708 for nonwhites and .908 for whites. Without rounding errors, these are identical to the fitted values from the linear probability model. This must always be the case when the independent variables in a binary response model are mutually exclusive and exhaustive binary variables. Then, the predicted probabilities, whether we use the LPM, probit, or logit models, are simply the cell frequencies. (In other words, .708 is the proportion of loans approved for nonwhites and .908 is the proportion approved for whites.)

(ii) With the set of controls added, the probit estimate on white becomes about .520 (se ≈ .097). Therefore, there is still very strong evidence of discrimination against nonwhites. We can divide this by 2.5 to make it roughly comparable to the LPM estimate in part (iii) of Computer Exercise C7.8: .520/2.5 ≈ .208, compared with .129 in the LPM.
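The two fitted probabilities in C17.2 (i) and the rule-of-thumb scaling in part (ii) can be checked directly from the reported probit estimates. The following Python sketch uses only the standard library; `norm_cdf` is an illustrative helper built on the error function.

```python
import math


def norm_cdf(z: float) -> float:
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Simple probit of approve on white: constant .547, slope .784.
p_nonwhite = norm_cdf(0.547)         # white = 0
p_white = norm_cdf(0.547 + 0.784)    # white = 1

# In a saturated model the gap in fitted probabilities is what the
# LPM slope on white reports directly.
gap = p_white - p_nonwhite

# Rule of thumb from part (ii): divide a probit coefficient by 2.5
# to make it roughly comparable to an LPM coefficient.
scaled_probit = 0.520 / 2.5

print(f"P(approve | nonwhite) = {p_nonwhite:.3f}")
print(f"P(approve | white)    = {p_white:.3f}")
print(f"difference            = {gap:.3f}")
print(f"scaled probit coefficient = {scaled_probit:.3f}")
```

With only the white dummy in the model, the probit has exactly as many parameters as there are cells, so it can match each cell frequency exactly, up to the rounding in the reported coefficients.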
(iii) When we use logit instead of probit, the coefficient (standard error) on white becomes .938 (.173).

(iv) Recall that, to make probit and logit estimates roughly comparable, we can multiply the logit estimates by .625. The scaled logit coefficient becomes .625(.938) ≈ .586, which is reasonably close to the probit estimate. A better comparison would be to compare the predicted probabilities by setting the other controls at interesting values, such as their average values in the sample.

C17.3 (i) Out of 616 workers, 172, or about 28%, have zero pension benefits. For the 444 workers reporting positive pension benefits, the range is from $7.28 to $2,880.27. Therefore, we have a nontrivial fraction of the sample with pension = 0, and the range of positive pension benefits is fairly wide. The Tobit model is well-suited to this kind of dependent variable.

(ii) The Tobit results are given in the following table:

Dependent Variable: pension

Independent Variable      (1)                   (2)
exper                     5.20 (6.01)           4.39 (5.83)
age                       -4.64 (5.71)          -1.65 (5.56)
tenure                    36.02 (4.56)          28.78 (4.50)
educ                      93.21 (10.89)         106.83 (10.77)
depends                   -35.28 (21.92)        41.47 (21.21)
married                   -53.69 (71.73)        19.75 (69.50)
white                     144.09 (102.08)       159.30 (98.97)
male                      308.15 (69.89)        257.25 (68.02)
union                     -----                 439.05 (62.49)
constant                  -1,252.43 (219.07)    -1,571.51 (218.54)
Number of Observations    616                   616
Log Likelihood Value      -3,672.96             -3,648.55
$\hat{\sigma}$            677.74                652.90

In column (1), which does not control for union, being white or male (or, of course, both) increases predicted pension benefits, although only male is statistically significant (t ≈ 4.41).

(iii) We use equation (17.22) with exper = tenure = 10, age = 35, educ = 16, depends = 0, married = 0, white = 1, and male = 1 to estimate the expected benefit for a white male with the given characteristics. Using our shorthand, we have

$x\hat{\beta}$ = -1,252.5 + 5.20(10) - 4.64(35) + 36.02(10) + 93.21(16) + 144.09 + 308.15 = 940.90.
Therefore, with $\hat{\sigma}$ = 677.74 we estimate E(pension|x) as

$\Phi$(940.9/677.74)(940.9) + (677.74)$\phi$(940.9/677.74) ≈ 966.40.

For a nonwhite female with the same characteristics,

$x\hat{\beta}$ = -1,252.5 + 5.20(10) - 4.64(35) + 36.02(10) + 93.21(16) = 488.66.

Therefore, her predicted pension benefit is

$\Phi$(488.66/677.74)(488.66) + (677.74)$\phi$(488.66/677.74) ≈ 582.10.

The difference between the white male and nonwhite female is 966.40 - 582.10 = $384.30.

[Instructor's Note: If we had just done a linear regression, we would add the coefficients on white and male to obtain the estimated difference. We get about 114.94 + 272.95 = 387.89, which is very close to the Tobit estimate. Provided that we focus on partial effects, Tobit and a linear model often give similar answers for explanatory variables near the mean values.]

(iv) Column (2) in the previous table gives the results with union added. The coefficient is large, but to see exactly how large, we should use equation (17.22) to estimate E(pension|x) with union = 1 and union = 0, setting the other explanatory variables at interesting values. The t statistic on union is over seven.

(v) When peratio is used as the dependent variable in the Tobit model, white and male are individually and jointly insignificant. The p-value for the test of joint significance is about .74. Therefore, neither whites nor males seem to have different tastes for pension benefits as a fraction of earnings. White males have higher pension benefits because they have, on average, higher earnings.
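The expected-value calculations in C17.3 (iii) can be reproduced with a short Python sketch. It evaluates the formula used in the solution, E(y|x) = Φ(xβ/σ)·xβ + σ·φ(xβ/σ), at the column (1) estimates. The function and dictionary names are illustrative, and the intercept is entered as -1,252.5 to match the rounding used in the worked calculation.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_expected_value(xb: float, sigma: float) -> float:
    """E(y|x) for a Tobit model with a corner at zero."""
    z = xb / sigma
    return norm_cdf(z) * xb + sigma * norm_pdf(z)


# Column (1) Tobit estimates, as used in the worked solution.
COEF = {"const": -1252.5, "exper": 5.20, "age": -4.64, "tenure": 36.02,
        "educ": 93.21, "white": 144.09, "male": 308.15}
SIGMA = 677.74


def index(white: int, male: int) -> float:
    """Linear index with exper = tenure = 10, age = 35, educ = 16,
    depends = 0, and married = 0."""
    return (COEF["const"] + COEF["exper"] * 10 + COEF["age"] * 35
            + COEF["tenure"] * 10 + COEF["educ"] * 16
            + COEF["white"] * white + COEF["male"] * male)


xb_white_male = index(white=1, male=1)
xb_nonwhite_female = index(white=0, male=0)
ev_white_male = tobit_expected_value(xb_white_male, SIGMA)
ev_nonwhite_female = tobit_expected_value(xb_nonwhite_female, SIGMA)

print(f"white male:      index = {xb_white_male:.2f}, E(pension|x) = {ev_white_male:.2f}")
print(f"nonwhite female: index = {xb_nonwhite_female:.2f}, E(pension|x) = {ev_nonwhite_female:.2f}")
print(f"difference = {ev_white_male - ev_nonwhite_female:.2f}")
```

The same function could be used for part (iv) by evaluating the column (2) index with union = 1 and union = 0 and differencing the two expected values.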

C17.4 (i) The results for the Poisson regression model that includes pcnv², ptime86², and inc86² are given in the following table:

Dependent Variable: narr86

Independent Variable      Coefficient (Standard Error)
pcnv                      1.15 (0.28)
avgsen                    -.026 (.021)
tottime                   .012 (.016)
ptime86                   .684 (.091)
qemp86                    .023 (.033)
inc86                     -.012 (.002)
black                     .591 (.074)
hispan                    .422 (.075)
born60                    -.093 (.064)
pcnv²                     -1.80 (0.31)
ptime86²                  -.103 (.016)
inc86²                    .000021 (.000006)
constant                  -.710 (.070)
Number of Observations    2,725
Log Likelihood Value      -2,168.87
$\hat{\sigma}$            1.179

(ii) $\hat{\sigma}^2$ = (1.179)² ≈ 1.39, and so there is evidence of overdispersion. The maximum likelihood standard errors should be multiplied by $\hat{\sigma}$, which is about 1.179. Therefore, the MLE standard errors should be increased by about 18%.

(iii) From Table 17.3 we have the log-likelihood value for the restricted model, Lr = -2,248.76. The log-likelihood value for the unrestricted model is given in the above table as -2,168.87. Therefore, the usual likelihood ratio statistic is 159.78. The quasi-likelihood ratio statistic is 159.78/1.39 ≈ 114.95. In a $\chi^2_3$ distribution this gives a p-value of essentially zero. Not surprisingly, the quadratic terms are jointly very significant.

C17.5 (i) The Poisson regression results are given in the following table:

Dependent Variable: kids

Independent Variable      Coefficient    Standard Error
educ                      -.048          .007
age                       .204           .055
age²                      -.0022         .0006
black                     .360           .061
east                      .088           .053
northcen                  .142           .048
west                      .080           .066
farm                      -.015          .058
othrural                  -.057          .069
town                      .031           .049
smcity                    .074           .062
y74                       .093           .063
y76                       -.029          .068
y78                       -.016          .069
y80                       -.020          .069
y82                       -.193          .067
y84                       -.214          .069
constant                  -3.060         1.211
n = 1,129    L = -2,070.23    $\hat{\sigma}$ = .944

The coefficient on y82 means that, other factors in the model fixed, a woman's fertility was about 19.3% lower in 1982 than in 1972.

(ii) Because the coefficient on black is so large, we obtain the estimated proportionate difference as exp(.36) - 1 ≈ .433, so a black woman has 43.3% more children than a comparable nonblack woman.
(Notice also that black is very statistically significant.)

(iii) From the above table, $\hat{\sigma}$ = .944, which shows that there is actually underdispersion in the estimated model.

(iv) The sample correlation between $kids_i$ and $\widehat{kids}_i$ is about .348, which means the R-squared (or, at least one version of it) is about (.348)² ≈ .121. Interestingly, this is actually smaller than the R-squared for the linear model estimated by OLS. (However, remember that OLS obtains the highest possible R-squared for a linear model, while Poisson regression does not obtain the highest possible R-squared for an exponential regression model.)

C17.6 The results of an OLS regression using only the uncensored durations are given in the following table.

Dependent Variable: log(durat)

Independent Variable      Coefficient (Standard Error)
workprg                   .092 (.083)
priors                    -.048 (.014)
tserved                   -.0068 (.0019)
felon                     .119 (.103)
alcohol                   -.218 (.097)
drugs                     .018 (.089)
black                     -.00085 (.08221)
married                   .239 (.099)
educ                      -.019 (.019)
age                       .00053 (.00042)
constant                  3.001 (0.244)
Number of Observations    552
R-Squared                 .071

There are several important differences between the OLS estimates using the uncensored durations and the estimates from the censored regression in Table 17.4. For example, the binary indicator for drug usage, drugs, has become positive and insignificant, whereas it was negative (as we expect) and significant in Table 17.4. On the other hand, the work program dummy, workprg, becomes positive but is still insignificant. The remaining coefficients maintain the same sign, but they are all attenuated toward zero. The apparent attenuation bias of OLS for the coefficient on black is especially severe, where the estimate changes from -.543 in the (appropriate) censored regression estimation to -.00085 in the (inappropriate) OLS regression using only the uncensored durations.
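Returning to the Poisson calculations in C17.4 (iii) and C17.5 (ii), the quasi-likelihood ratio statistic and the exact proportionate effect of a dummy variable are easy to verify. The following Python sketch uses the log-likelihood values and coefficients quoted in those solutions; the function names are illustrative.

```python
import math


def lr_statistic(ll_unrestricted: float, ll_restricted: float) -> float:
    """Usual likelihood ratio statistic: twice the log-likelihood gap."""
    return 2.0 * (ll_unrestricted - ll_restricted)


def quasi_lr_statistic(lr: float, sigma_hat: float) -> float:
    """Quasi-LR statistic: the usual LR statistic divided by sigma_hat squared."""
    return lr / sigma_hat ** 2


def exact_percent_effect(coef: float) -> float:
    """Exact percentage change in E(y|x) when a dummy switches from 0 to 1
    in an exponential mean model: 100 * [exp(coef) - 1]."""
    return 100.0 * (math.exp(coef) - 1.0)


# C17.4 (iii): joint test of the three quadratic terms.
lr = lr_statistic(-2168.87, -2248.76)
qlr = quasi_lr_statistic(lr, 1.179)

# C17.5 (ii): coefficient on black in the fertility equation.
black_effect = exact_percent_effect(0.36)

print(f"LR = {lr:.2f}, quasi-LR = {qlr:.2f}")
print(f"exact effect of black = {black_effect:.1f}%")
```

For small coefficients, 100 times the coefficient is an adequate approximation (as with y82 in C17.5), but at .36 the approximation of 36% understates the exact figure noticeably.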

C17.7 (i) When log(wage) is regressed on educ, exper, exper², nwifeinc, age, kidslt6, and kidsge6, the coefficient and standard error on educ are .0999 (se = .0151).

(ii) The Heckit coefficient on educ is .1187 (se = .0341), where the standard error is just the usual OLS standard error. The estimated return to education is somewhat larger than without the Heckit corrections, but the Heckit standard error is over twice as large.

(iii) Regressing $\hat{\lambda}$ on educ, exper, exper², nwifeinc, age, kidslt6, and kidsge6 (using only the selected sample of 428) produces R2 ≈ .962, which means that there is substantial multicollinearity among the regressors in the second stage regression. This is what leads to the large standard errors. Without an exclusion restriction in the log(wage) equation, $\hat{\lambda}$ is almost a linear function of the other explanatory variables in the sample.

C17.8 (i) 185 out of 445 participated in the job training program. The longest time in the experiment was 24 months (obtained from the variable mosinex).

(ii) The F statistic for joint significance of the explanatory variables is F(7,437) = 1.43 with p-value = .19. Therefore, they are jointly insignificant at even the 15% level. Note that, even though we have estimated a linear probability model, the null hypothesis we are testing is that all slope coefficients are zero, and so there is no heteroskedasticity under H0. This means that the usual F statistic is asymptotically valid.

(iii) After estimating the model P(train = 1|x) = $\Phi(\beta_0 + \beta_1 unem74 + \beta_2 unem75 + \beta_3 age + \beta_4 educ + \beta_5 black + \beta_6 hisp + \beta_7 married)$ by probit maximum likelihood, the likelihood ratio test for joint significance is 10.18. In a $\chi^2_7$ distribution this gives p-value = .18, which is very similar to that obtained for the LPM in part (ii).

(iv) Training eligibility was randomly assigned among the participants, so it is not surprising that train appears to be independent of other observed factors.
(However, there can be a difference between eligibility and actual participation, as men can always refuse to participate if chosen.)

(v) The simple LPM results are

$\widehat{unem78}$ = .354 - .111 train
                     (.028)  (.044)
n = 445, R2 = .014

Participating in the job training program lowers the estimated probability of being unemployed in 1978 by .111, or 11.1 percentage points. This is a large effect: the probability of being unemployed without participation is .354, and the training program reduces it to .243. The difference is statistically significant at almost the 1% level against a two-sided alternative. (Note that this is another case where, because training was randomly assigned, we have confidence that OLS is consistently estimating a causal effect, even though the R-squared from the regression is very small. There is much about being unemployed that we are not explaining, but we can be pretty confident that this job training program was beneficial.)

(vi) The estimated probit model is

$\hat{P}$(unem78 = 1|train) = $\Phi$(-.375 - .321 train)
                                 (.080)  (.128)

where standard errors are in parentheses. It does not make sense to compare the coefficient on train for the probit, -.321, with the LPM estimate. The probabilities have different functional forms. However, note that the probit and LPM t statistics are essentially the same (although the LPM standard errors should be made robust to heteroskedasticity).

(vii) There are only two fitted values in each case, and they are the same: .354 when train = 0 and .243 when train = 1. This has to be the case, because any method simply delivers the cell frequencies as the estimated probabilities. The LPM estimates are easier to interpret because they do not involve the transformation by $\Phi(\cdot)$, but it does not matter which is used provided the probability differences are calculated.

(viii) The fitted values are no longer identical because the model is not saturated, that is, the explanatory variables are not an exhaustive, mutually exclusive set of dummy variables.
But, because the other explanatory variables are insignificant, the fitted values are highly correlated: the LPM and probit fitted values have a correlation of about .993.

C17.9 (i) 248.

(ii) The distribution is not continuous: there are clear focal points, and rounding. For example, many more people report one pound than either two-thirds of a pound or 1 1/3 pounds. This violates the latent variable formulation underlying the Tobit model, where the latent error has a normal distribution. Nevertheless, we should view Tobit in this context as a way to possibly improve functional form. It may work better than the linear model for estimating the expected demand function.

(iii) The following table contains the Tobit estimates and, for later comparison, OLS estimates of a linear model:

Dependent Variable: ecolbs

Independent Variable      Tobit             OLS (Linear Model)
ecoprc                    -5.82 (.89)       -2.90 (.59)
regprc                    5.66 (1.06)       3.03 (.71)
faminc                    .0066 (.0040)     .0028 (.0027)
hhsize                    .130 (.095)       .054 (.064)
constant                  1.00 (.67)        1.63 (.45)
Number of Observations    660               660
Log Likelihood Value      -1,266.44         ---
$\hat{\sigma}$            3.44              2.48
R-squared                 .0369             .0393

Only the price variables, ecoprc and regprc, are statistically significant at the 1% level.

(iv) The signs of the price coefficients accord with basic demand theory: the own-price effect is negative, the cross price effect for the substitute good (regular apples) is positive.

(v) The null hypothesis can be stated as H0: $\beta_1 + \beta_2 = 0$. Define $\theta_1 = \beta_1 + \beta_2$. Then $\hat{\theta}_1$ = -.16. To obtain the t statistic, I write $\beta_2 = \theta_1 - \beta_1$, plug in, and rearrange. This results in doing Tobit of ecolbs on (ecoprc - regprc), regprc, faminc, and hhsize. The coefficient on regprc is $\hat{\theta}_1$ and, of course, we get its standard error: about .59. Therefore, the t statistic is about -.27 and p-value = .78. We do not reject the null.

(vi) The smallest fitted value is .798, while the largest is 3.327.

(vii) The squared correlation between $ecolbs_i$ and $\widehat{ecolbs}_i$ is about .0369. This is one possible R-squared measure.
(viii) The linear model estimates are given in the table for part (iii). The OLS estimates are smaller than the Tobit estimates because the OLS estimates are estimated partial effects on E(ecolbs|x), whereas the Tobit coefficients must be scaled by the term in equation (17.27). The scaling factor is always between zero and one, and often substantially less than one. The Tobit model does not fit better, at least in terms of estimating E(ecolbs|x): the linear model R-squared is a bit larger (.0393 versus .0369).

(ix) This is not a correct statement. We have another case where we have confidence in the ceteris paribus price effects (because the price variables are exogenously set), yet we cannot explain much of the variation in ecolbs. The fact that demand for a fictitious product is hard to explain is not very surprising.

[Instructor's Notes: This might be a good place to remind students about basic economics. You can ask them whether reglbs should be included as an additional explanatory variable in the demand equation for ecolbs, making the point that the resulting equation would no longer be a demand equation. In other words, reglbs and ecolbs are jointly determined, but it is not appropriate to write each as a function of the other. You could have the students compute heteroskedasticity-robust standard errors for the OLS estimates. Also, you could have them estimate a probit model for ecolbs = 0 versus ecolbs > 0, and have them compare the scaled Tobit slope estimates with the probit estimates.]

C17.10 (i) 497 people do not smoke at all. 101 people report smoking 20 cigarettes a day. Since one pack of cigarettes contains 20 cigarettes, it is not surprising that 20 is a focal point.

(ii) The Poisson distribution does not allow for the kinds of focal points that characterize cigs. If you look at the full frequency distribution, there are blips at half a pack, two packs, and so on. The probabilities in the Poisson distribution have a much smoother transition.
Fortunately, the Poisson regression model has nice robustness properties.

(iii) The results of the Poisson regression are given in the following table, along with the OLS estimates of a linear model for later reference. The Poisson standard errors are the usual Poisson maximum likelihood standard errors, and the OLS standard errors are the usual (nonrobust) standard errors.

Dependent Variable: cigs

Independent Variable      Poisson (Exponential Model)    OLS (Linear Model)
log(cigpric)              -.355 (.144)                   -2.90 (5.70)
log(income)               .085 (.020)                    .754 (.730)
white                     -.0019 (.0372)                 -.205 (1.458)
educ                      -.060 (.004)                   -.514 (.168)
age                       .115 (.005)                    .782 (.161)
age²                      -.00138 (.00006)               -.0091 (.0018)
constant                  1.46 (.61)                     5.77 (24.08)
Number of Observations    807                            807
Log Likelihood Value      -8,184.03                      ---
$\hat{\sigma}$            4.54                           13.46
R-squared                 .043                           .045

The estimated price elasticity is -.355 and the estimated income elasticity is .085.

(iv) If we use the maximum likelihood standard errors, the t statistic on log(cigpric) is about -2.47, which is significant at the 5% level against a two-sided alternative. The t statistic on log(income) is 4.25, which is very significant.

(v) $\hat{\sigma}^2$ = 20.61, and so $\hat{\sigma}$ = 4.54. This is evidence of severe overdispersion, and means that all of the standard errors for Poisson regression should be multiplied by 4.54; the t statistics should be divided by 4.54.

(vi) The robust t statistic for log(cigpric) is about -.54, which makes it very insignificant. This is a good example of how misleading the usual Poisson standard errors and test statistics can be. The robust t statistic for log(income) is about .94, which also makes the income elasticity statistically insignificant.

(vii) The education and age variables are still quite significant; the robust t statistic on educ is over three in absolute value, and the robust t statistic on age is over five. The coefficient on educ implies that one more year of education reduces the expected number of cigarettes smoked by about 6.0%.
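The overdispersion adjustment in parts (v) through (vii) of C17.10 amounts to multiplying each maximum likelihood standard error by the estimate of σ. A minimal Python sketch, using the coefficients and MLE standard errors from the Poisson column of the table; the function name `adjust_for_overdispersion` is illustrative.

```python
import math


def adjust_for_overdispersion(coef: float, mle_se: float, sigma_hat: float):
    """Return the scaled standard error and the corresponding t statistic.

    The scaled standard error is sigma_hat times the Poisson MLE standard
    error, so the t statistic is the MLE t statistic divided by sigma_hat.
    """
    scaled_se = sigma_hat * mle_se
    return scaled_se, coef / scaled_se


sigma_hat = math.sqrt(20.61)  # about 4.54

estimates = {
    "log(cigpric)": (-0.355, 0.144),
    "log(income)": (0.085, 0.020),
    "educ": (-0.060, 0.004),
    "age": (0.115, 0.005),
}

adjusted_t = {}
for name, (coef, se) in estimates.items():
    scaled_se, t = adjust_for_overdispersion(coef, se, sigma_hat)
    adjusted_t[name] = t
    print(f"{name:13s} MLE t = {coef / se:6.2f}   adjusted t = {t:6.2f}")
```

Both elasticities lose significance after the adjustment, while educ and age remain significant, matching the conclusions in parts (vi) and (vii).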
(viii) The minimum predicted value is .515 and the maximum is 18.84. The fact that we predict some smoking for anyone in the sample is a limitation with using the expected value for prediction. Further, we do not predict that anyone will smoke even one pack of cigarettes, even though more than 25% of the people in the sample report smoking a pack or more per day! This shows that smoking, especially heavy smoking, is difficult to predict based on the explanatory variables we have access to.

(ix) The squared correlation between $cigs_i$ and $\widehat{cigs}_i$ is the R-squared reported in the above table, .043.

(x) The linear model results are reported in the last column of the previous table. The R-squared is slightly higher for the linear model – but remember, the OLS estimates are chosen to maximize the R-squared, while the MLE estimates do not maximize the R-squared (as we have calculated it). In any case, both R-squareds are quite small.

C17.11 (i) The fraction of women in the work force is 3,286/5,634 ≈ .583.

(ii) The OLS results using the selected sample are

$\widehat{\log(wage)}$ = .649 + .099 educ + .020 exper - .00035 exper² - .030 black + .014 hispanic
                         (.060)  (.004)      (.003)       (.00008)        (.034)       (.036)
n = 3,286, R2 = .205

While the point estimates imply blacks earn, on average, about 3% less and Hispanics about 1.3% more than the base group (non-black, non-Hispanic), neither coefficient is statistically significant – or even very close to statistical significance at the usual levels. The joint F test gives a p-value of about .63. So, there is little evidence for differences by race and ethnicity once education and experience have been controlled for.

(iii) The coefficient on nwifeinc is -.0091 with t = -13.47 and the coefficient on kidlt6 is -.500 with t = -11.05. We expect both coefficients to be negative. If a woman's spouse earns more, she is less likely to work. Having a young child in the family also reduces the probability that the woman works. Each variable is very statistically significant.
(Not surprisingly, the joint test also yields a p-value of essentially zero.)

(iv) We need at least one variable to affect labor force participation that does not have a direct effect on the wage offer. So, we must assume that, controlling for education, experience, and the race/ethnicity variables, other income and the presence of a young child do not affect wage. These propositions could be false if, say, employers discriminate against women who have young children or whose husbands work. Further, if having a young child reduces productivity (through, say, having to take time off for sick children and appointments), then it would be inappropriate to exclude kidlt6 from the wage equation.

(v) The t statistic on the inverse Mills ratio is 1.77 and the p-value against the two-sided alternative is .077. With 3,286 observations, this is not a very small p-value. The test on the inverse Mills ratio term does not provide strong evidence against the null hypothesis of no selection bias.

(vi) Just as important, the slope coefficients do not change much when the inverse Mills ratio is added. For example, the coefficient on educ increases from .099 to .103, a change within the 95% confidence interval for the original OLS estimate. [The 95% CI is (.092, .106).] The changes in the experience coefficients are also pretty small; the Heckman estimates are well within the 95% confidence intervals of the OLS estimates. Superficially, the black and hispanic coefficients change by larger amounts, but these estimates are statistically insignificant. Given their wide confidence intervals, we would expect these estimates to change considerably in response to even minor changes in the specification. The most substantial change is in the intercept estimate, from .649 to .539, but it is hard to know what to make of this. Remember, in this example, the intercept is the estimated value of log(wage) for a non-black, non-Hispanic woman with zero years of education and experience.
No one in the full sample even comes close to this description. Because the slope coefficients do change somewhat, we cannot say that the Heckman estimates imply a lower average wage offer than the uncorrected estimates. Even if this were true, the estimated marginal effects of the explanatory variables are hardly affected.
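The two numerical checks in parts (v) and (vi) of C17.11 can be reproduced roughly with the standard library alone. This is a sketch under two assumptions that are not stated in the text: the p-value uses the standard normal distribution (reasonable with 3,286 observations), and the confidence interval uses the rounded estimate .099 and standard error .004, so its endpoints differ slightly from the reported (.092, .106). The helper names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sided_p(t_stat):
    """Two-sided p-value using the normal approximation (an assumption here)."""
    return 2.0 * (1.0 - normal_cdf(abs(t_stat)))


def ci95(coef, se):
    """Approximate 95% confidence interval: coef +/- 1.96 * se."""
    half_width = 1.96 * se
    return coef - half_width, coef + half_width


# Part (v): t statistic on the inverse Mills ratio.
p_mills = two_sided_p(1.77)

# Part (vi): OLS coefficient on educ and its (rounded) standard error,
# compared with the Heckman estimate of .103.
lower, upper = ci95(0.099, 0.004)
heckman_educ = 0.103

print(f"two-sided p-value for t = 1.77: {p_mills:.3f}")
print(f"approximate 95% CI for educ (OLS): ({lower:.3f}, {upper:.3f})")
print(f"Heckman educ estimate inside CI: {lower < heckman_educ < upper}")
```

The check only confirms that the Heckman estimate on educ lies inside the OLS interval; it says nothing about the other coefficients, whose standard errors would be needed for the same comparison.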