The Original Breusch-Pagan Test Article (English-Language Econometrics Literature)

A Simple Test for Heteroscedasticity and Random Coefficient Variation
Author(s): T. S. Breusch and A. R. Pagan
Source: Econometrica, Vol. 47, No. 5 (Sep., 1979), pp. 1287-1294
Published by: The Econometric Society
Stable URL: http://www.jstor.org/stable/1911963
Accessed: 27/04/2013 15:04

This content downloaded from 141.164.71.238 on Sat, 27 Apr 2013 15:04:51 PM. All use subject to JSTOR Terms and Conditions.

Econometrica, Vol. 47, No. 5 (September, 1979)

A SIMPLE TEST FOR HETEROSCEDASTICITY AND RANDOM COEFFICIENT VARIATION

BY T. S. BREUSCH AND A. R. PAGAN

A simple test for heteroscedastic disturbances in a linear regression model is developed using the framework of the Lagrangian multiplier test. For a wide range of heteroscedastic and random coefficient specifications, the criterion is given as a readily computed function of the OLS residuals. Some finite sample evidence is presented to supplement the general asymptotic properties of Lagrangian multiplier tests.

1. INTRODUCTION

IN SOME APPLICATIONS of the general linear model, the usual assumptions of homoscedastic disturbances and fixed coefficients may be questioned. When these requirements are not met, the loss in efficiency in using ordinary least squares (OLS) may be substantial and, more importantly, the biases in estimated standard errors may lead to invalid inferences.
This has caused a number of writers to propose models which relax these conditions and to devise estimators for their more general specifications, e.g., Goldfeld and Quandt [8] for heteroscedasticity and Hildreth and Houck [11] for random coefficients. However, because the effect of introducing random coefficient variation is to give the dependent variable a different variance at each observation, models with this feature can be considered as particular heteroscedastic formulations for the purpose of detecting departure from the standard linear model.

A test for heteroscedasticity with the same asymptotic properties as the likelihood ratio test in standard situations, but which can be computed by two least squares regressions, thereby avoiding the iterative calculations necessary to obtain maximum likelihood estimates of the parameters in the full model, is considered in this paper. The approach is based on the Lagrangian multiplier (LM) test of Aitchison and Silvey [1, 20], which is also known as Rao's efficient score test [18, p. 417]. This statistic is obtained from the results of maximizing the likelihood subject to the parameter constraints implied by the null hypothesis and can be computed either from the Lagrangian multipliers corresponding to the constraints as in [1] or from the first order conditions as in [18].² Asymptotic equivalence of the LM test with the likelihood ratio procedure is shown in some detail by Silvey [20].

The test proposed in this paper is "constructive" in the sense of [9, p. 85] because a specific form of heteroscedasticity is distinguished as the alternative to the null hypothesis of homoscedasticity. However, it will be seen that the same LM statistic is appropriate for a fairly wide class of alternative hypotheses.

There are four sections to the paper. In Section 2 the general framework is set out and the statistic is derived, Section 3 considers finite sample properties, and general comments are made in the concluding Section 4.
¹ We would like to thank unknown referees for their comments.

² Similar ideas have been used in other areas, e.g., the Durbin [5] h-statistic for autocorrelation in models with lagged dependent variables as regressors can be derived as an LM statistic.

2. THE TEST STATISTIC

Consider the linear model

$$y_t = x_t'\beta + u_t \qquad (t = 1, \ldots, N) \tag{1}$$

where $\beta$ is a $(k \times 1)$ vector of coefficient parameters and the disturbances $u_t$ are normally and independently distributed with mean zero and variance

$$\sigma_t^2 = h(z_t'\alpha). \tag{2}$$

Here the function $h(\cdot)$, which is not indexed by $t$, is assumed to possess first and second derivatives, $\alpha$ is a $(p \times 1)$ vector of unrestricted parameters functionally unrelated to the $\beta$ coefficients, and the first element in $z_t$ is unity. This allows the null hypothesis of homoscedasticity to be written as

$$H_0: \alpha_2 = \cdots = \alpha_p = 0,$$

for then $z_t'\alpha = \alpha_1$ so that $\sigma_t^2 = h(\alpha_1) = \sigma^2$ is constant. It is also assumed that $x_t$ and $z_t$ are exogenous, obeying the conditions set out in Amemiya [2].

The representation in (2) is sufficiently general to include most of the heteroscedastic models distinguished in the literature.³ These are usually either $\sigma_t^2 = \exp(z_t'\alpha)$, which has been shown by Harvey [10] to encompass the specifications of [6, 15, and 16], or $\sigma_t^2 = (z_t'\alpha)^m$ with $m$ a prespecified integer as in [7, 8, and 19]. The random coefficient model of Hildreth and Houck [11] and most of its later generalizations, for example the one considered by Swamy and Mehta [21], are of the form $\sigma_t^2 = z_t'\alpha$ where the elements of $z_t$ are obtained from the distinct elements of $x_t x_t'$.

Define the OLS residuals from (1) as $\hat{u}_t$ and the estimated residual variance as $\hat\sigma^2 = N^{-1}\sum_t \hat{u}_t^2$. This allows our basic result to be stated in the following theorem.
THEOREM: For the model (1) and (2) under the conditions given above, the Lagrangian multiplier statistic for testing $H_0: \alpha_2 = \cdots = \alpha_p = 0$ (homoscedastic disturbances) can be found as one half the explained sum of squares in a regression of $g_t = \hat\sigma^{-2}\hat{u}_t^2$ upon $z_t$ and is asymptotically distributed as $\chi^2$ with $(p - 1)$ degrees of freedom when the null hypothesis is true.⁴

³ Note that there are some heteroscedastic models which do not fit our general formulation, e.g., $\sigma_t^2 = \alpha z_t$ (scalar $\alpha$ and $z_t$) and $\sigma_t^2 \propto [E(y_t)]^2$. But these specifications do not provide a convenient framework for testing homoscedasticity. In the first example, there is no parametric restriction which gives the null hypothesis as a special case of the general model, and in the second there is no regression without heteroscedasticity because of the implied relationship between $\alpha$ and $\beta$.

⁴ The statistic is defined as a regression result to give a convenient method of computation which is not meant to imply that the usual criteria for "goodness of fit" of this regression have any meaningful properties.

PROOF: Let $l(\theta)$ be a log likelihood depending on a vector of parameters $\theta$ with $d = \partial l/\partial\theta$ as the first derivative (score) vector and $\mathcal{I} = -E(\partial^2 l/\partial\theta\,\partial\theta')$ as the information matrix. Then following Rao [18, pp. 418-419] the LM statistic for testing the null hypothesis represented by parametric constraints $\phi(\theta) = 0$ is given by

$$LM = \hat{d}'\,\hat{\mathcal{I}}^{-1}\hat{d}$$

where the hats indicate that the quantities are evaluated at $\hat\theta$, the restricted maximum likelihood estimate satisfying $\phi(\hat\theta) = 0$. For the case where $\theta' = (\theta_1' : \theta_2')$ and the constraints refer to only one of the subsets, say $\phi(\theta_2) = 0$, the vector $d$ may be partitioned conformably as $d' = (d_1' : d_2')$ so that $\hat{d}_1 = 0$ from constrained maximum likelihood.
If, furthermore, the information matrix is block diagonal between $\theta_1$ and $\theta_2$, $\mathcal{I}_{21} = -E(\partial^2 l/\partial\theta_2\,\partial\theta_1') = 0$, the statistic becomes

$$LM = \hat{d}_2'\,\hat{\mathcal{I}}_{22}^{-1}\hat{d}_2$$

where $\mathcal{I}_{22} = -E(\partial^2 l/\partial\theta_2\,\partial\theta_2')$.

For the model given by (1) and (2) the log likelihood is

$$l(\beta, \alpha) = -\tfrac{1}{2}N\log(2\pi) - \tfrac{1}{2}\sum_t \log\sigma_t^2 - \tfrac{1}{2}\sum_t \sigma_t^{-2}(y_t - x_t'\beta)^2$$

where $\sigma_t^2 = h(z_t'\alpha)$. The first derivative with respect to the $\alpha$ parameters is

$$d_\alpha = \partial l/\partial\alpha = \tfrac{1}{2}\sum_t h'(s_t)\,z_t\,(\sigma_t^{-4}u_t^2 - \sigma_t^{-2})$$

and the information matrix for $\alpha$ is

$$\mathcal{I}_{\alpha\alpha} = -E(\partial^2 l/\partial\alpha\,\partial\alpha') = \tfrac{1}{2}\sum_t [h'(s_t)]^2\,\sigma_t^{-4}\,z_t z_t'$$

where $s_t = z_t'\alpha$ and $h'(s_t) = \partial h(s_t)/\partial s_t$. It is easily seen that $\mathcal{I}_{\beta\alpha} = -E(\partial^2 l/\partial\beta\,\partial\alpha') = 0$ so that the LM statistic for testing $H_0$ will be $\hat{d}_\alpha'\,\hat{\mathcal{I}}_{\alpha\alpha}^{-1}\hat{d}_\alpha$, where constrained maximum likelihood corresponds to OLS applied to (1). Evaluating the required quantities gives

$$\hat{d}_\alpha = \tfrac{1}{2}[\hat\sigma^{-2}h'(\hat\alpha_1)]\sum_t z_t(\hat\sigma^{-2}\hat{u}_t^2 - 1), \qquad \hat{\mathcal{I}}_{\alpha\alpha} = \tfrac{1}{2}[\hat\sigma^{-2}h'(\hat\alpha_1)]^2\sum_t z_t z_t',$$

and the statistic is

$$LM = \tfrac{1}{2}\Big(\sum_t z_t f_t\Big)'\Big(\sum_t z_t z_t'\Big)^{-1}\Big(\sum_t z_t f_t\Big) \tag{3}$$

where $f_t = \hat\sigma^{-2}\hat{u}_t^2 - 1 = g_t - 1$.

Alternatively, collect all $N$ observations by defining $Z = (z_1, \ldots, z_N)'$, $f = (f_1, \ldots, f_N)'$, $g = (g_1, \ldots, g_N)'$, and $i$ as an $(N \times 1)$ vector of units. Then $f = (g - i)$, $i'g = N$, $i'f = 0$, and $f'Z(Z'Z)^{-1}Z'i = 0$ because $i$ is the first column of $Z$. Thus

$$LM = \tfrac{1}{2}f'Z(Z'Z)^{-1}Z'f = \tfrac{1}{2}\big[g'Z(Z'Z)^{-1}Z'g - N^{-1}(i'g)^2\big]$$

which is one half of the explained sum of squares in the regression of $g_t$ upon $z_t$ (see Goldberger [12, p. 165, eq. (4.21)]).

Let $\hat{u}^2 = \hat\sigma^2 g$, let $S$ be the explained sum of squares from the regression of $\hat{u}^2$ against $Z$, and let $\hat\delta$ be the OLS estimates of the coefficients ($\delta$) in this regression. From Amemiya [2, eqs. (5) and (6)] it follows that, under the null hypothesis, $\sqrt{N}(\hat\delta - \delta) \to N(0, 2\sigma^4(N^{-1}Z'Z)^{-1})$ so that $(2\sigma^4)^{-1}S \to \chi^2_{p-1}$ as in standard least squares theory. As $LM = (2\hat\sigma^4)^{-1}S$ and $\hat\sigma^4 \to \sigma^4$ in probability under $H_0$, it follows that $LM \to \chi^2_{p-1}$ in distribution.

3. FINITE SAMPLE PROPERTIES

As with all procedures developed from asymptotic principles, it is desirable to investigate the properties of the statistic based on a finite amount of data. It is difficult to establish the exact small sample distribution by analytical methods as the LM statistic is the ratio of quadratic forms in the squared OLS residuals $\hat{u}_t^2$, which are dependent gamma variables, and very little work has been done in analyzing such distributions. However, as the distribution of the LM statistic under $H_0$ can be shown to be independent of any unknown parameters, the Type I error can always be evaluated in the context of a particular model by Monte Carlo methods (to any desired degree of accuracy).

It is useful to explore the finite sample properties in the case $p = 2$, i.e., $\sigma_t^2 = h(\alpha_1 + \alpha_2 z_{2,t})$, in some detail. The LM statistic is then

$$LM = \tfrac{1}{2}\Big[\sum_t (z_{2,t} - \bar{z}_2)^2\Big]^{-1}\Big[\sum_t (z_{2,t} - \bar{z}_2) f_t\Big]^2 = \tfrac{1}{2}\Big[\sum_t (z_{2,t} - \bar{z}_2)^2\Big]^{-1}\Big[\sum_t (z_{2,t} - \bar{z}_2) g_t\Big]^2 = (\hat{u}'D\hat{u}/\hat{u}'\hat{u})^2$$

where $\bar{z}_2 = N^{-1}\sum_t z_{2,t}$, $\hat{u} = (\hat{u}_1, \ldots, \hat{u}_N)'$, and $D$ is a diagonal matrix with $i$th diagonal element $N(z_{2,i} - \bar{z}_2)/[2\sum_t (z_{2,t} - \bar{z}_2)^2]^{1/2}$ for $i = 1, \ldots, N$. Thus for any $c > 0$,

$$\mathrm{pr}\{LM > c\} = \mathrm{pr}\{\hat{u}'D\hat{u}/\hat{u}'\hat{u} > \sqrt{c}\} + \mathrm{pr}\{\hat{u}'D\hat{u}/\hat{u}'\hat{u} < -\sqrt{c}\}.$$

Because each of these terms involves the ratio of quadratic forms in normal variables, Imhof's procedure [13] might be used to compute the exact probabilities (see Koerts and Abrahamse [14] for a good discussion of this).

The disadvantage with this method is that it does not extend beyond $p = 2$ and an alternative that covers any $p$ is desirable. Such an alternative is available by observing that division of both numerator and denominator by $\sigma^2$ does not affect LM but results in $\hat{u}$ being a linear transformation of the standard normal deviates $\sigma^{-1}u_t$.
Thus, assuming an investigator has computed a value of $LM = c$ from some model, the exact probability of a Type I error for this value can be estimated by (a) generating $n$ sets of $N$ observations on an n.i.d.(0, 1) variable $\psi_t$, (b) forming the statistic as in (3) using $\hat{u} = (I - X(X'X)^{-1}X')\psi$, (c) observing the number of times, $r$, that $LM > c$ in these $n$ sets, (d) using $r/n$ as the estimate of the probability of a Type I error. The difference between $r/n$ and the exact probability tends to zero as $n \to \infty$, so that this difference can be made arbitrarily small with high probability, and good results seem likely for $n = 5000$. Although this may seem expensive it should be borne in mind that, even when $p = 2$ and Imhof's method is available, exact computation via inversion of the characteristic function involves two numerical integrations, and these also exhibit errors that may be as high as those from simulation if the truncation point is too small or the grid is too large.

Even though there is a way of computing the exact probability of Type I errors for the statistic, it seems likely that most researchers would only use this if the computed LM lay near the critical 5 per cent or 10 per cent significance points of the $\chi^2_{p-1}$, and it therefore seems worthwhile assessing the adequacy of such a strategy for $p = 2$ with a particular set of data. Such a choice also enables a comparison of the simulation and Imhof methods, both in terms of accuracy and computational cost.

Essentially, there are three important questions to be asked concerning the small sample distribution: How adequate is the $\chi^2$ approximation as an indicator of significance levels? What is the power of the test statistic to reject a false null hypothesis of homoscedasticity? How robust is the test to a misspecified model?
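Steps (a) to (d) of the simulation procedure described earlier in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' program: the function names, the use of NumPy, the seed, and the evenly spaced regressor in the usage example are all assumptions introduced here. The statistic is computed as one half of the explained sum of squares from regressing $g_t$ on $z_t$, as stated in the theorem.

```python
import numpy as np


def lm_statistic(u_hat, Z):
    """One half of the explained sum of squares from regressing g_t on z_t.

    u_hat : OLS residuals, shape (N,)
    Z     : matrix of z_t rows, shape (N, p); its first column must be ones.
    """
    sigma2_hat = np.mean(u_hat ** 2)            # N^{-1} * sum of squared residuals
    g = u_hat ** 2 / sigma2_hat
    coef, *_ = np.linalg.lstsq(Z, g, rcond=None)
    fitted = Z @ coef
    # Z contains a constant, so the fitted values have the same mean as g.
    return 0.5 * np.sum((fitted - g.mean()) ** 2)


def simulated_type1_probability(X, Z, c, n_sets=5000, seed=0):
    """Estimate pr{LM > c} under homoscedasticity by steps (a)-(d)."""
    rng = np.random.default_rng(seed)           # seed is an arbitrary choice
    N = X.shape[0]
    # Residual-maker matrix I - X (X'X)^{-1} X'
    M = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)
    r = 0
    for _ in range(n_sets):
        psi = rng.standard_normal(N)            # (a) n.i.d.(0, 1) draws
        u_hat = M @ psi                         # (b) implied OLS residuals
        if lm_statistic(u_hat, Z) > c:          # (c) count exceedances
            r += 1
    return r / n_sets                           # (d) estimated probability


if __name__ == "__main__":
    # Hypothetical design: one regressor, with z_{2,t} = x_t^2.
    x = np.linspace(1.0, 10.0, 40)
    X = np.column_stack([np.ones_like(x), x])
    Z = np.column_stack([np.ones_like(x), x ** 2])
    # 2.706 is the asymptotic 10 per cent point of chi-squared with 1 d.f.
    print(simulated_type1_probability(X, Z, c=2.706))
```

Because the residual-maker matrix depends only on $X$, it is formed once outside the loop; each replication then costs one matrix-vector product and one small least squares fit.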
In the following experiments attention is centered upon the first two questions. Rutemiller and Bowers [19] investigated heteroscedasticity in two regression models and it was decided to select the data from these models (but not the same heteroscedastic model) for the experiments. Model I utilizes the "radio set" data while Model II works with the "auto stopping distance" data. Accordingly, there are a maximum of forty-nine observations for Model I and sixty-three for Model II.

Having selected the data it is necessary to decide on a form for the heteroscedasticity. The random coefficient model $y_t = \alpha + \beta_t x_t + u_t$, $\beta_t = \beta + \varepsilon_t$, which implies $\sigma_t^2 = \sigma_u^2 + x_t^2\sigma_\varepsilon^2$, was selected so that $z_{2,t}$ in our experiments will always be $x_t^2$. The null hypothesis that $\sigma_\varepsilon^2 = 0$ is an interesting one as the value lies on the boundary of the parameter space and, as Chernoff [4] pointed out, the likelihood ratio test would not be $\chi^2$ in such a situation. However, as Chant [3] observes, in this non-standard situation the LM statistic will be $\chi^2$ in large samples.

Table I records the predicted probability of Type I error from the asymptotic theory (column 1) and the exact probabilities for various sample sizes for the two models. From Table I the adequacy of the asymptotic theory to indicate correct significance levels is rather suspect. Certainly, it would appear that investigators might need to use fairly conservative significance levels. Because of this divergence of asymptotic predictions and small sample results, we examined the ability of the simulation method to provide the user with good approximations to the true probability of Type I error. Table II contains a comparison of the exact probabilities generated by the Imhof method with those from the simulation method (TIME is the C.P.U. time in seconds for computing the whole column on a UNIVAC 1142).
TABLE I
PROBABILITY OF TYPE I ERRORS FOR VARIOUS SAMPLE SIZES

                   Model I                    Model II
N=∞       N=20     N=40     N=49     N=20     N=40     N=60
.7        .691     .698     .710     .701     .690     .695
.5        .449     .460     .487     .497     .484     .490
.4        .312     .324     .362     .393     .381     .388
.3        .172     .183     .229     .287     .280     .286
.2        .058     .066     .104     .180     .180     .185
.1        .018     .022     .033     .078     .084     .088
.05       .010     .013     .021     .034     .040     .042
.02       .005     .008     .012     .013     .016     .017
.01       .003     .005     .009     .007     .008     .009
.005      .002     .004     .006     .004     .005     .005

TABLE II
SIMULATION AND IMHOF METHODS FOR EVALUATING PROBABILITIES, MODEL I

              N=20                N=40
          SIM      IMHOF      SIM      IMHOF
          .6970    .6906      .7204    .7096
          .4594    .4488      .4834    .4873
          .3212    .3121      .3608    .3620
          .1756    .1720      .2230    .2293
          .0536    .0579      .1034    .1041
          .0192    .0179      .0352    .0330
          .0106    .0101      .0222    .0207
          .0048    .0050      .0126    .0123
          .0024    .0031      .0088    .0086
          .0016    .0019      .0064    .0062
Time      34.5     71.4       92.0     703.3

Assuming that the Imhof method yields the exact probabilities, it is seen that the simulation method provides a reliable guide to the evaluation of these in any applied situation; certainly errors are minor compared to those from the use of the asymptotic theory. It is of interest to note that the simulation method, based on 5000 replications, was considerably faster than the Imhof method.

To assess the power of the statistic it is necessary to specify particular numerical values for the ratio $\alpha_2/\alpha_1$ (i.e., $\sigma_\varepsilon^2/\sigma_u^2$). Because the model being investigated is a random coefficient one, it seemed sensible to relate power to the coefficient of variation (CV) of $\beta_t$, i.e., $\sigma_\varepsilon/\beta$. By choosing $\sigma_u = 0.3$ and $\beta$ as 0.66 for Model II (roughly the OLS estimates), $\alpha_2$ was found for values of CV of 0.1, 1.0, and 10.0, these constituting a reasonable range of randomness in $\beta_t$. Table III records the rejection probabilities of the test statistic for the 10 critical values of Table I, as the CV and sample size change.⁵

⁵ The tabulated figures differ from power because of the variations in Type I error levels given in Table I.
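The power experiment described above can be sketched along the following lines. This is a hedged illustration rather than a reproduction of the authors' computations: the regressor values are hypothetical (the "auto stopping distance" data are not reproduced in this text), the function names and NumPy usage are assumptions, and $\sigma_u = 0.3$ is read as a standard deviation. The sketch uses $\sigma_\varepsilon = \mathrm{CV} \cdot \beta$, so that $\alpha_2 = \sigma_\varepsilon^2$ and $\sigma_t^2 = \sigma_u^2 + x_t^2\sigma_\varepsilon^2$.

```python
import numpy as np


def lm_statistic(u_hat, Z):
    """One half of the explained sum of squares from regressing g_t on z_t."""
    g = u_hat ** 2 / np.mean(u_hat ** 2)
    coef, *_ = np.linalg.lstsq(Z, g, rcond=None)
    # Z contains a constant, so the fitted values have the same mean as g.
    return 0.5 * np.sum((Z @ coef - g.mean()) ** 2)


def rejection_frequency(x, cv, crit, beta=0.66, sigma_u=0.3,
                        n_sets=2000, seed=0):
    """Share of simulated samples with LM > crit under the random
    coefficient model, where sigma_t^2 = sigma_u^2 + x_t^2 * sigma_eps^2."""
    rng = np.random.default_rng(seed)           # seed is an arbitrary choice
    N = x.size
    X = np.column_stack([np.ones(N), x])
    Z = np.column_stack([np.ones(N), x ** 2])   # z_{2,t} = x_t^2
    M = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)
    sigma_eps = cv * beta                       # CV = sigma_eps / beta
    sd_t = np.sqrt(sigma_u ** 2 + (x * sigma_eps) ** 2)
    rejections = 0
    for _ in range(n_sets):
        v = sd_t * rng.standard_normal(N)       # composite disturbance
        u_hat = M @ v                           # OLS residuals do not depend on
                                                # the regression coefficients
        rejections += int(lm_statistic(u_hat, Z) > crit)
    return rejections / n_sets


if __name__ == "__main__":
    x = np.linspace(1.0, 10.0, 40)              # hypothetical regressor values
    for cv in (0.1, 1.0, 10.0):
        # 3.841 is the asymptotic 5 per cent point of chi-squared with 1 d.f.
        print(cv, rejection_frequency(x, cv, crit=3.841))
```

As footnote 5 cautions for Table III, rejection frequencies computed against asymptotic critical values mix power with any distortion in the Type I error level.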
From Table III it appears that, at least for the range of the CV considered, sample size is the main determinant of power and this is acceptable for sample sizes of forty and above, for those significance levels most commonly in use, i.e., 5 per cent and 10 per cent.

TABLE III
POWER CALCULATIONS FOR MODEL II

           N=20                          N=40                          N=60
CV=0.1   CV=1.0   CV=10.0     CV=0.1   CV=1.0   CV=10.0     CV=0.1   CV=1.0   CV=10.0
0.935    0.943    0.944       0.995    0.996    0.996       1.000    1.000    1.000
0.879    0.893    0.895       0.988    0.990    0.990       0.999    1.000    1.000
0.842    0.859    0.861       0.982    0.984    0.985       0.999    0.999    0.999
0.793    0.815    0.817       0.971    0.974    0.975       0.998    0.998    0.999
0.725    0.750    0.752       0.950    0.955    0.956       0.996    0.997    0.997
0.608    0.637    0.640       0.900    0.909    0.910       0.989    0.991    0.991
0.498    0.528    0.531       0.835    0.848    0.849       0.977    0.980    0.981
0.369    0.398    0.401       0.734    0.751    0.753       0.952    0.958    0.958
0.288    0.313    0.316       0.653    0.673    0.672       0.925    0.933    0.934
0.220    0.242    0.245       0.573    0.592    0.594       0.891    0.902    0.903

4. COMMENTS AND CONCLUSION

1. Although the statistic can be employed only when an alternative is specified, its derivation suggests that the quantity $g_t = \hat{u}_t^2/\hat\sigma^2$ is of some importance in tests of heteroscedasticity. Thus, if one is going to plot any quantity (a strategy sometimes recommended), it would seem to be more reasonable to plot $g_t$ than quantities such as $\hat{u}_t$.

2. The test statistic is easily extended to systems of equations and in some circumstances can be expressed in a simpler form, e.g., if it is assumed that there is a discrete change in $\sigma_t^2$ after $n$ observations, as is common in many cross-sectional studies, then $z_{2,t}$ would be unity from 1 to $n$ and zero elsewhere, from which the statistic becomes

$$LM = \tfrac{1}{2}\Big[\frac{N}{n(N - n)}\Big]\Big[\sum_{t=1}^{n}\hat\sigma^{-2}\hat{u}_t^2 - n\Big]^2.$$

This essentially involves a comparison of the residual sum of squares over the first $n$ periods and the total sample.
Unfortunately, even in this special case the distribution of the statistic does not appear tractable analytically.

3. Finally, there is the question of the power of the test statistic versus (say) the likelihood ratio (LR) statistic, i.e., even if the LM statistic did not have good power in small samples, it may be no worse than others and its computational ease migh
