State space in Stata

JSS Journal of Statistical Software
May 2011, Volume 41, Issue 10. http://www.jstatsoft.org/

State Space Methods in Stata

David M. Drukker (Stata)
Richard B. Gates (Stata)

Abstract

We illustrate how to estimate parameters of linear state-space models using the Stata program sspace. We provide examples of how to use sspace to estimate the parameters of unobserved-component models, vector autoregressive moving-average models, and dynamic-factor models. We also show how to compute one-step, filtered, and smoothed estimates of the series and the states; dynamic forecasts and their confidence intervals; and residuals.

Keywords: state-space, unobserved-components models, local-level model, local-linear-trend model, basic structural model, dynamic-factor model, vector autoregressive moving-average model, sspace.

1. Introduction

Stata is a general purpose package for statistics, graphics, data management, and matrix language programming. Stata's coverage of statistical areas is one of the most complete available, with many commands for regression analysis (StataCorp 2009k,l,m), multivariate statistics (StataCorp 2009i), panel-data analysis (StataCorp 2009h), survey data analysis (StataCorp 2009n), survival analysis and epidemiology statistics (StataCorp 2009o), and time-series analysis (StataCorp 2009p). It is used for data management (Mitchell 2010), health research (Juul and Frydenberg 2010; Cleves, Gould, Gutierrez, and Marchenko 2010), as well as in economic analysis (Cameron and Trivedi 2009; Baum 2006).

Stata is also a programming language used by researchers to implement and disseminate their methods; see any of the more than 40 issues of The Stata Journal for examples of peer-reviewed user-written programs and see StataCorp (2009j,f,g) for Stata's programming capabilities.

The Stata command sspace, released in version 11, estimates the parameters of linear state-space models by maximum likelihood (StataCorp 2009e).
As demonstrated by Harvey (1989) and Commandeur, Koopman, and Ooms (2011), linear state-space models are very flexible, and many linear time-series models can be written as linear state-space models. In this article, we show how to use sspace to estimate the parameters of linear state-space models. We also note that Stata has some additional commands, such as dfactor, which provide simpler syntaxes for estimating the parameters of particular linear state-space models.

Because of this flexibility, sspace has two syntaxes; we call them the covariance-form syntax and the error-form syntax. They are illustrated by estimating the parameters of a local-linear-trend model with a seasonal component and a vector autoregressive moving-average (VARMA) model, respectively. In each syntax, the user must specify one or more state equations, one or more observation equations, and the stochastic components.

2. Case 1: The local-level model

The local-level model is described by Commandeur et al. (2011, Section 2.1) and we briefly review it here. The observation and state equations of this model are

    $$y_t = \mu_t + \varepsilon_t, \qquad \mu_t = \mu_{t-1} + \xi_t, \tag{1}$$

respectively, where $\varepsilon_t \sim N(0, \sigma^2_\varepsilon)$ and $\xi_t \sim N(0, \sigma^2_\xi)$ and both are independent. We express the level component at time $t$, $\mu_t$, as a function of that at time $t-1$. This notation is a subtle change from that in Commandeur et al. (2011), but it is more consistent with the syntax of Stata's sspace for describing the model and with how sspace executes the state-space recursions, which start at index 0 instead of 1. The parameters in this model are $\sigma^2_\varepsilon$, $\sigma^2_\xi$, and $\mu_0$.

2.1. Covariance-form syntax

The covariance-form syntax of sspace is as follows:

    sspace state_eq [state_eq ... state_eq] obs_eq [obs_eq ... obs_eq] [if] [in] [, options]

where state_eq are state equations of the form

    (statevar [lagged_statevars] [indepvars], state [noerror noconstant covstate(covform)])

and obs_eq are observation equations of the form

    (depvar [statevars] [indepvars] [, noerror noconstant covobserved(covform)])

A list of state equations, observation equations, and options specifies an sspace model. The square brackets indicate optional arguments, so the syntax diagram indicates that at least one state equation and one observation equation are required. Each equation must be enclosed in parentheses. In Stata parlance, a comma in the command toggles the parser from model specification mode to options specification mode. Options included within an equation are applied to that equation. Options specified outside the individual equations are applied to the model as a whole.

Each state equation specifies the name of a latent variable and must have the state option specified. A state equation optionally contains a list of lagged state variables and a list of exogenous covariates. By default, a constant is included in the equation unless the noconstant option is specified. By default, an error term is included in the equation unless the noerror option is specified. The option covstate() allows you to specify the covariance structure of the state equations. The covform in the syntax diagram may be identity, dscalar, diagonal, or unstructured. The default is diagonal. The option dscalar states that the covariance is diagonal and that all the variance terms are equal.

Each observation equation specifies the name of an observed dependent variable. An observation equation optionally contains a list of contemporaneous state variables and a list of exogenous covariates. By default, a constant is included in the equation unless the noconstant option is specified.
By default, an error term is included in the equation unless the noerror option is specified. The option covobserved() allows you to specify the covariance structure of the observation equations. The covariance forms are the same as those of the option covstate().

The [if] and the [in] specifications allow you to estimate the parameters using a subsample of the observations.

The options in the main syntax diagram include model, optimization, and display options. An important model option is constraints(), which applies the parameter constraints that identify the model. A popular optimization option is technique(). Two good techniques for sspace are technique(BHHH), the Berndt-Hall-Hall-Hausman technique, and technique(NR), for Newton-Raphson. Optimization techniques may be mixed, as in the default, technique(BHHH 5 NR), which specifies the BHHH method for the first 5 iterations and NR for the remaining iterations. An example of a display option is level(), which allows you to set the confidence level to something other than the default of 95%.

We clarify this syntax in the following example.

2.2. Estimating the variances of a local-level model using sspace

Here we illustrate the sspace syntax by estimating the parameters of the local-level model on the well-known Nile dataset containing observations on the annual Nile River flow volume at Aswan, Egypt, from 1871 to 1970. The Stata command use loads the dataset into memory and the command describe describes it.

    . use http://www.stata.com/ddrukker/nile.dta
    (Nile river annual flow volume at Aswan from 1870 to 1970)

The describe command will display a dataset's size, its variables, their storage type and format, any labels associated with the variables, sorting information, and any descriptive information that you have added to document your data.

    . describe

    Contains data from data/nile.dta
      obs:           100                          Nile river annual flow volume at Aswan from 1870 to 1970
     vars:             2                          16 Jun 2008 10:49
     size:         1,200 (99.9% of memory free)
    ------------------------------------------------------------------------------
                  storage  display     value
    variable name   type   format      label      variable label
    ------------------------------------------------------------------------------
    AFV             long   %12.0g                 Annual Flow Volume
    year            long   %ty
    ------------------------------------------------------------------------------
    Sorted by:  year

Stata computes time-series operators of variables using a time variable specified by the tsset command. Below we specify year to be our time variable; we tsset the data, in Stata parlance.

    . tsset year
            time variable:  year, 1871 to 1970
                    delta:  1 year

We could now use sspace to estimate the parameters using the code

    constraint define 1 [level]L.level = 1
    constraint define 2 [AFV]level = 1
    sspace (level L.level, state noconstant) ///
           (AFV level, noconstant), ///
           constraints(1 2)

While this code is transparent to Stata users, we discuss it in some detail for readers who are unaccustomed to Stata. The first two lines define constraints on the model parameters, as discussed below. The third line begins with the command sspace and is followed by the definition of the state equation

    (level L.level, state noconstant)

which is best understood from right to left. The option noconstant specifies that there is no constant term in the equation; the option state specifies the equation as a state equation; and the comma separates the options from the equation specification. By specifying the equation as level L.level, we specify level as the name for the unobserved state and we specify that the state equation is

    $$\mathrm{level}_t = \alpha \, \mathrm{level}_{t-1}$$

We use Stata's lag operator, L. in this example, to model level as a linear function of the lagged level.
At the end of the third line, the three slashes, ///, denote a line continuation in Stata. In this example, we see that lines 3, 4, and 5 compose a single Stata command.

The fourth line specifies that the observation equation in the model is

    $$\mathrm{AFV}_t = \beta \, \mathrm{level}_t + \varepsilon_t$$

where the $\varepsilon_t$ are independent and identically distributed (IID) normal errors. As in the state equation above, we used the noconstant option to suppress the constant term.

The model in Equation (1) requires that $\alpha = \beta = 1$. Lines 1 and 2 declare these constraints; on line 5, the option constraints(1 2) applies them to this model.

Repeating the code, we proceed with estimation:

    . constraint define 1 [level]L.level = 1
    . constraint define 2 [AFV]level = 1
    . sspace (level L.level, state noconstant) ///
    >        (AFV level, noconstant), ///
    >        constraints(1 2)
    searching for initial values ...
    (setting technique to bhhh)
    Iteration 0:   log likelihood = -635.14379
    Iteration 1:   log likelihood =  -633.9615
    Iteration 2:   log likelihood = -633.60088
    Iteration 3:   log likelihood = -633.57318
    Iteration 4:   log likelihood = -633.54533
    (switching technique to nr)
    Iteration 5:   log likelihood = -633.51888
    Iteration 6:   log likelihood = -633.46465
    Iteration 7:   log likelihood = -633.46456
    Iteration 8:   log likelihood = -633.46456
    Refining estimates:
    Iteration 0:   log likelihood = -633.46456
    Iteration 1:   log likelihood = -633.46456

    State-space model

    Sample: 1871 - 1970                               Number of obs   =        100
    Log likelihood = -633.46456
     ( 1)  [level]L.level = 1
     ( 2)  [AFV]level = 1
    ------------------------------------------------------------------------------
                 |                 OIM
             AFV |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    level        |
           level |
             L1. |          1          .        .       .            .           .
    -------------+----------------------------------------------------------------
    AFV          |
           level |          1          .        .       .            .           .
    -------------+----------------------------------------------------------------
      var(level) |   1469.176   1280.375     1.15   0.251    -1040.313    3978.666
        var(AFV) |   15098.52   3145.548     4.80   0.000     8933.358    21263.68
    ------------------------------------------------------------------------------
    Note: Model is not stationary.
    Note: Tests of variances against zero are conservative and are provided only
          for reference.

    e() result name    Commandeur et al. notation
    ---------------    --------------------------
    e(A)               T
    e(B)
    e(C)               R
    e(chol_Q)          Q^{1/2}
    e(D)               Z
    e(F)
    e(G)
    e(chol_R)          H^{1/2}

    Table 1: Kalman filter matrices in Stata's e() results and their Commandeur et al. (2011) equivalents.

The output table reports that sspace estimates $\sigma^2_\xi$ to be 1,469.2 and $\sigma^2_\varepsilon$ to be 15,098.5.

Having provided a simple example of how to use sspace, we now provide some technical details about its implementation. sspace uses the Mata optimizer optimize() (StataCorp 2009c). sspace uses analytic first derivatives, from which it numerically computes the second-order derivatives necessary for Newton-Raphson optimization. If you are using the multiprocessor version of Stata (Stata MP), the numerical second derivatives are computed in parallel. optimize() will not declare convergence until the length of the scaled gradient is smaller than $10^{-6}$, that is, until $g_k^T \hat{H}_k^{-1} g_k < 10^{-6}$, where $g_k$ is the gradient on the $k$-th step and $\hat{H}_k$ is the approximated negative Hessian. The requirement that $\hat{H}_k$ be nonsingular prevents sspace from declaring convergence when the parameters are not identified, as discussed in Drukker and Wiggins (2004).

The standard errors are computed from the negative Hessian unless the variance-covariance option, vce(), specifies otherwise. The OIM in the table header for the standard errors indicates that the standard errors are computed from the observed information matrix. If non-normal errors are suspected, use vce(robust) to obtain the Huber-White robust standard errors (StataCorp 2009q, robust).
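As a minimal sketch of the vce() and level() options described above, the local-level model could be refit with Huber-White standard errors and 90% confidence intervals. The sketch assumes that the Nile data are still in memory and tsset on year, and that constraints 1 and 2 are defined as in the estimation above; combining vce(robust) with level(90) in a single call is an illustration added here, not a command taken from the original analysis.

```stata
* Sketch only: assumes nile.dta is loaded, tsset on year,
* and constraints 1 and 2 are already defined as above.
sspace (level L.level, state noconstant)  ///
       (AFV level, noconstant),           ///
       constraints(1 2) vce(robust) level(90)
```

The point estimates should match those reported above, because vce() changes only how the standard errors are computed and level() changes only the reported confidence intervals.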
Stata estimation commands store their results in a memory region called ereturn. The results may be accessed by the user and are used by other Stata commands, which are referred to as postestimation commands in Stata parlance. Typing

    . ereturn list

lists the results saved in e(). You may view or access any e() result by identifying the object as e(name), where name is the name of the object. The matrices saved by sspace are listed in Table 1 along with the Commandeur et al. (2011, Equations 1 and 2) equivalents. Mixing both notations, a linear state-space model is

    $$\alpha_t = T \alpha_{t-1} + B x_t + R \eta_t$$
    $$y_t = Z \alpha_t + F w_t + G \varepsilon_t,$$

where $x_t$ and $w_t$ are column vectors of covariates. The vector $w_t$ may contain lagged independent variables specified on the left-hand side of observation equations. Commandeur et al. (2011) incorporate the regression coefficient matrices $B$ and $F$ into the state transition matrix $T$ and the observation equation matrix $Z$, respectively. The Kalman filter recursions are initialized with $\alpha_1 = T \alpha_0 + B x_1$.

In this example the matrices are all 1 × 1, and we have e(A) = 1, e(D) = 1, e(chol_Q) = $\sqrt{\text{var(level)}}$, and e(chol_R) = $\sqrt{\text{var(AFV)}}$. The remaining matrices do not exist for this model.

Stata's sspace uses the square-root filter to numerically implement the Kalman filter recursions (DeJong 1991b; Durbin and Koopman 2001, Section 6.3). Moreover, when the model is not stationary, as is the case here, the filter is augmented as described by DeJong (1991a), DeJong and Chu-Chun-Lin (1994), and Durbin and Koopman (2001, Section 5.7). The two techniques are used together to evaluate the likelihood (DeJong 1988) and to provide maximum likelihood (ML) estimates of the parameters of the state-space model. The techniques also provide an estimate of the initial state. The initial state, $\alpha_0 = \mu_0$, is diffuse and is modeled as $\mathrm{var}(\mu_0) \to \infty$ and $E[\mu_0] = \delta$. The ML estimate of $\delta$ is 1120.0.
This quantity is not reported by sspace, but is stored as e(d).

We can obtain predictions using the predict command after estimating the parameters. All the standard objects and their standard errors can be predicted using predict after sspace. These objects and the syntax for predict after sspace are discussed in StataCorp (2009d).

2.3. Case 1 postestimation

With the local-level model estimates still in memory, we predict the smoothed trend of the Nile annual flow volume using the DeJong (1989) diffuse Kalman filter. Here we use the rmse option to obtain the root-mean-square error (RMSE) of the smoothed trend, which is subsequently used to compute 90% confidence intervals. A second call to predict obtains the standardized residuals. We graph the series, trend, and trend confidence intervals in one graph and the standardized residuals in a second graph. We then combine the two graphs into one and allow it to render. This graph is displayed in Figure 1.

    . predict trend, state equation(level) smethod(smooth) rmse(rmse)
    .
    . scalar z = invnormal(.95)
    . gen lb = trend - z*rmse
    . gen ub = trend + z*rmse
    .
    . predict res, rstandard
    .
    . twoway (tsline AFV trend) (tsrline lb ub), tlabel(1870(50)1970) ///
    >        ytitle(Annual Flow Volume) name(AFV) nodraw legend(off)
    .
    . tsline res, yline(3 -3) yline(0) tlabel(1870(50)1970) name(RES) nodraw
    .
    . graph combine AFV RES, name(AFVR) rows(2)

Figure 1: In the upper panel we display the Nile annual flow volume time-series (blue) with smoothed trend estimates (red) and trend 90% confidence intervals. The lower panel displays the standardized residuals.

Next, we demonstrate forecasting. First we use the preserve command to save the original dataset. We then extend the data by 10 years using the tsappend command. We compute the one-step predictions, compute dynamic forecasts from 1971 to 1980, and compute the RMSEs for the predictions and forecasts.
We then compute the 50% confidence intervals for the forecasts and graph the results. Finally, we restore the original dataset. The graph is shown in Figure 2.

    . preserve
    . tsappend, add(10)
    . predict flow, dynamic(1971) rmse(rflow)
    . scalar z = invnormal(.75)
    . gen lb = flow - z*rflow
    (1 missing value generated)
    . gen ub = flow + z*rflow
    (1 missing value generated)
    . twoway (tsline AFV flow) (tsrline lb ub if year>=1970), ///
    >        tlabel(1870(10)1980) ytitle(Annual Flow Volume) name(FOR1) xline(1970) ///
    >        legend(label(1 "AFV") label(2 "predicted/forecast") label(3 "50% CI"))
    . restore

Figure 2: The Nile river annual flow volume (blue), one-step predictions and dynamic forecasts (red), and forecast 50% confidence intervals.

3. Case 2: A local-linear-trend model

In this section we review the structure of a local-linear-trend model with an autoregressive component, AR(1), and a seasonal component. The state-space form of a time-domain seasonal component is described in Commandeur et al. (2011, Section 2.1). Our state-space model is

    $$\mu_t = \mu_{t-1} + \nu_{t-1} + \xi_t, \tag{2}$$
    $$\nu_t = \nu_{t-1}, \tag{3}$$
    $$\eta_t = \phi \, \eta_{t-1} + \zeta_t, \tag{4}$$
    $$\gamma_{1,t} = -\gamma_{1,t-1} - \gamma_{2,t-1} - \gamma_{3,t-1} + \omega_t, \tag{5}$$
    $$\gamma_{2,t} = \gamma_{1,t-1}, \tag{6}$$
    $$\gamma_{3,t} = \gamma_{2,t-1}, \tag{7}$$
    $$y_t = \mu_t + \eta_t + \gamma_{1,t}, \tag{8}$$

where $\zeta_t \sim \mathrm{NID}(0, \sigma^2_\zeta)$, $\xi_t \sim \mathrm{NID}(0, \sigma^2_\xi)$, and $\omega_t \sim \mathrm{NID}(0, \sigma^2_\omega)$. Equation (8) is the observation equation and it depends on the states $\mu$ (the linear trend), $\eta$ (the AR(1) term), and $\gamma_1$ (the seasonal component). The observation equation has no error term. The model has six state equations: two for the linear trend, one for the AR(1) component, and three for the seasonal component.

3.1. Estimating parameters of the local-linear-trend model using sspace

We now use sspace to estimate the parameters of a local-linear-trend model with an AR(1) component and a seasonal component.
We fit this model to quarterly data on food and tobacco production (FTP) in the United States for the years 1947 to 2000. Cox (2009) uses the dataset to demonstrate graphing seasonal time-series data in Stata. First we read the dataset into memory and describe it:

    . use http://www.stata.com/ddrukker/ftp.dta
    (Food and tobacco production in the United States for 1947-2000)

    . describe

    Contains data from data/ftp.dta
      obs:           216                          Food and tobacco production in the United States for 1947-2000
     vars:             2                          11 Jan 2010 10:02
     size:         2,592 (99.9
