Time Series Analysis
CHAPTER 15 Time Series Analysis  page 651 15. Introduction page 651 Chapter 15 deals with the basic components of time series, time series decomposition, and simple forecasting. At the end of this chapter, you should know the following: l The four possible components of a time series. l How to use smoothing techniques to remove the random variation and identify the remaining components. l How to use the linear and quadratic regression models to analyze the trend. l How to measure the cyclical effect using the percentage of trend method. l How to measure the seasonal effect by computing the seasonal indices. l How to calculate MAD and RMSE to determine which forecasting model works best. Definition A time series is a collection of data obtained by observing a response variable at periodic points in time. Definition If repeated observations on a variable produce a time series, the variable is called a time series variable. We use Yi to denote the value of the variable at time i. Four possible components: Trend ( secular trend) -- Long term pattern or direction of the time series Cycle ( cyclical effect) -- Wavelike pattern that varies about the long-term trend, appears over a number of years e.g. business cycles of economic boom when the cycle lies above the trend line and economic recession when the cycle lies below the secular trend. Seasonal variation -- Cycles that occur over short periods of time, normally < 1 year e.g. monthly, weekly, daily. Random variation ( residual effect) --Random or irregular variation that a time series shows Could be additive: Yi = Ti + Ci + Si + Ii or multiplicative:  Yi = Ti x Ci x Si xIi Forecasting using smoothing techniques The two commonly used smoothing techniques for removing random variation from a time series are moving averages and exponential smoothing. Moving average: ( MA) Moving averages involve averaging the time series over a specified number of periods. 
We usually choose an odd number of periods so that the averages can be centered at particular periods for graphing purposes. If we use an even number of periods, we can center the averages by computing two-period moving averages of the moving averages. Moving averages aid in identifying the secular trend of a time series because the averaging dampens the effect of cyclical and seasonal variation: a plot of the moving averages yields a "smooth" time series curve that averages out the random variation and clearly shows the long-term trend.

Moving averages are not restricted to any particular number of points. For example, you may wish to calculate a 7-point moving average for daily data, a 12-point moving average for monthly data, or a 5-point moving average for yearly data. Although the choice of the number of points is arbitrary, you should search for a number N that yields a smooth series but is not so large that many points at the end of the series are "lost." The method of forecasting with a general L-point moving average, where L is the length of the period, is outlined below.

Forecasting Using an L-Point Moving Average
1. Select L, the number of consecutive time series values Y1, Y2, ..., YL that will be averaged. (The time series values must be equally spaced.)
2. Calculate the L-point moving total by summing the time series values over L adjacent time periods.
3. Compute the L-point moving average, MA, by dividing the corresponding moving total by L.
4. Graph the moving average MA on the vertical axis with time i on the horizontal axis. (This plot should reveal a smooth curve that identifies the long-term trend of the time series.) Extend the graph to a future time period to obtain the forecasted value of MA.
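Steps 1-3 above can be sketched in a few lines of code. This is a minimal illustration with made-up numbers; the function name and the sample series are our own, not from the text.

```python
# Sketch of an L-point moving average: sum L adjacent values, divide by L.
def moving_average(y, L):
    """Return the L-point moving averages of the series y."""
    if L < 1 or L > len(y):
        raise ValueError("L must be between 1 and len(y)")
    # Each L-point moving total covers L adjacent time periods.
    return [sum(y[i:i + L]) / L for i in range(len(y) - L + 1)]

series = [12, 15, 11, 18, 21, 17, 24]   # hypothetical daily data
print(moving_average(series, 3))
```

Note that a 3-point average of a 7-value series yields only 5 points: as the text warns, points at the ends of the series are "lost," and the larger L is, the more of them disappear.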
Exponential smoothing

One problem with using a moving average to forecast future values of a time series is that values at the ends of the series are lost, requiring us to subjectively extend the graph of the moving average into the future. No exact forecast calculation is available, since the moving average at a future time period requires that we know one or more future values of the series. Exponential smoothing is a technique that leads to forecasts that can be explicitly calculated. Like the moving average method, exponential smoothing de-emphasizes (smooths out) most of the residual effects.

To obtain an exponentially smoothed time series, we first choose a weight W between 0 and 1, called the exponential smoothing constant. The exponentially smoothed series, denoted Ei, is then calculated as follows:

Ei = W Yi + (1 - W) Ei-1   (for i >= 2)

where
Ei = exponentially smoothed value of the time series at time i
Yi = observed value of the time series at time i
Ei-1 = exponentially smoothed value at time i - 1
W = smoothing constant, where 0 <= W <= 1

Begin by setting E1 = Y1; then
E2 = W Y2 + (1 - W) E1
E3 = W Y3 + (1 - W) E2
...
Ei = W Yi + (1 - W) Ei-1

The exponentially smoothed value at time i is simply a weighted average of the current time series value, Yi, and the exponentially smoothed value at the previous time period, Ei-1. Smaller values of W give less weight to the current value Yi, whereas larger values give more weight to Yi.
- The formula indicates that the smoothed time series in period i depends on all the previous observations of the time series.
- The smoothing constant W is chosen on the basis of how much smoothing is required: a small value of W produces a great deal of smoothing, while a large value of W results in very little smoothing.

Exponential smoothing helps to remove random variation in a time series. Because it uses the past and current values of the series, it is useful for forecasting.
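The recursion E1 = Y1, Ei = W Yi + (1 - W) Ei-1 translates directly into code. This is an illustrative sketch with a hypothetical series and W = 0.3; the function name is our own.

```python
# Exponential smoothing: E1 = Y1, then Ei = W*Yi + (1 - W)*E(i-1).
def exponential_smooth(y, w):
    if not 0 <= w <= 1:
        raise ValueError("smoothing constant W must lie in [0, 1]")
    smoothed = [y[0]]                                  # E1 = Y1
    for value in y[1:]:
        smoothed.append(w * value + (1 - w) * smoothed[-1])
    return smoothed

series = [10, 12, 13, 12, 15]    # hypothetical observations
print(exponential_smooth(series, 0.3))
```

With W = 0.3 (heavy smoothing), E2 = 0.3(12) + 0.7(10) = 10.6: each smoothed value leans mostly on the previous smoothed value, so the random wiggles are damped.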
The objective of the time series analysis is to forecast the next value of the series, Yi+1. The exponentially smoothed forecast is

Fi+1 = Ei, where Ei = W Yi + (1 - W) Ei-1

That is, Ei is the forecast of Yi+1, since the exponential smoothing model assumes that the time series has little or no trend or seasonal component. Moreover, if n is the last observed period, the smoothed value En is used to forecast not only Yn+1 but all future values of the series:

Fi = En = W Yn + (1 - W) En-1,   i = n + 1, n + 2, ...

This points out one disadvantage of the exponential smoothing forecasting technique. Since the exponentially smoothed forecast is constant for all future values, any changes in trend and/or seasonality are not taken into account. Therefore, exponentially smoothed forecasts are appropriate only when the trend and seasonal components of the time series are relatively insignificant.

Forecasting: The Regression Approach

Many firms use past sales to forecast future sales. Suppose a wholesale distributor of sporting goods is interested in forecasting its sales revenue for each of the next 5 years. Since an inaccurate forecast may have dire consequences for the distributor, some measure of the forecast's reliability is required. To make such forecasts and assess their reliability, an inferential time series forecasting model must be constructed. The familiar general linear regression model represents one type of inferential model, since it allows us to calculate prediction intervals for the forecasts.
YEAR t   SALES y     YEAR t   SALES y     YEAR t   SALES y
  1        4.8         13      48.4         25     100.3
  2        4.0         14      61.6         26     111.7
  3        5.5         15      65.6         27     108.2
  4       15.6         16      71.4         28     115.5
  5       23.1         17      83.4         29     119.2
  6       23.3         18      93.6         30     125.2
  7       31.4         19      94.2         31     136.3
  8       46.0         20      85.4         32     146.8
  9       46.1         21      86.2         33     146.1
 10       41.9         22      89.9         34     151.4
 11       45.5         23      89.2         35     150.9
 12       53.5         24      99.1

To illustrate the technique of forecasting with regression, consider the data in the table above: annual sales (in thousands of dollars) for a firm (say, the sporting goods distributor) in each of its 35 years of operation. A scatter plot of the data reveals a linearly increasing trend, so the first-order (straight-line) model

E(Yi) = β0 + β1 i

seems plausible for describing the secular trend. The regression analysis printout for the model gives R2 = .98, F = 1,615.724 (p-value < .0001), and s = 6.38524. The least squares prediction equation is

Ŷi = b0 + b1 i = .401513 + 4.295630 i

The prediction intervals for i = 36, 37, ..., 40 widen as we attempt to forecast farther into the future. Intuitively, the farther into the future we forecast, the less certain we are of the forecast's accuracy, since some unexpected change in business and economic conditions may make the model inappropriate. Since we have less confidence in the forecast for, say, i = 40 than for i = 36, the prediction interval for i = 40 must be wider to attain a 95% level of confidence. For this reason, time series forecasting (regardless of the forecasting method) is generally confined to the short term.

Multiple regression models can also be used to forecast future values of a time series with seasonal variation. We illustrate with an example.
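The straight-line fit can be reproduced with an ordinary least squares routine. The sketch below uses the 35 sales figures from the table, reading the smudged year-12 entry ("53 5" in the source) as 53.5; that reading and the year-40 forecast are our assumptions, not from the text.

```python
import numpy as np

# Annual sales (thousands of dollars), years 1-35 from the table.
# NOTE: the year-12 value is read as 53.5; the printed "53 5" lost its decimal.
sales = [4.8, 4.0, 5.5, 15.6, 23.1, 23.3, 31.4, 46.0, 46.1, 41.9, 45.5, 53.5,
         48.4, 61.6, 65.6, 71.4, 83.4, 93.6, 94.2, 85.4, 86.2, 89.9, 89.2, 99.1,
         100.3, 111.7, 108.2, 115.5, 119.2, 125.2, 136.3, 146.8, 146.1, 151.4,
         150.9]
t = np.arange(1, len(sales) + 1)
b1, b0 = np.polyfit(t, sales, 1)          # least squares slope and intercept
print(f"Y_hat = {b0:.4f} + {b1:.4f} i")
forecast_40 = b0 + b1 * 40                # point forecast 5 years ahead
print(round(forecast_40, 1))
```

The fitted slope and intercept should land close to the printout's 4.295630 and .401513 if the table has been transcribed faithfully. Note that this gives only the point forecast; the prediction intervals discussed above require the residual standard error as well.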
EXAMPLE

Refer to the 1991-1994 quarterly power loads listed in the attached table.
a. Propose a model for quarterly power load, Yi, that accounts for both the secular trend and the seasonal variation present in the series.
b. Fit the model to the data, and use the least squares prediction equation to forecast the utility company's quarterly power loads in 1995. Construct 95% prediction intervals for the forecasts.

Solution

a. A common way to describe seasonal differences in a time series is with dummy variables. For quarterly data, a model that includes both trend and seasonal components is

E(Yi) = β0 + β1 i + β2 X1 + β3 X2 + β4 X3
        (trend)   (seasonal component)

where
i = time period, ranging from i = 1 for quarter I of 1991 to i = 16 for quarter IV of 1994
Yi = power load (megawatts) in period i
X1 = 1 if quarter I, 0 if not
X2 = 1 if quarter II, 0 if not
X3 = 1 if quarter III, 0 if not (base level = quarter IV)

The coefficients associated with the seasonal dummy variables give the mean increase (or decrease) in power load for each quarter relative to the base-level quarter, quarter IV.

b. The model is fitted to the data using the SAS multiple regression routine. The resulting SAS printout shows that the model fits the data quite well: R2 = .9972, indicating that the model accounts for 99.7% of the sample variation in power loads over the 4-year period; F = 968.962 strongly supports the hypothesis that the model has predictive utility (p-value = .0001); and the standard deviation, Root MSE = 1.53242, implies that the model predictions will usually be accurate to within approximately ±2(1.53), or about ±3.06 megawatts. Forecasts and corresponding 95% prediction intervals for the 1995 power loads are reported in the bottom portion of the printout. For example, the forecast for power load in quarter I of 1995 is 184.7 megawatts, with 95% prediction interval (180.5, 188.9).
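The trend-plus-dummies model in part (a) can be sketched as a design matrix fed to a least squares solver. The power-load numbers below are invented stand-ins (the chapter's table is not reproduced here), built from base level 60, trend 1 per quarter, and quarterly offsets +5, -10, -6, 0, so the fit recovers those values exactly.

```python
import numpy as np

# Hypothetical 16 quarterly loads (NOT the textbook's data):
# Yi = 60 + 1*i + (+5 if QI, -10 if QII, -6 if QIII, 0 if QIV).
loads = np.array([66, 52, 57, 64, 70, 56, 61, 68,
                  74, 60, 65, 72, 78, 64, 69, 76], dtype=float)
i = np.arange(1, 17)
q = (i - 1) % 4                              # 0,1,2,3 -> quarters I..IV
X = np.column_stack([np.ones(16), i,
                     (q == 0).astype(float),   # X1: quarter I dummy
                     (q == 1).astype(float),   # X2: quarter II dummy
                     (q == 2).astype(float)])  # X3: quarter III (IV = base)
beta, *_ = np.linalg.lstsq(X, loads, rcond=None)   # b0, b1, b2, b3, b4
x_new = np.array([1.0, 17, 1, 0, 0])         # i = 17: quarter I of next year
print(x_new @ beta)                          # point forecast
```

The dummy coefficients b2, b3, b4 are read exactly as the text describes: mean shifts relative to the base quarter, quarter IV. Extending i to 17-20 with the appropriate dummies forecasts all four quarters of the following year.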
Therefore, using a 95% prediction interval, we expect the power load in quarter I of 1995 to fall between 180.5 and 188.9 megawatts. Recall from the table that the actual 1995 quarterly power loads are 181.5, 175.2, 195.0, and 189.3, respectively. Each of these falls within its respective 95% prediction interval.

Many descriptive forecasting techniques have proved their merit by providing good forecasts for particular applications. Nevertheless, the advantage of forecasting using the regression approach is clear: regression analysis provides a measure of reliability for each forecast through prediction intervals. However, two problems are associated with forecasting time series using a multiple regression model.

PROBLEM 1 We are using the least squares prediction equation to forecast values outside the region of observation of the independent variable, i. For example, the forecasts above are for values of i between 17 and 20 (the four quarters of 1995), even though the observed power loads are for i values between 1 and 16. As noted earlier, it is risky to use a least squares regression model for prediction outside the range of the observed data, because some unusual change (economic, political, etc.) may make the model inappropriate for predicting future events. Because forecasting always involves predictions about future values of a time series, this problem obviously cannot be avoided. However, it is important that the forecaster recognize the dangers of this type of prediction.

PROBLEM 2 Recall the standard assumptions made about the random error component of a multiple regression model: the errors have mean 0 and constant variance, are normally distributed, and are independent. The last assumption is often violated in time series that exhibit short-term trends. As an illustration, refer to the plot of the sales revenue data.
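One quick diagnostic for the correlated-errors issue raised in Problem 2 is the lag-1 autocorrelation of the residuals: long positive and negative runs push it well above zero. The residual values and the 0.5 threshold below are purely illustrative, and in practice a formal test such as the Durbin-Watson statistic (not covered in this excerpt) would be used.

```python
# Lag-1 autocorrelation of regression residuals: values near 0 suggest
# independent errors; values well above 0 suggest same-sign runs.
def lag1_autocorr(r):
    mean = sum(r) / len(r)
    dev = [x - mean for x in r]
    num = sum(dev[i] * dev[i + 1] for i in range(len(dev) - 1))
    den = sum(d * d for d in dev)
    return num / den

# Hypothetical residuals showing a positive run, then a negative run.
residuals = [2.1, 1.8, 1.2, 0.4, -0.6, -1.5, -1.9, -1.1, 0.2, 1.4]
print(lag1_autocorr(residuals))
```

For these residuals the statistic comes out well above zero, exactly the "positive and negative runs" pattern the text describes in the sales plot.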
Notice that the observed sales tend to deviate about the least squares line in positive and negative runs: if the difference between the observed and predicted sales in year i is positive (or negative), the difference in year i + 1 tends to be positive (or negative) as well. Since the variation in the yearly sales is systematic, the implication is that the errors are correlated. Violation of this standard regression assumption can lead to unreliable forecasts.

Measuring forecast accuracy (MAD and RMSE)

The forecast error is the actual value of the series at time i minus the forecast value, Yi - Ŷi. It can be used to evaluate the accuracy of a forecast. Two common evaluation measures are the mean absolute deviation and the root mean squared error:

(1) MAD = Σ |Yi - Ŷi| / N
(2) RMSE = sqrt( Σ (Yi - Ŷi)² / N )

Descriptive analysis: Index numbers

Time series data, like other data sets, are subject to two kinds of analyses: descriptive and inferential. The most common way to describe a business or economic time series is to compute index numbers.

INDEX NUMBERS

Definition: An index number measures the change in a variable over time relative to the value of the variable during a specific base period.

The two most important types are price indexes and quantity indexes. Price indexes measure changes in the price of a commodity or a group of commodities over time; the Consumer Price Index (CPI), for example, measures price changes of a group of commodities over time. An index constructed to measure the change in the total number of commodities produced annually is an example of a quantity index; the computations there can be more complicated.
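The two accuracy measures are direct translations of their formulas. This sketch uses hypothetical actual and forecast values; the function names are our own.

```python
import math

# MAD: mean of the absolute forecast errors |Yi - Yi_hat|.
def mad(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# RMSE: square root of the mean squared forecast error.
def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2
                         for a, f in zip(actual, forecast)) / len(actual))

actual   = [100, 110, 120, 130]   # hypothetical observed values
forecast = [ 98, 113, 118, 135]   # hypothetical forecasts
print(mad(actual, forecast), rmse(actual, forecast))
```

To compare two forecasting models, compute both measures for each model over the same holdout periods and prefer the model with the smaller values; RMSE penalizes a few large errors more heavily than MAD does.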