International Journal of Forecasting 22 (2006) 443–473
www.elsevier.com/locate/ijforecast
25 years of time series forecasting
Jan G. De Gooijer a,1, Rob J. Hyndman b,*
a Department of Quantitative Economics, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands
b Department of Econometrics and Business Statistics, Monash University, VIC 3800, Australia
* Corresponding author. Tel.: +61 3 9905 2358; fax: +61 3 9905 5474.
E-mail addresses: j.g.degooijer@uva.nl (J.G. De Gooijer), Rob.Hyndman@buseco.monash.edu.au (R.J. Hyndman).
1 Tel.: +31 20 525 4244; fax: +31 20 525 4349.
0169-2070/$ - see front matter © 2006 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
doi:10.1016/j.ijforecast.2006.01.001
Abstract
We review the past 25 years of research into time series forecasting. In this silver jubilee issue, we naturally highlight results
published in journals managed by the International Institute of Forecasters (Journal of Forecasting 1982–1985 and
International Journal of Forecasting 1985–2005). During this period, over one third of all papers published in these journals
concerned time series forecasting. We also review highly influential works on time series forecasting that have been published
elsewhere during this period. Enormous progress has been made in many areas, but we find that there are a large number of
topics in need of further development. We conclude with comments on possible future research directions in this field.
© 2006 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
Keywords: Accuracy measures; ARCH; ARIMA; Combining; Count data; Densities; Exponential smoothing; Kalman filter; Long memory;
Multivariate; Neural nets; Nonlinearity; Prediction intervals; Regime-switching; Robustness; Seasonality; State space; Structural models;
Transfer function; Univariate; VAR
1. Introduction
The International Institute of Forecasters (IIF) was
established 25 years ago and its silver jubilee provides
an opportunity to review progress on time series
forecasting. We highlight research published in
journals sponsored by the Institute, although we also
cover key publications in other journals. In 1982, the
IIF set up the Journal of Forecasting (JoF), published
with John Wiley and Sons. After a break with Wiley
in 1985,² the IIF decided to start the International
Journal of Forecasting (IJF), published with Elsevier
since 1985. This paper provides a selective guide to
the literature on time series forecasting, covering the
period 1982–2005 and summarizing over 940 papers
including about 340 papers published under the
“IIF-flag”. The proportion of papers that concern time
series forecasting has been fairly stable over time. We
also review key papers and books published elsewhere
that have been highly influential to various
developments in the field. The works referenced
comprise 380 journal papers and 20 books and
monographs.
² The IIF was involved with JoF issue 4:1 (1985).
We found it convenient to classify the papers first
according to the models (e.g., exponential
smoothing, ARIMA) introduced in the time series
literature, rather than putting papers under a heading
associated with a particular method. For instance,
Bayesian methods in general can be applied to all
models. Papers not concerning a particular model
were then classified according to the various problems
(e.g., accuracy measures, combining) they address. In
only a few cases was a subjective decision needed on
our part to classify a paper under a particular section
heading. To facilitate a quick overview in a particular
field, the papers are listed in alphabetical order under
each of the section headings.
Determining what to include and what not to
include in the list of references has been a problem.
There may be papers that we have missed and papers
that are also referenced by other authors in this Silver
Anniversary issue. As such the review is somewhat
“selective”, although this does not imply that a
particular paper is unimportant if it is not reviewed.
The review is not intended to be critical, but rather
a (brief) historical and personal tour of the main
developments. Still, a cautious reader may detect
certain areas where the fruits of 25 years of intensive
research interest have been limited. Conversely, clear
explanations for many previously anomalous time
series forecasting results have been provided by the
end of 2005. Section 13 discusses some current
research directions that hold promise for the future,
but of course the list is far from exhaustive.
2. Exponential smoothing
2.1. Preamble
Twenty-five years ago, exponential smoothing
methods were often considered a collection of ad
hoc techniques for extrapolating various types of
univariate time series. Although exponential smoothing
methods were widely used in business and
industry, they had received little attention from
statisticians and did not have a well-developed
statistical foundation. These methods originated in
the 1950s and 1960s with the work of Brown (1959,
1963), Holt (1957, reprinted 2004), and Winters
(1960). Pegels (1969) provided a simple but useful
classification of the trend and the seasonal patterns
depending on whether they are additive (linear) or
multiplicative (nonlinear).
Muth (1960) was the first to suggest a statistical
foundation for simple exponential smoothing (SES)
by demonstrating that it provided the optimal fore-
casts for a random walk plus noise. Further steps
towards putting exponential smoothing within a
statistical framework were provided by Box and
Jenkins (1970), Roberts (1982), and Abraham and
Ledolter (1983, 1986), who showed that some linear
exponential smoothing forecasts arise as special cases
of ARIMA models. However, these results did not
extend to any nonlinear exponential smoothing
methods.
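
The best-known instance of this link is that SES produces the same one-step forecasts as an ARIMA(0,1,1) model whose moving average parameter is theta = 1 - alpha. The following is a minimal sketch of that equivalence, not code from any of the cited works; the function names, the starting value, and the example data are assumptions chosen for illustration.

```python
def ses_forecasts(y, alpha, level0):
    """One-step-ahead SES forecasts using the weighted-average form."""
    level = level0
    forecasts = []
    for obs in y:
        forecasts.append(level)                       # forecast made before seeing obs
        level = alpha * obs + (1.0 - alpha) * level   # update the smoothed level
    forecasts.append(level)                           # forecast for the next, unseen period
    return forecasts


def arima011_forecasts(y, theta, forecast0):
    """One-step-ahead forecasts from y_t = y_{t-1} + e_t - theta * e_{t-1}."""
    forecast = forecast0
    forecasts = []
    for obs in y:
        forecasts.append(forecast)
        error = obs - forecast                        # one-step forecast error e_t
        forecast = obs - theta * error                # next forecast: y_t - theta * e_t
    forecasts.append(forecast)
    return forecasts


# Hypothetical data, used only to compare the two recursions.
series = [12.0, 15.0, 11.0, 14.0, 18.0, 16.0]
alpha = 0.3
from_ses = ses_forecasts(series, alpha, level0=series[0])
from_arima = arima011_forecasts(series, theta=1.0 - alpha, forecast0=series[0])
max_gap = max(abs(a - b) for a, b in zip(from_ses, from_arima))
```

Both recursions reduce to forecast = alpha * y_t + (1 - alpha) * previous forecast, so `max_gap` is zero up to floating-point rounding when the two are started from the same value.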
Exponential smoothing methods received a boost
from two papers published in 1985, which laid the
foundation for much of the subsequent work in this
area. First, Gardner (1985) provided a thorough
review and synthesis of work in exponential smooth-
ing to that date and extended Pegels’ classification to
include damped trend. This paper brought together a
lot of existing work which stimulated the use of these
methods and prompted a substantial amount of
additional research. Later in the same year, Snyder
(1985) showed that SES could be considered as
arising from an innovation state space model (i.e., a
model with a single source of error). Although this
insight went largely unnoticed at the time, in recent
years it has provided the basis for a large amount of
work on state space models underlying exponential
smoothing methods.
Most of the work since 1980 has involved studying
the empirical properties of the methods (e.g.,
Bartolomei & Sweet, 1989; Makridakis & Hibon, 1991),
proposing new methods of estimation or
initialization (Ledolter & Abraham, 1984), evaluating the
forecasts (McClain, 1988; Sweet & Wilson, 1988), or
examining statistical models that can be
considered to underlie the methods (e.g., McKenzie, 1984).
The damped multiplicative methods of Taylor (2003)
provide the only genuinely new exponential smooth-
ing methods over this period. There have, of course,
been numerous studies applying exponential smoothing
methods in various contexts including computer
components (Gardner, 1993), air passengers (Grubb &
Masa, 2001), and production planning (Miller &
Liberatore, 1993).
The Hyndman, Koehler, Snyder, and Grose (2002)
taxonomy (extended by Taylor, 2003) provides a
helpful categorization for describing the various
methods. Each method consists of one of five types
of trend (none, additive, damped additive, multiplica-
tive, and damped multiplicative) and one of three
types of seasonality (none, additive, and multiplica-
tive). Thus, there are 15 different methods, the best
known of which are SES (no trend, no seasonality),
Holt’s linear method (additive trend, no seasonality),
Holt–Winters’ additive method (additive trend, addi-
tive seasonality), and Holt–Winters’ multiplicative
method (additive trend, multiplicative seasonality).
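
The count of 15 follows from crossing the five trend types with the three seasonal types. A small sketch of that enumeration is given below; the variable names and the dictionary of named methods are illustrative assumptions, with the labels taken from the paragraph above.

```python
from itertools import product

# Trend and seasonal types as listed in the taxonomy described above.
TRENDS = ["none", "additive", "damped additive",
          "multiplicative", "damped multiplicative"]
SEASONALS = ["none", "additive", "multiplicative"]

# Every (trend, seasonality) pair defines one exponential smoothing method.
methods = list(product(TRENDS, SEASONALS))

# The best-known methods, keyed by their (trend, seasonality) pair.
NAMED_METHODS = {
    ("none", "none"): "SES",
    ("additive", "none"): "Holt's linear method",
    ("additive", "additive"): "Holt-Winters' additive method",
    ("additive", "multiplicative"): "Holt-Winters' multiplicative method",
}

for trend, seasonal in methods:
    label = NAMED_METHODS.get((trend, seasonal), "(no common name)")
    print(f"trend={trend:22s} seasonality={seasonal:15s} {label}")
```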
2.2. Variations
Numerous variations on the original methods have
been proposed. For example, Carreno and Madina-
veitia (1990) and Williams and Miller (1999) pro-
posed modifications to deal with discontinuities, and
Rosas and Guerrero (1994) looked at exponential
smoothing forecasts subject to one or more con-
straints. There are also variations in how and when
seasonal components should be normalized. Lawton
(1998) argued for renormalization of the seasonal
indices at each time period, as it removes bias in
estimates of level and seasonal components. Slightly
different normalization schemes were given by
Roberts (1982) and McKenzie (1986). Archibald
and Koehler (2003) developed new renormalization
equations that are simpler to use and give the same
point forecasts as the original methods.
One useful variation, part way between SES and
Holt’s method, is SES with drift. This is equivalent to
Holt’s method with the trend parameter set to zero.
Hyndman and Billah (2003) showed that this method
was also equivalent to Assimakopoulos and
Nikolopoulos’ (2000) “Theta method” when the drift
parameter is set to half the slope of a linear trend fitted to the
data. The Theta method performed extremely well in
the M3-competition, although why this particular
choice of model and parameters is good has not yet
been determined.
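
To make the idea concrete, the sketch below implements SES with a fixed drift term and computes the least-squares slope from which a Theta-style drift (half the slope) could be obtained. It is an illustration under stated assumptions rather than the procedure of any cited paper: the error-correction form of the recursion, the function names, and the starting level are all chosen here for the example.

```python
def linear_trend_slope(y):
    """Least-squares slope of y regressed on time index 0, 1, ..., n-1."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    cov = sum((t - t_mean) * (obs - y_mean) for t, obs in enumerate(y))
    var = sum((t - t_mean) ** 2 for t in range(n))
    return cov / var


def ses_with_drift(y, alpha, drift, level0, horizon):
    """SES with a constant drift; returns forecasts for 1..horizon steps ahead."""
    level = level0
    for obs in y:
        error = obs - (level + drift)          # one-step forecast error
        level = level + drift + alpha * error  # level grows by drift, corrected by error
    return [level + h * drift for h in range(1, horizon + 1)]


# Hypothetical series; the drift is set to half the fitted slope,
# as in the equivalence with the Theta method described above.
series = [10.0, 12.5, 13.0, 15.5, 17.0, 18.5]
theta_style_drift = 0.5 * linear_trend_slope(series)
forecasts = ses_with_drift(series, alpha=0.4, drift=theta_style_drift,
                           level0=series[0], horizon=3)
```

With `drift=0.0` the function reduces to ordinary SES and the forecasts are flat at the last smoothed level.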
There has been remarkably little work in developing
multivariate versions of the exponential smoothing
methods for forecasting. One notable exception is
Pfeffermann and Allon (1989) who looked at Israeli
tourism data. Multivariate SES is used for process
control charts (e.g., Pan, 2005), where it is called
“multivariate exponentially weighted moving averages”,
but here the focus is not on forecasting.
2.3. State space models
Ord, Koehler, and Snyder (1997) built on the work
of Snyder (1985) by proposing a class of innovation
state space models which can be considered as
underlying some of the exponential smoothing meth-
ods. Hyndman et al. (2002) and Taylor (2003)
extended this to include all of the 15 exponential
smoothing methods. In fact, Hyndman et al. (2002)
proposed two state space models for each method,
corresponding to the additive error and the multipli-
cative error cases. These models are not unique and
other related state space models for exponential
smoothing methods are presented in Koehler, Snyder,
and Ord (2001) and Chatfield, Koehler, Ord, and
Snyder (2001). It has long been known that some
ARIMA models give equivalent forecasts to the linear
exponential smoothing methods. The significance of
the recent work on innovation state space models is
that the nonlinear exponential smoothing methods can
also be derived from statistical models.
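
For the simplest case, SES with additive errors, an innovation state space model can be written as y_t = l_{t-1} + e_t with state update l_t = l_{t-1} + alpha * e_t, so the same error drives both equations. The sketch below simulates from that model and then recovers the errors from the data alone. The Gaussian error distribution, the function names, and the parameter values are assumptions made for the illustration.

```python
import random


def simulate_local_level(alpha, level0, sigma, n, seed=0):
    """Simulate y_t = l_{t-1} + e_t, l_t = l_{t-1} + alpha * e_t (single error source)."""
    rng = random.Random(seed)
    level = level0
    observations, errors = [], []
    for _ in range(n):
        e = rng.gauss(0.0, sigma)   # assumed Gaussian innovation
        observations.append(level + e)
        errors.append(e)
        level = level + alpha * e   # the same innovation updates the state
    return observations, errors


def recover_innovations(observations, alpha, level0):
    """Run the SES error-correction recursion to recover the innovations."""
    level = level0
    recovered = []
    for obs in observations:
        e = obs - level             # one-step forecast error
        recovered.append(e)
        level = level + alpha * e
    return recovered


ys, true_errors = simulate_local_level(alpha=0.3, level0=50.0, sigma=2.0, n=200)
est_errors = recover_innovations(ys, alpha=0.3, level0=50.0)
max_gap = max(abs(a - b) for a, b in zip(true_errors, est_errors))
```

Because there is only one source of error, the SES recursion reproduces the simulated innovations (up to rounding) when alpha and the initial level are known. Drawing fresh innovations from the final state in the same way gives simulated future sample paths.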
2.4. Method selection
Gardner and McKenzie (1988) provided some
simple rules based on the variances of differenced
time series for choosing an appropriate exponential
smoothing method. Tashman and Kruk (1996) com-
pared these rules with others proposed by Collopy and
Armstrong (1992) and an approach based on the BIC.
Hyndman et al. (2002) also proposed an information
criterion approach, but using the underlying state
space models.
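
As a rough illustration of a variance-based rule, the sketch below differences a series zero, one, and two times and reports the order with the smallest sample variance. It is a simplified sketch only: the restriction to nonseasonal differences, the function names, and the idea of reading the winning order as a hint about the trend type are assumptions for this example, not the published rules.

```python
from statistics import variance


def difference(y, order):
    """Apply first differencing `order` times."""
    out = list(y)
    for _ in range(order):
        out = [b - a for a, b in zip(out[:-1], out[1:])]
    return out


def min_variance_order(y, max_order=2):
    """Return the differencing order (0..max_order) with the smallest sample variance."""
    variances = [variance(difference(y, d)) for d in range(max_order + 1)]
    # min() keeps the lowest order when variances tie.
    return min(range(max_order + 1), key=lambda d: variances[d])


# Hypothetical series with no trend, a linear trend, and a quadratic trend.
flat = [5.0, 6.0, 5.0, 6.0, 5.0, 6.0]
linear = [2.0, 5.0, 8.0, 11.0, 14.0]
quadratic = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0]
orders = [min_variance_order(s) for s in (flat, linear, quadratic)]
```

An order of 0 suggests a method without trend, while higher orders point toward methods with a trend component. A practical rule would also need to consider seasonal differences.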
2.5. Robustness
The remarkably good forecasting performance of
exponential smoothing methods has been addressed
by several authors. Satchell and Timmermann (1995)
and Chatfield et al. (2001) showed that SES is optimal
for a wide range of data generating processes. In a
small simulation study, Hyndman (2001) showed that
simple exponential smoothing performed better than
first order ARIMA models because it is not so subject
to model selection problems, particularly when data
are non-normal.
2.6. Prediction intervals
One of the criticisms of exponential smoothing
methods 25 years ago was that there was no way to
produce prediction intervals for the forecasts. The first
analytical approach to this problem was to assume that
the series were generated by deterministic functions of
time plus white noise (Brown, 1963; Gardner, 1985;
McKenzie, 1986; Sweet, 1985). If this were so, a
regression model should be used rather than
exponential smoothing methods; thus, Newbold and Bos
(1989) strongly criticized all approaches based on this
assumption.
Other authors sought to obtain prediction intervals
via the equivalence between exponential smoothing
methods and statistical models. Johnston and Harrison
(1986) found forecast variances for the simple and
Holt exponential smoothing methods for state space
models with multiple sources of errors. Yar and
Chatfield (1990) obtained prediction intervals for the
additive Holt–Winters’ method by deriving the
underlying equivalent ARIMA model. Approximate
prediction intervals for the multiplicative
Holt–Winters’ method were discussed by Chatfield and Yar
(1991), making the assumption that the
one-step-ahead forecast errors are independent. Koehler et al.
(2001) also derived an approximate formula for the
forecast variance for the multiplicative Holt–Winters’
method, differing from Chatfield and Yar (1991) only
in how the standard deviation of the one-step-ahead
forecast error is estimated.
Ord et al. (1997) and Hyndman et al. (2002) used
the underlying innovation state space model to
simulate future sample paths, and thereby obtained
prediction intervals for all the exponential smoothing
methods. Hyndman, Koehler, Ord, and Snyder
(2005) used state space models to derive analytical
prediction intervals for 15 of the 30 methods,
including all the commonly used methods. They
provide the most comprehensive algebraic approach
to date for handling the prediction distribution
problem for the majority of exponential smoothing
methods.
2.7. Parameter space and model properties
It is common practice to restrict the smoothing
parameters to the range 0 to 1. However, now that
underlying statistical models are available, the natural
(invertible) parameter space for the models can be
used instead. Archibald (1990) showed that it is
possible for smoothing parameters within the usual
intervals to produce non-invertible models. In that
case, changes in past values of the series have a
non-negligible impact on the forecasts. Intuitively,
such parameters produce poor forecasts and the
forecast performance deteriorates. Lawton (1998) also
discussed this problem.
3. ARIMA models
3.1. Preamble
Early attempts to study time series, particularly in
the 19th century, were generally characterized by the
idea of a deterministic world. It was the major
contribution of Yule (1927) which launched the notion
of stochasticity in time series by postulating that every
time series can be regarded as the realization of a
stochastic process. Based on this simple idea, a
number of time series methods have been developed
since then. Workers such as Slutsky, Walker, Yaglom,
and Yule first formulated the concept of autoregressive
(AR) and moving average (MA) models. Wold’s
decomposition theorem led to the formulation and
solution of the linear forecasting problem of
Kolmogorov (1941). Since then, a considerable body of
literature has appeared in the area of time series,
dealing with parameter estimation, identification,
model checking, and forecasting; see, e.g., Newbold
(1983) for an early survey.
The publication Time Series Analysis: Forecasting
and Control by Box and Jenkins (1970)³ integrated
the existing knowledge. Moreover, these authors
developed a coherent, versatile three-stage iterative
cycle for time series identification, estimation, and
verification (rightly known as the Box–Jenkins
approach). The book has had an enormous impact
on the theory and practice of modern time series
analysis and forecasting. With the advent of the
computer, it popularized the use of autoregressive
integrated moving average (ARIMA) models and their
extensions in many areas of science. Indeed,
forecasting discrete time series processes through univariate
ARIMA models, transfer function (dynamic
regression) models, and multivariate (vector) ARIMA
models has generated quite a few IJF papers. Often
these studies were of an empirical nature, using one or
more benchmark methods/models as a comparison.
Without pretending to be complete, Table 1 gives a list
of these studies. Naturally, some of these studies are
more successful than others. In all cases, the
forecasting experiences reported are valuable. They
have also been the key to new developments, which
may be summarized as follows.
³ The book by Box, Jenkins, and Reinsel (1994) with Gregory
Reinsel as a new co-author is an updated version of the “classic”
Box and Jenkins (1970) text. It includes new material on
intervention analysis, outlier detection, testing for unit roots, and
process control.
3.2. Univariate
The success of the Box–Jenkins methodology is
founded on the fact that the various models can,
between them, mimic the behaviour of diverse types
of series—and do so adequately without usually
requiring very many parameters to be estimated in
the final choice of the model. However, in the mid-
sixties, the selection of a model was very much a
matter of the researcher’s judgment; there was no
algorithm to specify a model uniquely. Since then,
Table 1
A list of examples of real applications
Dataset | Forecast horizon | Benchmark | Reference
Univariate ARIMA
Electricity load (min) | 1–30 min | Wiener filter | Di Caprio, Genesio, Pozzi, and Vicino (1983)
Quarterly automobile insurance paid claim costs | 8 quarters | Log-linear regression | Cummins and Griepentrog (1985)
Daily federal funds rate | 1 day | Random walk | Hein and Spudeck (1988)
Quarterly macroeconomic data | 1–8 quarters | Wharton model | Dhrymes and Peristiani (1988)
Monthly department store sales | 1 month | Simple exponential smoothing | Geurts and Kelly (1986, 1990), Pack (1990)
Monthly demand for telephone services | 3 years | Univariate state space | Grambsch and Stahel (1990)
Yearly population totals | 20–30 years | Demographic models | Pflaumer (1992)
Monthly tourism demand | 1–24 months | Univariate state space, multivariate state space | du Preez and Witt (2003)
Dynamic regression/transfer function
Monthly telecommunications traffic | 1 month | Univariate ARIMA | Layton, Defris, and Zehnwirth (1986)
Weekly sales data | 2 years | n.a. | Leone (1987)
Daily call volumes | 1 week | Holt–Winters | Bianchi, Jarrett, and Hanumara (1998)
Monthly employment levels | 1–12 months | Univariate ARIMA | Weller (1989)
Monthly and quarterly consumption of natural gas | 1 month/1 quarter | Univariate ARIMA | Liu and Lin (1991)
Monthly electricity consumption | 1–3 years | Univariate ARIMA | Harris and Liu (1993)
VARIMA
Yearly municipal budget data | Yearly (in-sample) | Univariate ARIMA | Downs and Rocke (1983)
Monthly accounting data | 1 month | Regression, univariate, ARIMA, transfer function | Hillmer, Larcker, and Schroeder (1983)
Quarterly macroeconomic data | 1–10 quarters | Judgmental methods, univariate ARIMA | Öller (1985)
Monthly truck sales | 1–13 months | Univariate ARIMA, Holt–Winters | Heuts and Bronckers (1988)
Monthly hospital patient movements | 2 years | Univariate … | …
Quarterly unemployment rate | 1–8 quarters | Transfer f… | …