Geostatistical approaches for incorporating elevation into the
spatial interpolation of rainfall
P. Goovaerts*
Department of Civil and Environmental Engineering, The University of Michigan, Ann Arbor, MI 48109-2125, USA
Received 11 February 1999; received in revised form 11 November 1999; accepted 16 December 1999
Abstract
This paper presents three multivariate geostatistical algorithms for incorporating a digital elevation model into the spatial
prediction of rainfall: simple kriging with varying local means; kriging with an external drift; and colocated cokriging. The
techniques are illustrated using annual and monthly rainfall observations measured at 36 climatic stations in a 5000 km2 region
of Portugal. Cross validation is used to compare the prediction performances of the three geostatistical interpolation algorithms
with the straightforward linear regression of rainfall against elevation and three univariate techniques: the Thiessen polygon;
inverse square distance; and ordinary kriging.
Larger prediction errors are obtained for the two algorithms (inverse square distance, Thiessen polygon) that ignore both the
elevation and rainfall records at surrounding stations. The three multivariate geostatistical algorithms outperform other inter-
polators, in particular the linear regression, which stresses the importance of accounting for spatially dependent rainfall
observations in addition to the colocated elevation. Last, ordinary kriging yields more accurate predictions than linear regres-
sion when the correlation between rainfall and elevation is moderate (less than 0.75 in the case study). q 2000 Elsevier Science
B.V. All rights reserved.
Keywords: Rainfall; DEM; Multivariate geostatistics; Kriging
1. Introduction
Measured rainfall data are important to many
problems in hydrologic analysis and designs. For
example the ability of obtaining high resolution esti-
mates of spatial variability in rainfall fields becomes
important for identification of locally intense storms
which could lead to floods and especially to flash
floods. The accurate estimation of the spatial distribu-
tion of rainfall requires a very dense network of
instruments, which entails large installation and
operational costs. Also, vandalism or the failure of
the observer to make the necessary visit to the gage
may result in even lower sampling density. It is thus
necessary to estimate point rainfall at unrecorded
locations from values at surrounding sites.
A number of methods have been proposed for the
interpolation of rainfall data. The simplest approach
consists of assigning to the unsampled location the
record of the closest gage (Thiessen, 1911). This
method amounts at drawing around each gage a poly-
gon of influence with the boundaries at a distance
halfway between gage pairs, hence the name Thiessen
polygon for the technique. Although the Thiessen
polygon method is essentially used for estimation of
areal rainfall (McCuen, 1998), it has also been applied
to the interpolation of point measurements (Creutin
Journal of Hydrology 228 (2000) 113–129
0022-1694/00/$ - see front matter q 2000 Elsevier Science B.V. All rights reserved.
PII: S0022-1694(00)00144-X
www.elsevier.com/locate/jhydrol
* Tel.: 11-734-936-0141; fax: 11-734-763-2275.
E-mail address: goovaert@engin.umich.edu (P. Goovaerts).
and Obled, 1982; Tabios and Salas, 1985; Dirks et al.,
1998). In 1972, the US National Weather Service has
developed another method whereby the unknown
rainfall depth is estimated as a weighted average of
surrounding values, the weights being reciprocal to
the square distances from the unsampled location
(Bedient and Huber, 1992, p. 25). Like the Thiessen
polygon method, the inverse square distance tech-
nique does not allow the hydrologist to consider
factors, such as topography, that can affect the catch
at a gage. The isohyetal method (McCuen, 1998,
p. 190) is designed to overcome this deficiency. The
idea is to use the location and catch for each gage, as
well as knowledge of the factors affecting these
catches, to draw lines of equal rainfall depth (isohyets).
The amount of rainfall at the unsampled location is
then estimated by interpolation within the isohyets. A
limitation of the technique is that an extensive gage
network is required to draw isohyets accurately.
Geostatistics, which is based on the theory of
regionalized variables (Journel and Huijbregts,
1978; Goovaerts, 1997, 1999), is increasingly
preferred because it allows one to capitalize on the
spatial correlation between neighboring observations
to predict attribute values at unsampled locations.
Several authors (Tabios and Salas, 1985; Phillips et
al., 1992) have shown that the geostatistical prediction
technique (kriging) provides better estimates of rain-
fall than conventional methods. Recently, Dirks et al.
(1998) found that the results depend on the sampling
density and that, for high-resolution networks (e.g. 13
raingages over a 35 km2 area), the kriging method
does not show significantly greater predictive skill
than simpler techniques, such as the inverse square
distance method. Similar results were found by
Borga and Vizzaccaro (1997) when they compared
kriging and multiquadratic surface fitting for various
gage densities. In fact, besides providing a measure of
P. Goovaerts / Journal of Hydrology 228 (2000) 113–129114
Fig. 1. Location of the study area and positions of the 36 climatic stations.
prediction error (kriging variance), a major advantage
of kriging over simpler methods is that the sparsely
sampled observations of the primary attribute can be
complemented by secondary attributes that are more
densely sampled. For rainfall, secondary information
can take the form of weather-radar observations. A
multivariate extension of kriging, known as cokriging,
has been used for merging raingage and radar-rainfall
data (Creutin et al., 1988; Azimi-Zonooz et al., 1989).
Raspa et al. (1997) used another geostatistical tech-
nique, kriging with an external drift, to combine both
types of information. In this paper, another valuable
and cheaper source of secondary information is
considered: digital elevation model (DEM). Precipi-
tation tends to increase with increasing elevation,
mainly because of the orographic effect of moun-
tainous terrain, which causes the air to be lifted verti-
cally, and the condensation occurs due to adiabatic
cooling. For example Hevesi et al. (1992a,b) reported
a significant 0.75 correlation between average annual
precipitation and elevation recorded at 62 stations in
Nevada and southeastern California. In their paper,
they used a multivariate version of kriging, called
cokriging, to incorporate elevation into the mapping
of rainfall. A more straightforward approach consists
of estimating rainfall at a DEM grid cell through a
regression of rainfall versus elevation (Daly et al.,
1994).
In this paper, annual and monthly rainfall data from
the Algarve region (Portugal) are interpolated using
two types of techniques: (1) methods that use only
rainfall data recorded at 36 stations (the Thiessen
polygon, inverse square distance, and ordinary
kriging); and (2) algorithms that combine rainfall
data with a digital elevation model (linear regression,
simple kriging with varying local means, kriging with
an external drift, colocated ordinary cokriging).
Prediction performances of the different algorithms are
compared using cross validation and are related to the
strength of the correlation between rainfall and elevation,
and the pattern of spatial dependence of rainfall.
2. Case study
The Algarve is the most southern region of Portu-
gal, with an area of approximately 5000 km2. Fig. 1
shows the location of 36 daily read raingage stations
used in this study. The monthly and annual rainfall
depths have been averaged over the period of January
1970–March 1995, and basic sample statistics (mean,
standard deviation, minimum, maximum) are given in
Table 1. The subsequent analysis will be conducted on
these averaged data, hence fluctuations of monthly
and annual precipitations from one year to another
will not be investigated.
P. Goovaerts / Journal of Hydrology 228 (2000) 113–129 115
Table 1
Statistics for the monthly and annual rainfall data (36 observations). The last three columns give the linear correlation coefficient between
rainfall and elevation, and the mean absolute error (MAE) and mean square error (MSE) of prediction of rainfall by linear regression of
elevation
Period Rainfall (mm) Correlation MAE MSE
Mean Std. dev. Min. Max.
Jan 69.9 24.7 37.6 137.0 0.69 13.9 317
Feb 58.0 25.7 27.4 146.6 0.75 12.5 287
Mar 32.7 11.4 17.2 73.9 0.80 5.05 47.4
Apr 42.1 17.3 17.9 105.9 0.74 8.91 135
May 21.0 9.8 7.2 54.6 0.83 4.30 30.1
Jun 8.1 3.7 3.2 16.8 0.83 1.71 4.33
Jul 1.0 0.8 0.0 3.2 0.39 0.66 0.61
Aug 2.0 1.2 0.0 5.3 0.33 0.85 1.21
Sep 12.1 4.1 5.6 22.8 0.75 2.24 7.56
Oct 59.6 16.1 32.2 111.0 0.76 8.38 111
Nov 79.3 22.0 39.4 148.7 0.72 11.1 231
Dec 96.3 33.7 44.4 183.0 0.71 19.9 562
Annual 482.1 159.8 259.5 1005 0.79 76.2 9774
Another source of information is the elevation map
shown in Fig. 2. Each grid cell represents 1 km2 and
its elevation was computed as the average of the
elevations at 4 discrete points within the cell. The
relief is dominated by the two main Algarve’s moun-
tains: the Monchique (left) and the Caldeira˜o (right).
Table 1 indicates that the correlation between rainfall
and elevation ranges from 0.33 to 0.83, hence it seems
worth accounting for this exhaustive secondary infor-
mation into the mapping of rainfall. The control of
elevation on the spatial distribution of rainfall
explains the moderate to strong correlation between
monthly rainfall data, see Table 2. Apart from the two
dry months of July and August the correlation ranges
from 0.50 to 0.97. The correlation coefficients of
Table 2 have been averaged as a function of the
time lag between months, excluding July and August.
Fig. 3 shows that the average correlation between
rainfall measured over two consecutive months
lag
1 is 0.9 and slightly decreases to 0.8 as the separation
time increases.
3. Interpolation procedures
This section briefly introduces the different esti-
mators used in the case study. Interested readers
should refer to Goovaerts (1997) for a detailed presen-
tation of the different kriging algorithms, and Deutsch
and Journel (1998) for their implementation in the
public-domain Geostatistical Software Library
(Gslib).
3.1. Univariate estimation
Consider first the problem of estimating the rainfall
depth z at an unsampled location u using only rainfall
data. Let {z
ua;a 1;…; n} be the set of rainfall
data measured at n 36 locations ua .
The most straightforward approach is the Thiessen
P. Goovaerts / Journal of Hydrology 228 (2000) 113–129116
Fig. 2. Digital elevation model.
Table 2
Matrix of linear correlation coefficients between monthly rainfall data (36 observations)
Period Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Jan 1.00 0.97 0.86 0.85 0.84 0.78 0.23 0.31 0.58 0.88 0.94 0.94
Feb 1.00 0.92 0.91 0.87 0.81 0.22 0.42 0.64 0.92 0.96 0.93
Mar 1.00 0.96 0.89 0.80 0.20 0.47 0.71 0.89 0.89 0.81
Apr 1.00 0.91 0.82 0.20 0.42 0.71 0.91 0.89 0.78
May 1.00 0.88 0.33 0.36 0.75 0.90 0.85 0.78
Jun 1.00 0.38 0.34 0.77 0.84 0.79 0.72
Jul 1.00 20.04 0.26 0.12 0.24 0.34
Aug 1.00 0.46 0.37 0.35 0.31
Sep 1.00 0.75 0.65 0.50
Oct 1.00 0.89 0.81
Nov 1.00 0.91
Dec 1.00
Fig. 3. Average correlation between monthly rainfall data measured
at increasing time intervals: 1–5 months.
polygon method whereby the value of the closest
observation is simply assigned to u:
zpPol
u z
ua 0 with uu 2 ua 0 u , uu 2 uau ;a ± a 0:
1
Fig. 4 (2nd graph) shows the map of annual rainfall
interpolated at the nodes of a 1 £ 1 km2 grid corre-
sponding to the resolution of the elevation model. The
map displays the characteristic polygonal zones of
influence around the 36 gages.
To avoid unrealistic patchy maps, the depth z can be
estimated as a linear combination of several surround-
ing observations, with the weights being inversely
proportional to the square distance between obser-
vations and u:
zpInv
u 1Xn
u
a1
la
u
Xn
u
a1
la
uz
ua
with la
u 1
uu 2 uau
2
2
Fig. 4 (3rd graph) shows the map of annual rainfall
produced using the inverse square distance method
and n
u 16 surrounding observations.
The basic idea behind the weighting scheme (2) is
that observations that are close to each other on the
ground tend to be more alike than those further apart,
hence observations closer to u should receive a larger
weight. Instead of the Euclidian distance, geostatistics
uses the semivariogram as a measure of dissimilarity
between observations. The experimental semivario-
gram g^
h is computed as half the average squared
difference between the components of data pairs:
g^
h 1
2N
h
XN
h
a1
z
ua2 z
ua 1 h2;
3
where N(h) is the number of pairs of data locations a
vector h apart. The semivariogram is a function of
both the distance and direction, and so it can account
for direction-dependent variability (anisotropic spatial
pattern).
Fig. 5 (top graph) shows the semivariogram of
annual rainfall computed from the 36 data of Fig. 4.
Because of the lack of data only the omnidirectional
semivariogram was computed, and hence the spatial
variability is assumed to be identical in all directions.
Semivariogram values increase with the separation
distance, reflecting our intuitive feeling that two rain-
fall data close to each other on the ground are more
alike, and thus their squared difference is smaller, than
those that are further apart. The semivariogram
reaches a maximum at 25 km before dipping and fluc-
tuating around a sill value. The so-called “hole effect”
typically reflects pseudo-periodic or cyclic phenom-
ena (Journel and Huijbregts, 1978, p. 403). Here, the
P. Goovaerts / Journal of Hydrology 228 (2000) 113–129 117
Fig. 4. Annual rainfall maps obtained by interpolation of 36 obser-
vations (top map) using the Thiessen polygon, inverse square
distance, and ordinary kriging.
hole effect relates to the existence of two mountains
40 km apart (recall Fig. 2) which creates two high-
valued areas in the rainfall field.
Kriging is a generalized least-square regression
technique that allows one to account for the spatial
dependence between observations, as revealed by the
semivariogram, into spatial prediction. Most of geos-
tatistics is based on the concept of a random function,
whereby the set of unknown values is regarded as a set
of spatially dependent random variables. Each
measurement z
ua is thus interpreted as a particular
realization of a random variable Z
ua. Interested
readers should refer to textbooks such as Isaaks and
Srivastava (1989, pp. 196–236) or Goovaerts (1997,
pp. 59–74) for a detailed presentation of the theory of
random functions. Like the inverse square distance
method, geostatistical interpolation amounts at esti-
mating the unknown rainfall depth z at the unsampled
location u as a linear combination of neighboring
observations:
zpOK
u
Xn
u
a1
lOKa
uz
ua with
Xn
u
a1
lOKa
u 1:
4
The ordinary kriging weights lOKa
u are determined
such as to minimize the estimation variance,
Var{ZpOK
u2 Z
u}; while ensuring the unbiased-
ness of the estimator, E{ZpOK
u2 Z
u} 0: These
weights are obtained by solving a system of linear
equations which is known as “ordinary kriging
system”:
Xn
u
b1
lb
ug
ua 2 ub2 m
u g
ua 2 u a 1;…; n
u
Xn
u
b1
lb
u 1
8>>>>><>>>>>:
5
where m (u) is the Lagrange parameter accounting for
the constraint on the weights. The only information
required by the kriging system (5) are semivariogram
values for different lags, and these are readily derived
once a semivariogram model has been fitted to experi-
mental values. Fig. 5 shows three different types of
permissible models that are combined with a nugget-
effect model for the fitting of the experimental semi-
variogram of annual rainfall:
1. Spherical model with range a
g
h 1:5
h
a
2 0:5 h
a
� �3
if h # a
1 otherwise
8><>:
6
P. Goovaerts / Journal of Hydrology 228 (2000) 113–129118
Fig. 5. Experimental semivariogram of the annual rainfall with three
different permissible models fitted.
2. Cubic model with range a
g
h
7
h
a
� �2
28:75 h
a
� �3
13:5 h
a
� �5
20:75 h
a
� �7
if h # a
1 otherwise
8>>>>>><>>>>>>:
7
3. Dampened hole effect model
g
h 1:0 2 exp 23hd
� �
cos
h
a
p
� �
8
where d is the distance at which 95% of the hole
effect is dampened out.
The spherical model is the most widely used semi-
variogram model and is characterized by a linear
behavior at the origin. The cubic model (Galli et al.,
1984; Chile`s and Delfiner, 1999, p. 84) displays a
parabolic behavior at the origin and is preferred to
the Gaussian model because it avoids numerical
unstability in kriging system (Wackernagel, 1998, p.
120). Although no direct information is available on
the variability of rainfall over very short distances (the
first semivariogram value corresponds to a lag of
5.9 km), ancillary information, such as the more
detailed semivariogram of a correlated attribute like
altitude (Fig. 6), suggests that one can expect a para-
bolic behavior for the first lags. Note that a very regu-
lar behavior near the origin can be combined with a
nugget effect, the latter reflecting measurement errors
that are superimposed on the underlying continuous
phenomenon. The last type of function is more
complex and used to model hole effect (Deutsch and
Journel, 1998, p. 26). The three models have been
fitted using regression and are such that the weighted
sum of squares (WSS) of differences between experi-
mental g^
hk and model g
hk semivariogram values
is minimum:
WSS
XK
k1
v
hkg^
hk2 g
hk2
9
The weights were taken as N
hk=g
hk2 in order
to give more importance to the first lags and the ones
computed from more data pairs. The cubic model has
been retained because it yields the smallest WSS
value while being more parsimonious (less parameters
to estimate) than the dampened hole effect model. Fig.
4 (bottom graph) shows the rainfall map produced by
ordinary kriging using the cubic semivariogram
model and the 16 closest observations at each grid
node. Like the inverse square distance method, the
rainfall map is quite crude, which stresses the
importance of accounting for more densely
sampled information, such as the digital elevation
model of Fig. 2.
3.2. Accounting for elevation
Consider now the situation where the rainfall data
{z
ua;a 1;…; n} are supplemented by elevation
data available at all estimation grid nodes and denoted
y(u).
A straightforward approach consists of predicting
the rainfall as a function of the co-located elevation,
e.g. using a linear relation such as:
zp
u f
y
u ap0 1 ap1y
u
10
where the two regression coefficients ap0; ap1 are esti-
mated from the set of collocated rainfall and elevation
data {
z
ua; y
ua; a 1;…; n}: For example, the
relation between annual rainfall and elevation was
modeled as z
u 324:1 1 0:922y
u R2 0:62;
leading to the rainfall map shown at the top of Fig.
7. A major shortcoming of this type of regression is
that the rainfall at a particular grid node u is derived
only from the elevation at u, regardless of the records
P. Goovaerts / Journal of Hydrology 228 (2000) 113–129 119
Fig. 6. Experimental semivariogram of elevation computed from the
DEM of Fig. 2.
at the surrounding raingages ua . Such an approach
amounts at assuming that the residual values r
ua
z
ua2 f
y
ua are spatially uncorrelated. Spatial
correlation of the residuals or of the rainfall
observations can be taken into account using
the three types of geostatistical algorithms
described below.
Simple kriging with varying local means (SKlm)
amounts at replacing the known stationary mean in
the
本文档为【Geostatistical approaches for incorporating elevation into the】,请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑,
图片更改请在作品中右键图片并更换,文字修改请直接点击文字进行修改,也可以新增和删除文档中的内容。
该文档来自用户分享,如有侵权行为请发邮件ishare@vip.sina.com联系网站客服,我们会及时删除。
[版权声明] 本站所有资料为用户分享产生,若发现您的权利被侵害,请联系客服邮件isharekefu@iask.cn,我们尽快处理。
本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权,请谨慎使用。
网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传,仅限个人学习分享使用,禁止用于任何广告和商用目的。