Characterising and visualizing spatio-temporal patterns in hourly...


ORIGINAL PAPER

Characterising and visualizing spatio-temporal patterns in hourly precipitation records

Agne Burauskaite-Harju & Anders Grimvall & Christine Achberger & Alexander Walther & Deliang Chen

Received: 4 August 2011 / Accepted: 12 December 2011 / © Springer-Verlag 2012

Abstract We develop new techniques to summarise and visualise spatial patterns of coincidence in weather events such as more or less heavy precipitation at a network of meteorological stations. The cosine similarity measure, which has a simple probabilistic interpretation for vectors of binary data, is generalised to characterise spatial dependencies of events that may reach different stations with a variable time lag. More specifically, we reduce such patterns into three parameters (dominant time lag, maximum cross-similarity, and window-maximum similarity) that can easily be computed for each pair of stations in a network. Furthermore, we visualise such three-parameter summaries by using colour-coded maps of dependencies to a given reference station and distance-decay plots for the entire network. Applications to hourly precipitation data from a network of 93 stations in Sweden illustrate how this method can be used to explore spatial patterns in the temporal synchrony of precipitation events.

1 Introduction

In climate research, considerable attention has long been given to spatio-temporal characteristics of precipitation, temperature, pressure, and other meteorological variables. Knowledge of such features is critical to reveal dynamics of the physical processes responsible for these patterns (e.g. Gong et al. 2007) and also to estimate values for unsampled locations and support assimilation (e.g. Johansson and Chen 2005). Moreover, it can facilitate development of stochastic climate models and downscaling models of global climate information to regional and local scales (e.g. Haberlandt et al. 2008; Zheng and Katz 2008; Baigorria et al. 2007; Brommundt and Bárdossy 2007; Yang et al. 2005; Buishand and Brandsma 2001; Wilks 1998). Here, we focus on the need for new techniques to summarise and visualise spatio-temporal patterns in sub-daily meteorological data, particularly hourly precipitation data from a network of stations.

A common approach to summarizing spatial dependencies in data from a network of stations is to compute the ordinary Pearson’s correlation for each pair of stations (Robeson and Shein 1997; Gunst 1995). Such coefficients can then be plotted against distance between stations to illustrate the distance-decay relationships for precipitation (Osborn and Hulme 1997), temperature (Jones et al. 1997), surface winds (Achberger et al. 2006), and other meteorological variables. Strong correlations in daily, monthly, and yearly data can persist over relatively large distances, whereas patterns in distance-decay rates for sub-daily data are normally not as clear. Aghakouchak et al. (2010) noted that occurrence of large amounts of hourly precipitation can exhibit strong spatial dependence for sites up to 10 km apart, whereas other studies have indicated relatively weak dependencies (Serinaldi 2008; Garcia et al. 2002; Barbaliscia et al. 1992).

The low correlation of contemporaneous sub-daily precipitation data from different meteorological stations can often be explained by the fact that synoptic events such as fronts reach the stations with some time lag. Garcia and co-workers (2002) studied cross-correlations between measurements of hourly precipitation in Spain and demonstrated that the strongest relationship is often found with a time lag that tends to increase with increasing distance. However, the relationship is rarely so straightforward that the mean time lag is a function of distance alone.

A. Burauskaite-Harju (*) · A. Grimvall
Division of Statistics, Department of Computer and Information Science, Linköping University, 58183 Linköping, Sweden
e-mail: [email protected]

C. Achberger · A. Walther · D. Chen
Department of Earth Sciences, University of Gothenburg, Gothenburg, Sweden

Theor Appl Climatol
DOI 10.1007/s00704-011-0574-x

Here, we take the concept of cosine similarity (Tan et al. 2006) as the point of departure for a comprehensive yet simple procedure to describe spatio-temporal patterns in sub-daily weather data. This concept is widely used in data mining to quantify the similarity of two vectors, and we apply it to vectors representing the occurrence of extreme precipitation events. In analogy with Pearson’s cross-correlation, we also introduce the concept of cross-similarity. Furthermore, we present a procedure in which cross-similarities over a study area are summarised and visualised by introducing three representative parameters: the dominant time lag, the maximum cross-similarity, and a parameter we refer to as the window-maximum similarity.

In this study, we test our methods on a set of hourly precipitation data collected in Sweden. The climate of Scandinavia is highly influenced by the proximity of the region to the North Atlantic and the Baltic Sea, which gives rise to maritime conditions. Annual mean precipitation in Sweden ranges from 500 to 800 mm, with larger amounts in the mountainous areas in the northern part of the country (up to 2,000 mm) and in the south-west (1,000–1,200 mm), as well as along the eastern coast where rainfall is enhanced in some areas with rather steep drops (Vedin and Raab 1995). Summer precipitation is often characterised by heavy rainfall and hail events related to augmented convective activity, which gives substantial amounts of short-term precipitation (SMHI 2009). Precipitation is strongly linked to the large-scale atmospheric circulation over Northern Europe and the North Atlantic Region. The Icelandic Low, the Azores High, and the winter high and summer low over Russia govern the tracks and characteristics of the pressure systems passing through the region. Orography and the presence of the coastline further differentiate the regional climate variability, and the North Atlantic Oscillation causes such variability on varying time scales, especially during winter (e.g. Hurrell and Deser 2009; Hurrell et al. 2003; Linderson 2003; Uvo 2003). Precipitation over Sweden is also affected by factors such as a cyclonic/anti-cyclonic structure over the British Isles, which gives rise to a pronounced influence from the Baltic Sea (Busuioc et al. 2001). Hellström (2005) found that extreme precipitation in Sweden occurs predominantly under a cyclonic atmospheric circulation pattern, often in combination with southerly winds. Gustafsson and Rayner (2010) analysed extreme precipitation events in southern Sweden and noted that the Baltic Sea and coastal land areas constitute important source regions in that context.

2 Methodology

Our procedure for exploring spatio-temporal characteristics of hourly precipitation data involves three steps. First, cross-similarities are calculated for all pairs of stations. Thereafter, the computed cross-similarities are summarised by computing dominant time lag, maximum cross-similarity, and window-maximum similarity. Finally, the spatial distribution of these parameters is visualised using distance-decay plots and colour-coded maps illustrating the cross-similarities for all pairs, given a reference station.

2.1 Cosine similarity

The cosine similarity of two vectors x = (x1, x2, …, xn) and y = (y1, y2, …, yn) is defined as

$$\mathrm{cossim}(x, y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2 \, \sum_{i=1}^{n} y_i^2}} \tag{1}$$

Geometrically, this measure can be interpreted as the cosine of the angle between x and y. Furthermore, it can be noted that, for binary vectors x* and y*, the cosine similarity

$$\mathrm{cossim}(x^{*}, y^{*}) = \frac{\sum_{i=1}^{n} x^{*}_i y^{*}_i}{\sqrt{\sum_{i=1}^{n} x^{*}_i \, \sum_{i=1}^{n} y^{*}_i}} = \frac{\frac{1}{n}\sum_{i=1}^{n} x^{*}_i y^{*}_i}{\sqrt{\left(\frac{1}{n}\sum_{i=1}^{n} x^{*}_i\right)\left(\frac{1}{n}\sum_{i=1}^{n} y^{*}_i\right)}} \tag{2}$$

is a simple function of three relative frequencies.

Now, let (x, y) be a sequence of observations of a bivariate random vector (X, Y) and let x* and y* represent exceedances of a threshold u, i.e. set

$$x^{*}_i = \begin{cases} 1, & \text{if } x_i > u \\ 0, & \text{if } x_i \le u \end{cases} \qquad i = 1, 2, \ldots, n$$

$$y^{*}_i = \begin{cases} 1, & \text{if } y_i > u \\ 0, & \text{if } y_i \le u \end{cases} \qquad i = 1, 2, \ldots, n$$

The cosine similarity of x* and y* can then be regarded as an estimate of

$$\frac{P(\{X > u\} \cap \{Y > u\})}{\sqrt{P(X > u)\,P(Y > u)}} = \sqrt{P(X > u \mid Y > u)\,P(Y > u \mid X > u)}$$

i.e. the geometric mean of two conditional probabilities representing the coincidence of two events.
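As a minimal sketch of Eqs. 1 and 2 (not the authors' implementation), the following Python snippet thresholds two short, made-up hourly series and computes the cosine similarity of the resulting binary vectors. The function names `exceedances` and `cossim` and the sample values are illustrative assumptions.

```python
import math

def exceedances(values, u):
    """Binary indicator series: 1 where the value exceeds threshold u, else 0."""
    return [1 if v > u else 0 for v in values]

def cossim(x, y):
    """Cosine similarity (Eq. 1); returns 0.0 if either vector is all zeros (assumed convention)."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0 else 0.0

# Hypothetical hourly precipitation (mm/h) at two stations
x = [0.0, 1.2, 0.0, 3.1, 0.4, 0.0]
y = [0.0, 0.8, 0.5, 2.0, 0.0, 0.0]

u = 0.0  # threshold in mm/h
xs, ys = exceedances(x, u), exceedances(y, u)
similarity = cossim(xs, ys)
print(similarity)
```

For these toy series, each station records three wet hours, two of which coincide, so both conditional relative frequencies equal 2/3 and their geometric mean, the cosine similarity, is 2/3 as well.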


2.2 Cosine cross-similarity

The cosine similarity measure of binary data can be generalised to a cosine cross-similarity function by considering the coincidence of time-lagged events. More specifically, we introduce the function

$$\mathrm{cossim}(x^{*}, y^{*}, d) = \frac{\sum_{i=\max(1+d,\,1)}^{\min(n,\,n+d)} x^{*}_i \, y^{*}_{i-d}}{\sqrt{\sum_{i=\max(1+d,\,1)}^{\min(n,\,n+d)} x^{*}_i \; \sum_{i=\max(1-d,\,1)}^{\min(n,\,n-d)} y^{*}_i}} \tag{3}$$

where d is a time lag that can be positive or negative.
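The lagged measure in Eq. 3 can be sketched in Python as follows. This is an illustrative reading of the formula, assuming 0-based lists, a hypothetical function name `cross_cossim`, and a return value of 0.0 when either overlapping segment contains no events; none of these choices come from the paper.

```python
import math

def cross_cossim(x, y, d):
    """Cosine cross-similarity of binary sequences at lag d, pairing x[i] with y[i - d]."""
    n = len(x)
    if len(y) != n or abs(d) >= n:
        raise ValueError("x and y must have equal length n, and |d| < n")
    # 0-based version of i = max(1 + d, 1), ..., min(n, n + d)
    lo, hi = max(d, 0), min(n, n + d)
    xs = x[lo:hi]          # the part of x that overlaps the shifted y
    ys = y[lo - d:hi - d]  # the matching part of y
    num = sum(a * b for a, b in zip(xs, ys))
    den = math.sqrt(sum(xs) * sum(ys))
    return num / den if den > 0 else 0.0

# Made-up example: events at x occur two time steps after events at y
y = [1, 0, 1, 0, 0, 1, 0, 0]
x = [0, 0, 1, 0, 1, 0, 0, 1]

sim_lag0 = cross_cossim(x, y, 0)
sim_lag2 = cross_cossim(x, y, 2)
print(sim_lag0, sim_lag2)
```

In this toy example only one of the three events coincides at lag 0, whereas all three coincide once y is shifted by two steps.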

2.3 Maximum cross-similarity and dominant lag

The maximum cosine cross-similarity

$$\mathrm{maxsim}(x^{*}, y^{*}) = \max_{d \in [-(n-1),\, n-1]} \mathrm{cossim}(x^{*}, y^{*}, d) \tag{4}$$

of two binary vectors x* and y* is a measure that can be used to describe the strength of the spatial dependence of two types of events, regardless of the time lag at which they occur. In formula 4, we consider the maximum over all possible time lags. However, this maximum can also be computed over a user-defined range covering all time lags of practical interest.

The dominant lag

$$\mathrm{dlag}(x^{*}, y^{*}) = \operatorname*{argmax}_{d \in [-(n-1),\, n-1]} \mathrm{cossim}(x^{*}, y^{*}, d) \tag{5}$$

is the time lag for which the cross-similarity reaches its maximum. If this lag is not uniquely determined, we set the dominant lag to the smallest lag for which the maximum cross-similarity is achieved.
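A possible way to compute the maximum cross-similarity and the dominant lag over a user-defined lag range is sketched below. The names `cross_cossim`, `max_cross_similarity`, and `max_lag` are hypothetical, and ties are resolved by scanning lags in increasing signed order and keeping the first maximum, which is one interpretation of "smallest lag" and should be treated as an assumption.

```python
import math

def cross_cossim(x, y, d):
    """Cosine cross-similarity at lag d, pairing x[i] with y[i - d]; 0.0 if no events overlap."""
    n = len(x)
    lo, hi = max(d, 0), min(n, n + d)
    xs, ys = x[lo:hi], y[lo - d:hi - d]
    den = math.sqrt(sum(xs) * sum(ys))
    return sum(a * b for a, b in zip(xs, ys)) / den if den > 0 else 0.0

def max_cross_similarity(x, y, max_lag):
    """Return (dominant lag, maximum cross-similarity) over lags -max_lag..max_lag."""
    best_lag, best_sim = None, -1.0
    for d in range(-max_lag, max_lag + 1):
        s = cross_cossim(x, y, d)
        if s > best_sim:  # strict comparison keeps the first (smallest signed) lag on ties
            best_lag, best_sim = d, s
    return best_lag, best_sim

# Made-up example: events at x occur two time steps after events at y
y = [1, 0, 1, 0, 0, 1, 0, 0]
x = [0, 0, 1, 0, 1, 0, 0, 1]

dlag, maxsim = max_cross_similarity(x, y, max_lag=3)
print(dlag, maxsim)
```

Restricting the search to a modest `max_lag` keeps the overlapping segments long, which avoids spurious maxima from lags where only a handful of observations overlap.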

2.4 Window-maximum similarity

The concept of window-maximum similarity is introduced to measure the coincidence of events that occur with varying time lags. For a given window length h and a pair of binary vectors x* and y*, we first construct a new pair of binary vectors x** and y** by setting

$$x^{**}_i = \max(x^{*}_i, x^{*}_{i+1}, \ldots, x^{*}_{i+h}), \qquad i = 1, 2, \ldots, n-h$$

$$y^{**}_i = \max(y^{*}_i, y^{*}_{i+1}, \ldots, y^{*}_{i+h}), \qquad i = 1, 2, \ldots, n-h$$

Then we compute the cosine similarity of x** and y**. This window-maximum similarity is particularly suitable for quantifying the coincidence of events in weather data with high temporal resolution, because the synoptic events under consideration often reach a given pair of stations with varying time lag.

When (x, y) is a sequence of observations of a bivariate random vector (X, Y), and x* and y* represent exceedances of a threshold u, the window-maximum similarity of x* and y* can be regarded as an estimate of

$$\frac{P(\{\max(X_i, \ldots, X_{i+h}) > u\} \cap \{\max(Y_i, \ldots, Y_{i+h}) > u\})}{\sqrt{P(\max(X_i, \ldots, X_{i+h}) > u)\,P(\max(Y_i, \ldots, Y_{i+h}) > u)}}$$
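The construction of x** and y** can be illustrated with a short Python sketch. The function names `window_max`, `cossim`, and `window_max_similarity`, as well as the toy data, are assumptions made for illustration only.

```python
import math

def cossim(x, y):
    """Cosine similarity of two binary vectors; 0.0 if either has no events (assumed convention)."""
    den = math.sqrt(sum(x) * sum(y))
    return sum(a * b for a, b in zip(x, y)) / den if den > 0 else 0.0

def window_max(b, h):
    """Moving maximum over windows b[i], ..., b[i + h], giving len(b) - h values."""
    return [max(b[i:i + h + 1]) for i in range(len(b) - h)]

def window_max_similarity(x, y, h):
    """Cosine similarity of the window-maximum series of x and y."""
    return cossim(window_max(x, h), window_max(y, h))

# Made-up binary event series: events never fall in the same hour
x = [1, 0, 0, 0, 0, 0, 1, 0]
y = [0, 0, 1, 0, 0, 0, 0, 0]

plain = cossim(x, y)
windowed = window_max_similarity(x, y, h=2)
print(plain, windowed)
```

Here the plain cosine similarity is zero because no events coincide exactly, while a window of h = 2 registers the event at y that occurs two steps after the first event at x.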

2.5 Cosine similarity and Pearson’s correlation

The cosine similarity measure and Pearson’s sample correlation coefficient

$$r(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{6}$$

for a set of observations of a bivariate random vector (X, Y) are closely related to each other but not identical. More specifically, Pearson’s correlation is equal to the cosine similarity computed for data normalised to a mean of zero. Moreover, standard calculations show that, for binary observations representing exceedances of a threshold u, Pearson’s correlation can be regarded as an estimate of

$$\frac{\sqrt{P(X > u \mid Y > u)\,P(Y > u \mid X > u)} - \sqrt{P(X > u)\,P(Y > u)}}{\sqrt{(1 - P(X > u))(1 - P(Y > u))}} \tag{7}$$

Pearson’s cross-correlation of a sample of a bivariate random vector (X, Y) is defined as

$$r(x, y, d) = \frac{\sum_{i=\max(1+d,\,1)}^{\min(n,\,n+d)} (x_i - \bar{x})(y_{i-d} - \bar{y})}{\sqrt{\sum_{i=\max(1+d,\,1)}^{\min(n,\,n+d)} (x_i - \bar{x})^2 \; \sum_{i=\max(1-d,\,1)}^{\min(n,\,n-d)} (y_i - \bar{y})^2}} \tag{8}$$

For binary data representing exceedances of a threshold u, this measure is closely related to the previously defined cosine cross-similarity. More explicitly, the two measures satisfy an equation of the form

$$\mathrm{cossim}(x^{*}, y^{*}, d) = a + b \, r(x^{*}, y^{*}, d) \tag{9}$$

where a and b are estimates of $\sqrt{P(X > u)\,P(Y > u)}$ and $\sqrt{(1 - P(X > u))(1 - P(Y > u))}$, respectively. This implies that the dominant lag is the same for the cosine cross-similarity and Pearson’s cross-correlation. Furthermore, the cross-correlation approaches the cross-similarity when the threshold u tends to infinity.
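The linear relation in Eq. 9 can be checked numerically at lag d = 0, where it holds exactly if a and b are computed from the sample exceedance frequencies. The sketch below uses made-up binary data and hypothetical helper names (`cossim`, `pearson`); it is a sanity check under those assumptions, not part of the original analysis.

```python
import math

def cossim(x, y):
    """Cosine similarity of two binary vectors."""
    return sum(a * b for a, b in zip(x, y)) / math.sqrt(sum(x) * sum(y))

def pearson(x, y):
    """Pearson's sample correlation coefficient (Eq. 6)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

# Made-up exceedance indicators at two stations
x = [0, 1, 0, 1, 1, 0]
y = [0, 1, 1, 1, 0, 0]

n = len(x)
px, py = sum(x) / n, sum(y) / n       # sample estimates of P(X > u) and P(Y > u)
a = math.sqrt(px * py)                # intercept in Eq. 9
b = math.sqrt((1 - px) * (1 - py))    # slope in Eq. 9

lhs = cossim(x, y)
rhs = a + b * pearson(x, y)
print(lhs, rhs)
```

Because a and b depend only on the marginal exceedance frequencies, the check also shows why the two measures rank station pairs alike when those frequencies are similar across the network.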

3 Swedish hourly precipitation data

Hourly precipitation data from 93 meteorological stations spread throughout Sweden were procured from the Swedish Meteorological and Hydrological Institute (SMHI). The time span of the data was from 1996 to 2008, and we limited our analysis to station series with data covering at least 70% of this period. The coverage was considerably higher than 70% for a majority of the selected stations, and the fraction of missing values was less than 10% for 78 stations. We also noted that the raw data contained a small number of what were most likely unrealistic outliers (>100 mm/h), which we removed. For a detailed description of the selected data and the initial adjustments that were performed, see Jeong et al. (2011) and Wern and German (2009).
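The two screening rules described above (a coverage requirement of at least 70% and removal of values above 100 mm/h) could be applied along the following lines. The data layout (a list with `None` for missing hours), the function name `screen_series`, and the choice to compute coverage after outlier removal are all assumptions; the paper does not specify these details.

```python
def screen_series(series, min_coverage=0.70, max_rate=100.0):
    """Drop implausible values, then keep the series only if enough data remain.

    series: hourly precipitation in mm/h, with None marking missing hours (assumed layout).
    Returns the cleaned series, or None if coverage falls below min_coverage.
    """
    cleaned = [None if (v is not None and v > max_rate) else v for v in series]
    coverage = sum(v is not None for v in cleaned) / len(cleaned)
    return cleaned if coverage >= min_coverage else None

# Made-up station records
good = [0.0, 1.5, None, 250.0, 0.2, 0.0, 0.0, 3.0, 0.0, 0.0]
sparse = [None, None, None, None, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

good_cleaned = screen_series(good)
sparse_cleaned = screen_series(sparse)
print(good_cleaned, sparse_cleaned)
```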

4 Application to the Swedish data

4.1 Cross-similarity curves

Cross-similarity curves were computed for all pairs of the 93 investigated stations. Figure 1 illustrates those curves for two pairs representing, respectively, two stations far apart (430 km) and two close to each other (12 km). In particular, it can be noted that the curves have a single peak and that the cross-similarity decreases with increasing time lag. Visual inspection of cross-similarity curves for a large set of station pairs revealed that almost all pairs exhibited the same characteristics, although the peaks tended to flatten out and become difficult to identify for pairs located farther apart. Moreover, stations close together had cross-similarity peaks at a time lag close to zero, whereas stations far apart exhibited peaks at time lags up to 24 h.

Figure 1 indicates that, regardless of the spatial distance between investigated stations, the cross-similarity normally decreases with the intensity of the precipitation event. This was confirmed by examining a large number of station pairs. However, as shown in Fig. 1, the cross-similarity can also be substantial for high thresholds, provided the stations are sufficiently close to each other.

Fig. 1 Curves of cosine cross-similarity for one pair of stations located 430 km apart (a) and another 12 km apart (b) and four precipitation thresholds (0, 1, 2, and 3 mm/h)

Fig. 2 Window-maximum similarity for the same two pairs of stations and thresholds as shown in Fig. 1

Inspection of the maximum cross-similarity for station pairs belonging to a homogeneous network of stations showed that the spatial pattern of this parameter is similar to that of the maximum Pearson’s cross-correlation. This is a direct consequence of formula 9, and the fact that the coefficients a and b in this formula represent characteristics of the marginal distributions. For a heterogeneous network, in which the marginal distributions of extreme events may be quite different, the patterns of maximum cross-similarity and cross-correlation can differ substantially.

The window-maximum similarity always increases with increasing window length. As shown in Fig. 2, there is also a tendency for this increase to slow down after a certain time lag. This was expected, considering that the spatial dependence will not increase as much when the window already contains the peak of the cross-similarity curve. In our dataset, the dominant time lag normally varied from a few hours to about 15 h and was invariably less than 24 h. This illustrates the general speed at which rain belts move in Sweden.

4.2 Summaries of cross-similarities to a reference station

To be able to summarise cross-similarity data for many pairs of stations, we reduced the cross-similarity curves to the three parameters introduced in section 2: the dominant time lag, the maximum cross-similarity, and the window-maximum similarity. Table 1 shows these parameters for the two pairs of stations illustrated in Fig. 1. Information about dominant time lag, maximum cross-similarity, and window-maximum similarity for all pairs with a fixed reference station is preferably presented as colour-coded maps.

Figures 3 and 4 show maps summarizing the spatial dependence of precipitation events between a reference station (one in southern Sweden in Fig. 3 and one in northern Sweden in Fig. 4) and all other stations. In both figures, the dominant time lag is generally negative for stations located south-west of the reference station and positive for stations north-east of it. Moreover, the dominant time lag gradually changes from about −20 to 20 h, thus illustrating the speed and dominant direction of movement of weather systems in the country. The maps of maximum cross-similarities show the spatial dependence of precipitation events at the dominant time lag. For both reference stations, there is decreasing dependence with increasing separation distance. For hourly precipitation events exceeding 2 mm, the dependence disappears after a short distance, indicating that intense precipitation events occur on a rather local scale.

Table 1 Summary of the parameters of the cross-similarity curves computed for the two pairs of stations in Fig. 1

Threshold   Dominant time lag (h)   Maximum cross-similarity   24-h window-maximum similarity
(mm/h)      430 km      12 km       430 km      12 km          430 km      12 km
0           −7          0           0.25        0.73           0.66        0.89
1           −6          0           0.10        0.59           0.43        0.74
2           −10         0           0.05        0.48           0.26        0.63
3           −10         0           0.04        0.41           0.16        0.57

Fig. 3 Summary maps representing dominant time lag, maximum cross-similarity, and 24-h window-maximum similarity for hourly precipitation data from all pairs of stations in relation to a reference station in Southern Sweden (black circles). Thresholds of precipitation events were set to 0 and 2 mm/h, respectively

Repeated visual inspection of a large collection of summary maps of the type shown in Figs. 3 and 4 reveals similar patterns. This means that it is possible to retrieve the main characteristics of spatial dependencies by exploring a small number of summary maps.

4.3 Summary of cross-similarities for a network of stations

In order to explore the spatial dependence of precipitation events for all pairs of stations in a given network, we plotted the three cross-similarity parameters introduced in section 2 against distance. As expected, the absolute value of the dominant time lag increased with the distance to the reference station. However, there were also other spatio-temporal patterns of interest. Figure 5 shows that the markers of the investigated pairs of stations are concentrated along a straight line with a slope of 2.5 h/100 km for precipitation amounts exceeding 2 mm, and a slope of 2 h/100 km when considering all precipitation events. In addition, very few markers are above the lines representing slopes of 3.5 and 2.5 h/100 km, respectively. The higher slope for larger amounts of precipitation indicates that such events are associated with weather systems such as fronts moving relatively slowly across the study area.

Typical relationships between window-maximum similarities and the distance between stations are illustrated in Fig. 6. It seems that these measures of spatial dependence normally decrease with distance but remain clearly above zero even for stations that are far apart. Furthermore, it can be noted that the subsets of markers representing different window lengths are generally well separated. Closer examination of deviating pairs of stations reveals that they represent highland areas, where the topographic conditions can distort overall spatial dependencies.

Fig. 4 Summary maps representing dominant time lag, maximum cross-similarity, and 24-h window-maximum similarity for hourly precipitation data from all pairs of stations in relation to a reference station in Northern Sweden (black circles). Thresholds of precipitation events were set to 0 and 2 mm/h, respectively

Fig. 5 Relationship between dominant time lag and separation distance for all investigated pairs of stations and two precipitation thresholds 0 (a) and 2 mm/h (b)

The maps in Figs. 3 and 4 indicate that the window-maximum similarity is significantly greater than the maximum cross-similarity. Figure 6 shows more generally how this similarity measure varies with distance between stations, window length, and intensity of the precipitation event under consideration. The difference in distance-decay relationships between the two precipitation amounts can be explained by the fact that heavy precipitation events typically occur in relatively small areas.

4.4 Seasonal patterns of cross-similarity

The similarity measures introduced in section 2 can be used to reveal seasonal patterns of spatio-temporal dependencies. We investigated the presence of those patterns in summer (April to September, AMJJAS) and winter (October to March, ONDJFM) separately.
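The seasonal split described above amounts to partitioning the hourly records by calendar month. A minimal sketch, assuming timestamped records and a hypothetical helper `split_by_season`, might look like this:

```python
from datetime import datetime, timedelta

SUMMER_MONTHS = {4, 5, 6, 7, 8, 9}  # April to September (AMJJAS); the rest is winter (ONDJFM)

def split_by_season(timestamps, values):
    """Partition (timestamp, value) records into summer and winter lists."""
    summer, winter = [], []
    for ts, v in zip(timestamps, values):
        (summer if ts.month in SUMMER_MONTHS else winter).append((ts, v))
    return summer, winter

# Made-up hourly records straddling the change from September to October
start = datetime(2000, 9, 30, 22)
timestamps = [start + timedelta(hours=k) for k in range(4)]
values = [0.0, 1.2, 0.4, 0.0]

summer, winter = split_by_season(timestamps, values)
print(len(summer), len(winter))
```

Note that each seasonal subset consists of disjoint blocks of consecutive hours, so lagged comparisons near season boundaries would need separate handling; how the authors treated this is not stated in the text.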

When the threshold was set to zero, we noted a clear summer–winter difference in the window-maximum similarities and maximum cross-similarities. This is illustrated in Figs. 7 and 8, which show similarities between a reference station in southern Sweden and all other stations. Furthermore, we observed that the dominant time lags increased more rapidly with distance in summer than in winter. Closer examination of the dominant time lags indicates that movement of weather systems from the south-west towards the north-east is the main pattern in both seasons. However, there is a noticeable difference for a group of stations in western Sweden. The results obtained for a precipitation threshold of 2 mm/h are more uncertain due to the relatively small number of those events. In particular, our findings show that the dominant time lag can vary significantly for stations located far apart.

Fig. 6 Distance-decay relationships for window-maximum similarities computed for four window lengths (1, 3, 12, and 24 h) and two precipitation thresholds 0 (a) and 2 mm/h (b)

Fig. 7 Summary maps of dominant time lag, maximum cross-similarity, and window-maximum similarity for summer for two precipitation thresholds (0 and 2 mm/h). The reference station in southern Sweden (black circles) is the same in all maps

Fig. 8 Summary maps of dominant time lag, maximum cross-similarity, and window-maximum similarity for winter for two precipitation thresholds (0 and 2 mm/h). The reference station in southern Sweden (black circles) is the same in all maps

Fig. 9 Relationship between dominant time lag and station separation distance and two precipitation thresholds 0 mm/h (left) and 2 mm/h (right) in summer and winter

The plots in Fig. 9 show that the dominant lag increases more rapidly with distance between stations in summer than in winter. This indicates that weather systems move more slowly and/or take irregular pathways in summer. Furthermore, it is worth noting that, regardless of the threshold, precipitation events exhibit a wider range of dominant time lags in winter.

Figure 10 shows that the distance-decay relationships are similar in summer and winter. However, the window-maximum similarity is larger in summer for distances exceeding 500 km. In addition, it is apparent that the relationships are not as clear in winter, partly due to the small number of heavy precipitation events during this season.

5 Discussion and conclusions

Hourly precipitation records for a network of meteorological stations can exhibit very complex spatio-temporal patterns. Here, we have presented a procedure that enables efficient exploration of the coincidence of precipitation events of varying intensity in a network of stations. In particular, we have shown how spatial dependencies in temporally synchronised precipitation events can be effectively summarised and visualised.

Cosine similarity is a measure of dependence that is widely used in data mining, especially text mining (Tan et al. 2006), but is applied to a much lesser extent in geosciences, where correlograms and variograms are the key concepts for spatio-temporal modelling (Diggle and Ribeiro 2007; Chen et al. 2007). We chose cosine similarity as the point of departure in our study for two reasons: (1) we were interested in binary data representing the occurrence of specific weather events; (2) in contrast to Pearson’s correlation, cosine similarity has a simple probabilistic interpretation when the underlying data are binary time series. Accordingly, the generalization to cross-similarities of events recorded at two stations followed the same line as the generalization of correlation to cross-correlation. By further generalizing the cross-similarity to window-maximum similarity, we were able to handle the fact that a certain type of event can appear at two stations with a variable time lag.

Fig. 10 Distance-decay relationships for window-maximum similarities computed for four window lengths (1, 3, 12, and 24 h) and two precipitation thresholds, 0 mm/h (left) and 2 mm/h (right), in summer and winter

From a computational point of view, it is easy to produce a large number of cross-similarity functions. However, interpretation of the results that are obtained can be greatly facilitated by reducing the spatio-temporal patterns to a small number of characteristics for each station pair. Therefore, we introduced a three-parameter summary comprising dominant time lag, maximum cross-similarity, and window-maximum similarity. Furthermore, we proposed colour-coded maps and distance-decay plots to visualise the information in these summaries. The colour-coded maps are particularly useful to investigate anisotropy and how similarities vary in relation to the location of the reference station. The distance-decay plots offer the benefit of providing a comprehensive overview of spatial dependencies for the entire network.
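The reduction to a per-pair summary can be sketched as follows. This is a hedged illustration rather than the procedure used in the paper: `pair_summary` and its helpers are hypothetical names, the dominant lag is taken as the lag maximising the cross-similarity over a chosen range, and the window rule used for the third parameter (an event counts as matched if the other station has an event within plus or minus `window` time steps) is an assumption that stands in for the formal definition given earlier in the paper.

```python
import numpy as np

def _cos(x, y):
    denom = np.sqrt(x.sum() * y.sum())
    return float(x @ y / denom) if denom > 0 else 0.0

def _shifted(x, y, lag):
    """Overlapping parts of x[t] and y[t + lag]."""
    if lag > 0:
        return x[:-lag], y[lag:]
    if lag < 0:
        return x[-lag:], y[:lag]
    return x, y

def _dilate(z, w):
    """1 wherever z has an event within +/- w steps (requires 2*w+1 <= len(z))."""
    return (np.convolve(z, np.ones(2 * w + 1), mode="same") > 0).astype(float)

def pair_summary(x, y, max_lag, window):
    """Return (dominant lag, maximum cross-similarity, window-based similarity)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    sims = np.array([_cos(*_shifted(x, y, int(k))) for k in lags])
    best = int(np.argmax(sims))  # first maximum if there are ties
    # ASSUMED window rule: geometric mean of the fractions of events at each
    # station that are matched by an event at the other station in the window.
    nx, ny = x.sum(), y.sum()
    if nx > 0 and ny > 0:
        frac_x = x @ _dilate(y, window) / nx
        frac_y = y @ _dilate(x, window) / ny
        wms = float(np.sqrt(frac_x * frac_y))
    else:
        wms = 0.0
    return int(lags[best]), float(sims[best]), wms

# Toy example: events at the second station lag the first by one hour.
x = [0, 1, 1, 0, 0, 1, 0, 0]
y = [0, 0, 1, 1, 0, 0, 1, 0]
dominant_lag, max_cross_sim, window_sim = pair_summary(x, y, max_lag=3, window=1)
```

With a window of zero, the assumed window rule reduces to the ordinary cosine similarity, which is the property one would expect of any generalisation of this kind.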

By applying our technique to hourly precipitation records acquired from a network of stations in Sweden, we found that this procedure can reveal important characteristics of the spatial and temporal coincidence of precipitation events. The colour-coded maps of dominant time lags in Figs. 3 and 4 show how precipitation-generating weather systems normally move across Sweden, driven primarily by the prevailing south-westerly winds. This supports the results of Hellström (2005). In addition, the distance-decay plots in Fig. 6 reveal that the spatial dependence of heavy precipitation events was significant up to a distance of about 200 km between stations, although it was considerably weaker than that of precipitation events in general. Our examination of seasonal patterns in the dependence structure demonstrated that the methods we have proposed can enable a rapid comparison of spatio-temporal features of different subsets of binary data.

It might be of interest to construct confidence intervals for cosine cross-similarity and summarizing parameters, and also to define a decorrelation distance between stations. In general, it would be possible to estimate such confidence intervals by using a bootstrap method. However, there is no straightforward way to treat hourly precipitation data so as to ensure independent identically distributed values, because, in the Swedish climate, such data show some regional differences and also exhibit distinct diurnal and seasonal cycles, as well as autocorrelation.
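One possible way to approach such intervals is a moving-block bootstrap, in which contiguous blocks of hours are resampled jointly for both stations so that short-range autocorrelation is partly preserved. The sketch below is an assumption-laden illustration and not a method used in this study: the function name, the block length of 24 h, and the percentile interval are all choices made here, and the scheme does not address the seasonal cycles or regional differences mentioned above.

```python
import numpy as np

def cosine_similarity(x, y):
    denom = np.sqrt(x.sum() * y.sum())
    return float(x @ y / denom) if denom > 0 else 0.0

def block_bootstrap_ci(x, y, block_len=24, n_boot=500, alpha=0.05, seed=0):
    """Percentile interval for the cosine similarity of two binary series.

    Assumptions: len(x) == len(y) >= block_len; blocks are drawn with
    replacement and the same time indices are used for both stations.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    rng = np.random.default_rng(seed)
    n_blocks = int(np.ceil(n / block_len))
    offsets = np.arange(block_len)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        # Concatenate the blocks and trim to the original series length.
        idx = (starts[:, None] + offsets[None, :]).ravel()[:n]
        stats[b] = cosine_similarity(x[idx], y[idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Synthetic, serially independent data, used only to show the call.
rng = np.random.default_rng(1)
x = (rng.random(2000) < 0.1).astype(float)
y = np.where(rng.random(2000) < 0.8, x, 0.0)  # y keeps roughly 80% of x's events
lo, hi = block_bootstrap_ci(x, y)
```

The block length controls how much temporal dependence is retained within each resampled segment; choosing it for real hourly precipitation data would require separate justification.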

This study was devoted solely to observational precipitation data. However, the methods can also be used to support development, improvement, and validation of high-resolution precipitation models (Haberlandt et al. 2008), as well as regional climate models such as the RCA3 run by the Rossby Centre at the SMHI. In its simplest form, the modelling support may be confined to visual comparison of three-parameter summaries of observational data and model outputs. However, if an objective function is used to optimise inter-site dependence in model outputs (Haberlandt et al. 2008), parameters like window-maximum similarity can be incorporated into that function.

More generally, cosine similarity and cosine cross-similarity of precipitation events with a 0 mm/h threshold measure inter-site dependence of precipitation occurrence and might represent an alternative to the correlation of the Gaussian process often used in simulation of precipitation occurrence in multi-site rainfall generators. Such correlation matrices need adjustments to represent observed dependence in rainfall fields accurately (Wilks 1998; Serinaldi 2009; Brissette et al. 2007). Further study of the linear relationship between cosine similarity and Pearson’s correlation (Eq. 9) might help to define a simple form of these adjustments. Altogether, there is a further need to analyse the dependence measures provided here in the context of multi-site precipitation generation.
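For binary series with fixed event frequencies, Pearson's correlation is a linear function of the cosine similarity, which follows directly from the definitions of the two measures. The sketch below checks this numerically; `pearson_from_cosine` is a hypothetical helper name, and whether the expression matches Eq. 9 in form and notation has not been verified here.

```python
import numpy as np

def pearson_from_cosine(c, p_x, p_y):
    """Pearson correlation of two 0/1 series from their cosine similarity c
    and event frequencies p_x, p_y (both strictly between 0 and 1).

    For fixed p_x and p_y this is linear in c.
    """
    joint = c * np.sqrt(p_x * p_y)  # relative frequency of coincident events
    return (joint - p_x * p_y) / np.sqrt(p_x * (1 - p_x) * p_y * (1 - p_y))

# Toy binary series for two stations.
x = np.array([0, 1, 1, 0, 0, 1, 0, 0, 1, 0], dtype=float)
y = np.array([0, 1, 0, 0, 1, 1, 0, 0, 0, 0], dtype=float)

c = float(x @ y / np.sqrt(x.sum() * y.sum()))
r_direct = float(np.corrcoef(x, y)[0, 1])
r_from_c = float(pearson_from_cosine(c, x.mean(), y.mean()))
```

The two correlation values agree up to floating-point error, so any adjustment expressed in terms of correlation can in principle be re-expressed in terms of cosine similarity and the marginal event frequencies.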

Acknowledgements The authors are very grateful to the Swedish Meteorological and Hydrological Institute (SMHI) for providing the precipitation data, to Colin Jones at the Rossby Centre for valuable comments and discussions, and to the Swedish Research Council (VR), the Gothenburg Atmospheric Science Centre (GAC), and FORMAS (grant #2007-1048-8700∗51) for financial support to Deliang Chen and Alexander Walther.

References

Achberger C, Chen D, Alexandersson H (2006) The surface winds of Sweden during 1999–2000. Int J Climatol 26:159–178

Aghakouchak A, Ciach G, Habib E (2010) Estimation of tail dependence coefficient in rainfall accumulation fields. Adv Water Resour 33(9):1142–1149. doi:10.1016/j.advwatres.2010.07.003

Baigorria GA, Jones JW, O'Brien JJ (2007) Understanding rainfall spatial variability in southeast USA at different timescales. Int J Climatol 27(6):749–760. doi:10.1002/joc.1435

Barbaliscia F, Ravaioli G, Paraboni A (1992) Characteristics of the spatial statistical dependence of rainfall rate over large areas. IEEE Trans Antenn Propag 40(1):8–12

Brissette FP, Khalili M, Leconte R (2007) Efficient stochastic generation of multisite synthetic precipitation data. J Hydrol 345:121–133

Brommundt J, Bárdossy A (2007) Spatial correlation of radar and gauge precipitation data in high temporal resolution. Adv Geosci 10:103–109. doi:10.5194/adgeo-10-103-2007

Buishand TA, Brandsma T (2001) Multisite simulation of daily precipitation and temperature in the Rhine basin by nearest-neighbor resampling. Water Resour Res 37(11):2761–2776

Busuioc A, Chen D, Hellström C (2001) Temporal and spatial variability of precipitation in Sweden and its link with the large scale atmospheric circulation. Tellus 53A(3):348–367

Chen D, Gong L, Xu C, Halldin S (2007) A high-resolution, gridded dataset for monthly temperature normals (1971–2000) in Sweden. Geografiska Annaler 89A(4):249–261

Diggle P, Ribeiro PJ (2007) Model-based geostatistics. Springer, New York, 228 p

Garcia P, Zambudio N, Benarroch A (2002) Joint Rainfall Rate statistics for Pairs of Sites in Spanish Regions. COST action 280 “Propagation Impairment Mitigation for Millimetre Wave Radio Systems”, 1st International Workshop

Gong DY, Ho CH, Chen D, Qian Y, Choi YS, Kim J (2007) Weekly cycle of aerosol-meteorology interaction over China. J Geophys Res 112(D22202). doi:10.1029/2007JD008888

Gunst RF (1995) Estimating spatial correlations from spatial–temporal meteorological data. J Clim 8(10):2454–2470

Gustafsson M, Rayner D, Chen D (2010) Extreme rainfall events in southern Sweden: where does the moisture come from? Tellus 62A:605–616

Haberlandt U, Ebner von Eschenbach AD, Buchwald I (2008) A space-time hybrid hourly rainfall model for derived flood frequency analysis. Hydrol Earth Syst Sci 12(6):1353–1367. doi:10.5194/hess-12-1353-2008

Hellström C (2005) Atmospheric conditions during extreme and non-extreme precipitation events in Sweden. Int J Climatol 25:631–648

Hurrell JW, Deser C (2009) North Atlantic climate variability: the role of the North Atlantic Oscillation. J Mar Syst 78:28–41

Hurrell JW, Kushnir Y, Ottersen G, Visbeck M (2003) An overview of the North Atlantic oscillation. In: The North Atlantic oscillation: climate significance and environmental impact. Geophysical Monograph Series 134:1–35

Jeong J-H, Walther A, Nikulin G, Chen D, Jones C (2011) Diurnal cycle of precipitation amount and frequency in Sweden: observation versus model simulation. Tellus A. doi:10.1111/j.1600-0870.2011.00517.x

Johansson B, Chen D (2005) Estimation of areal precipitation for runoff modelling using wind data: a case study in Sweden. Clim Res 29:53–61

Jones PD, Osborn TJ, Briffa KR (1997) Estimating sampling errors in large-scale temperature averages. J Clim 10(10):2548–2568

Linderson ML (2003) Spatial distribution of meso-scale precipitation in Scandinavia, Southern Sweden. Geografiska Annaler, Series A 85(2):183–196

Osborn TJ, Hulme M (1997) Development of a relationship between station and grid-box rainday frequencies for climate model evaluation. J Clim 10(8):1885–1908

Robeson SM, Shein KA (1997) Spatial coherence and decay of wind speed and power in the north-central United States. Phys Geogr 18(6):479–495

Serinaldi F (2008) Analysis of inter-gauge dependence by Kendall's tau(K), upper tail dependence coefficient, and 2-copulas with application to rainfall fields. Stoch Environ Res Risk Assess 22(6):671–688. doi:10.1007/s00477-007-0176-4

Serinaldi F (2009) A multisite daily rainfall generator driven by bivariate copula-based mixed distributions. J Geophys Res 114. doi:10.1029/2008JD011258

SMHI (2009) Sveriges klimat (The climate of Sweden). In Swedish. Swedish Meteorological and Hydrological Institute. Available at http://www.smhi.se/kunskapsbanken/klimat/sveriges-klimat-1.6867. Accessed 19 September 2011

Tan PN, Steinbach M, Kumar V (2006) Introduction to data mining. Addison-Wesley, Reading, 769 p

Uvo CB (2003) Analysis and regionalization of northern European winter precipitation based on its relationship with the North Atlantic Oscillation. Int J Climatol 23:1185–1194

Vedin H, Raab R (1995) Climate, Lakes and Rivers. National Atlas of Sweden. SNA Publishing, Stockholm, p 175

Wern L, German J (2009) Korttidsnederbörd i Sverige 1995–2008. Meteorologi, 139/2009. SMHI

Wilks DS (1998) Multisite generalization of a daily stochastic precipitation generation model. J Hydrol 210(1–4):178–191

Yang C, Chandler RE, Isham VS, Wheater HS (2005) Spatial-temporal rainfall simulation using generalized linear models. Water Resour Res 41(11):W11415. doi:10.1029/2004wr003739

Zheng XG, Katz RW (2008) Simulation of spatial dependence in daily rainfall using multisite generators. Water Resour Res 44(9):W09403. doi:10.1029/2007wr006399
