Estimating spatially varying event rates with a change ...

11
Estimating spatially varying event rates with a change point using Bayesian statistics: Application to induced seismicity Abhineet Gupta , Jack W. Baker Civil & Environmental Engineering Department, Stanford University, Stanford, CA 94305, USA article info Article history: Received 9 March 2016 Received in revised form 9 November 2016 Accepted 16 November 2016 Available online 2 December 2016 Keywords: Change point method Bayesian inference Induced seismicity Spatio-temporal process abstract We describe a model to estimate event rates of a non-homogeneous spatio-temporal Poisson process. A Bayesian change point model is described to detect changes in temporal rates. The model is used to esti- mate whether a change in event rates occurred for a process at a given location, the time of change, and the event rates before and after the change. To estimate spatially varying rates, the space is divided into a grid and event rates are estimated using the change point model at each grid point. The spatial smoothing parameter for rate estimation is optimized using a likelihood comparison approach. An example is pro- vided for earthquake occurrence in Oklahoma, where induced seismicity has caused a change in the fre- quency of earthquakes in some parts of the state. Seismicity rates estimated using this model are critical components for hazard assessment, which is used to estimate seismic risk to structures. Additionally, the time of change in seismicity can be used as a decision support tool by operators or regulators of activities that affect seismicity. Ó 2016 Elsevier Ltd. All rights reserved. 1. Introduction In this paper, we estimate the rates of a non-homogeneous spatio-temporal Poisson process. The rates vary spatially with the possibility of an independent temporal change at any point in space. We use a Bayesian estimation approach and describe a change point model to detect temporal changes. We describe a likelihood comparison methodology to estimate spatially-varying event rates using the change point model. The results from the model are regions of estimated change, times of change, and spa- tially varying event rates. The model is demonstrated through an application to induced seismicity in Oklahoma. Similar approaches for change detection have been used previ- ously, for example, a Bayesian model was developed for Poisson processes to assess changes in intervals between coal-mining dis- asters [1]. A model was proposed to detect early changes in seis- micity rates based on earthquake declustering and hypothesis testing [2]. While there is some precedence, the problem described in this paper is different than the previous ones because the event rates vary spatially in addition to the possibility of a temporal change. Estimation of these spatially varying rates requires an appropriate rate smoothing procedure, which is also described here. The motivation for this paper is the significant increase in seis- micity that has been recently observed in the Central and Eastern US (CEUS) [3]. For example in 2014 and 2015, more earthquakes were observed in Oklahoma than in California. There is a possibility that this increased seismicity is a result of underground wastewa- ter injection [e.g., 3–5]. Seismicity generated as a result of human activities is referred to as induced or triggered seismicity. Fig. 1 shows the cumulative number of earthquakes with magnitude P3 since 1974 for four quadrants of Oklahoma. There is a signifi- cant increase in seismicity rate starting around 2008, though the date and magnitude of rate increase varies among the different regions. Hence, the times of change and the seismicity rates need to be estimated individually for this spatio-temporal process. There is a need to understand and manage the induced seismicity hazard and risk [6,7]. The increased seismicity due to anthropogenic processes affects the safety of buildings and infras- tructure, especially since seismic loading has historically not been the predominant design force in most CEUS regions. This makes the seismicity rate a critical component for hazard assessment [8]. The work in this paper will aid in effective risk assessment through better future prediction of earthquakes in a local region using the estimated spatially-varying seismicity rates. These rates would aid in development of hazard maps, which are commonly used to estimate the seismic loading during the structural design process. http://dx.doi.org/10.1016/j.strusafe.2016.11.002 0167-4730/Ó 2016 Elsevier Ltd. All rights reserved. Corresponding author. E-mail addresses: [email protected] (A. Gupta), [email protected] (J.W. Baker). Structural Safety 65 (2017) 1–11 Contents lists available at ScienceDirect Structural Safety journal homepage: www.elsevier.com/locate/strusafe

Transcript of Estimating spatially varying event rates with a change ...

Page 1: Estimating spatially varying event rates with a change ...

Structural Safety 65 (2017) 1–11

Contents lists available at ScienceDirect

Structural Safety

journal homepage: www.elsevier .com/locate /s t rusafe

Estimating spatially varying event rates with a change point usingBayesian statistics: Application to induced seismicity

http://dx.doi.org/10.1016/j.strusafe.2016.11.0020167-4730/� 2016 Elsevier Ltd. All rights reserved.

⇑ Corresponding author.E-mail addresses: [email protected] (A. Gupta), [email protected]

(J.W. Baker).

Abhineet Gupta ⇑, Jack W. BakerCivil & Environmental Engineering Department, Stanford University, Stanford, CA 94305, USA

a r t i c l e i n f o

Article history:Received 9 March 2016Received in revised form 9 November 2016Accepted 16 November 2016Available online 2 December 2016

Keywords:Change point methodBayesian inferenceInduced seismicitySpatio-temporal process

a b s t r a c t

We describe a model to estimate event rates of a non-homogeneous spatio-temporal Poisson process. ABayesian change point model is described to detect changes in temporal rates. The model is used to esti-mate whether a change in event rates occurred for a process at a given location, the time of change, andthe event rates before and after the change. To estimate spatially varying rates, the space is divided into agrid and event rates are estimated using the change point model at each grid point. The spatial smoothingparameter for rate estimation is optimized using a likelihood comparison approach. An example is pro-vided for earthquake occurrence in Oklahoma, where induced seismicity has caused a change in the fre-quency of earthquakes in some parts of the state. Seismicity rates estimated using this model are criticalcomponents for hazard assessment, which is used to estimate seismic risk to structures. Additionally, thetime of change in seismicity can be used as a decision support tool by operators or regulators of activitiesthat affect seismicity.

� 2016 Elsevier Ltd. All rights reserved.

1. Introduction

In this paper, we estimate the rates of a non-homogeneousspatio-temporal Poisson process. The rates vary spatially with thepossibility of an independent temporal change at any point inspace. We use a Bayesian estimation approach and describe achange point model to detect temporal changes. We describe alikelihood comparison methodology to estimate spatially-varyingevent rates using the change point model. The results from themodel are regions of estimated change, times of change, and spa-tially varying event rates. The model is demonstrated through anapplication to induced seismicity in Oklahoma.

Similar approaches for change detection have been used previ-ously, for example, a Bayesian model was developed for Poissonprocesses to assess changes in intervals between coal-mining dis-asters [1]. A model was proposed to detect early changes in seis-micity rates based on earthquake declustering and hypothesistesting [2]. While there is some precedence, the problem describedin this paper is different than the previous ones because the eventrates vary spatially in addition to the possibility of a temporalchange. Estimation of these spatially varying rates requires an

appropriate rate smoothing procedure, which is also describedhere.

The motivation for this paper is the significant increase in seis-micity that has been recently observed in the Central and EasternUS (CEUS) [3]. For example in 2014 and 2015, more earthquakeswere observed in Oklahoma than in California. There is a possibilitythat this increased seismicity is a result of underground wastewa-ter injection [e.g., 3–5]. Seismicity generated as a result of humanactivities is referred to as induced or triggered seismicity. Fig. 1shows the cumulative number of earthquakes with magnitudeP3 since 1974 for four quadrants of Oklahoma. There is a signifi-cant increase in seismicity rate starting around 2008, though thedate and magnitude of rate increase varies among the differentregions. Hence, the times of change and the seismicity rates needto be estimated individually for this spatio-temporal process.

There is a need to understand and manage the inducedseismicity hazard and risk [6,7]. The increased seismicity due toanthropogenic processes affects the safety of buildings and infras-tructure, especially since seismic loading has historically not beenthe predominant design force in most CEUS regions. This makes theseismicity rate a critical component for hazard assessment [8]. Thework in this paper will aid in effective risk assessment throughbetter future prediction of earthquakes in a local region using theestimated spatially-varying seismicity rates. These rates wouldaid in development of hazard maps, which are commonly used toestimate the seismic loading during the structural design process.

Page 2: Estimating spatially varying event rates with a change ...

1975 1980 1985 1990 1995 2000 2005 2010 20150

50

100

150

200

1500

1550

1600

−100

−100°

−98

−98°

−96

−96

3434

3636

−100° −98°

4°4

−100 −98

66°6

−96°

34°

−96°6

36°

NW

Date

Num

ber o

f ear

thqu

akes

NE

SE

SWSW SE

NENW

Fig. 1. Cumulative number of earthquakes in four quadrants of Oklahoma with magnitudeP3 from 1974 through Dec 31, 2015. The earthquakes post 2008 are shown in pinkon the map, and the size of the circles is proportional to the earthquake magnitude. We have omitted the western panhandle of Oklahoma in this and all subsequent maps,since no seismicity increase has been observed in this region, and to draw focus to the remainder of the state.

2 A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11

Additionally, identifying changes in seismicity rates can be used asa decision support tool by stakeholders and regulators to monitorand manage the seismic impacts of human activities [2].

The structure of the paper is divided into the description of themodel and its application on induced seismicity. In Section 2, wedescribe a Bayesian change point model that is used to identifychanges in event rates, and to estimate the event rates beforeand after the change. In Section 3, we present a methodology toestimate event rates for a spatio-temporal non-homogeneous Pois-son process. In Section 4, we apply this methodology to estimatespatially-varying earthquake rates in Oklahoma. In Section 5, weaddress some model limitations with examples from the applica-tion in Oklahoma.

2. Bayesian model for change point detection

In this section, we describe a Bayesian change point model todetect changes in event rates for a non-homogeneous Poisson pro-cess with one change point. We also describe the algorithmicimplementation of the model.

2.1. Model

A Bayesian change point model to detect a change in event ratesis described by [1,9]. This model uses time between events todetect a change in rates. Given a dataset of inter-event times, theBayes factor [10] is calculated to indicate whether a change inevent rates occurred. The Bayes factor is defined here as the ratioof the likelihood of a model with no change to the likelihood of achange point model, given the observed data.

B01ðtÞ ¼ LðH0jtÞLðH1jtÞ ð1Þ

where B01ðtÞ is the Bayes factor, t is a vector of inter-event times,and H0 and H1 represent the models with no change and a change,respectively. LðHjtÞ defines the likelihood of model H given someobserved data t. The two models, H0 and H1, are described belowand the final formulation of the equation to calculate the Bayes fac-tor is given later in Eq. (21).

Values smaller than one for the Bayes factor indicate that themodel with change is favored over the model with no change.The threshold value of the Bayes factor that indicates strong pref-erence for one or the other model can be selected based on therequired degree of confidence, but typically values less than 0.01or larger than 100 are used to favor one or the other model. If achange is detected in the data, the time of change and event ratesbefore and after the change are subsequently calculated.

For a sequence of events in a non-homogeneous Poisson processwith a single change, the unknown variables of interest are thetime of change s, the event rate before the change k1, and the eventrate after the change k2.

kðsÞ ¼ k1; 0 6 s 6 sk2; s < s 6 T

�ð2Þ

where the observation period for events is defined as ½0; T�. Assumethat the zeroth event in the event sequence occurs at time 0 and thenth event occurs at time T. The inter-event times are defined as

t ¼ t1; t2; . . . ; tnf g s:t:Xi

ti ¼ T ð3Þ

where ti denotes the time between occurrences of the i� 1th andthe ith events.

Since the events follow a Poisson distribution with differentrates before and after the change, the inter-event times are expo-nentially distributed and can be expressed as

f XkðsÞðxÞ ¼ kðsÞe�kðsÞ x ð4Þ

where f XðxÞ denotes a probability distribution function of X; kðsÞ isthe parameter for the distribution (the event rate), and X is the ran-dom variable (the inter-event time).

For the Bayesian framework, conjugate priors are defined for kjas gamma distributions with parameters kj and hj [11]. Then theprior probability distribution of the rates pðkjÞ is written as

pðkjÞ / kkj�1j e�kj=hj ð5Þ

where / is the proportionality symbol.

Page 3: Estimating spatially varying event rates with a change ...

A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11 3

The time of change s is assumed to be equally likely at any timeduring the observation period. Hence, its prior pðsÞ is assumed tobe uniformly distributed.

pðsÞ ¼ 1T; 0 6 s 6 T ð6Þ

The likelihood function L for the unknown parametersfs; k1; k2g given the inter-event times t is written as the productof the probability distributions for events following the Poissondistribution, and occurring before and after time s.

Lðs; k1; k2jtÞ ¼ kNðsÞ1 e�k1 skNðTÞ�NðsÞ2 e�k2 ðT�sÞ ð7Þ

where NðtÞ represents the number of events between ½0; t�. Assumethat the time of change s, event rate before change k1, and eventrate after change k2, are mutually independent. Then the posteriordensity pðs; k1; k2jtÞ for all the unknown parameters is calculated as

pðs; k1; k2jtÞ / Lðs; k1; k2jtÞpðk1; k2; sÞ¼ Lðs; k1; k2jtÞpðk1Þpðk2ÞpðsÞ ð8Þ

The marginal distributions for each of s; k1, and k2 are obtainedby integrating the above posterior density over the remaining twovariables. The marginal posterior distribution of s is calculated as

pðsjtÞ/Z 1

0

Z 1

0pðk1;k2;sjtÞdk1dk2

¼pðsÞZ 1

0

Z 1

0kNðsÞþk1�11 e

�k1 sþ 1h1

� �dk1

!

�kNðTÞ�NðsÞþk2�12 e

�k2 T�sþ 1h2

� �dk2 ¼1

T:Cðr1ðsÞÞCðr2ðsÞÞS1ðsÞr1ðsÞS2ðsÞr2ðsÞ

ð9Þ

where CðxÞ is the gamma function, and

r1ðsÞ ¼ NðsÞ þ k1 S1ðsÞ ¼ sþ 1h1

r2ðsÞ ¼ NðTÞ � NðsÞ þ k2 S2ðsÞ ¼ T � sþ 1h2

ð10Þ

Eq. (9) is written in log space for implementation of the algo-rithm, described in Section 2.2.

logpðsjtÞ / � log T þ log Cðr1ðsÞÞð Þ þ log Cðr2ðsÞÞð Þ� r1ðsÞ log S1ðsÞð Þ � r2ðsÞ log S2ðsÞð Þ ð11Þ

Similarly, the marginal distribution of k1 is calculated as shownbelow. A closed form solution for integration over s does not exist.Hence, to evaluate the probability distribution, the time range isdiscretized over a uniform Dt and summed over to approximatethe marginal distribution.

pðk1jtÞ /Z T

0

Z 1

0pðk1; k2; sjtÞdk2ds

¼Z T

0

Z 1

0kr2ðsÞ�12 e�k2S2ðsÞdk2

� �pðsÞkr1ðsÞ�1

1 e�k1S1ðsÞds

�XTs¼0

1Tkr1ðsÞ�11 e�k1S1ðsÞCðr2ðsÞÞS2ðsÞr2ðsÞ ð12Þ

This equation is also converted to log domain for algorithmicimplementation.

logpðk1jtÞ / logXTs¼0

ez1 !

ð13Þ

where

z1 ¼ � log T þ ðr1ðsÞ � 1Þ log k1 � k1S1ðsÞ þ log Cðr2ðsÞÞ½ �þ r2ðsÞ log S2ðsÞð Þ ð14Þ

The marginal distribution of k2 is calculated and approximatedsimilarly as

pðk2jtÞ /XTs¼0

1Tkr2ðsÞ�12 e�k2 S2ðsÞCðr1ðsÞÞS1ðsÞr1ðsÞ ð15Þ

and

logpðk2jtÞ / logXTs¼0

ez2 !

ð16Þ

where

z2 ¼ � log T þ ðr2ðsÞ � 1Þ log k2 � k2S2ðsÞ þ log Cðr1ðsÞÞ½ �þ r1ðsÞ log S1ðsÞð Þ ð17Þ

We now describe the constant rate model. For a sequence ofevents in a homogeneous Poisson process with no change, theunknown variable of interest is the event rate k0. Assume thisevent rate has a gamma distribution prior with parameters k0and h0, similar to the prior for parameters k1 and k2. Then the pos-terior distribution of the event rate pðk0jtÞ follows the gamma dis-tribution with the following parameters [11]

kposterior ¼ k0 þ NðTÞ hposterior ¼ 1h0þ T

� ��1 ð18Þ

With the above results, the Bayes factor can be calculated. Thelikelihood of the change point model, H1, given the observed eventsis obtained by integrating the posterior distribution given in Eq. (8)over s; k1, and k2.

LðH1jtÞ ¼Z 1

0

Z 1

0

Z T

0pðs; k1; k2jtÞdsdk1dk2 ð19Þ

The likelihood of a constant rate model H0 given the observedevents is similarly obtained as

LðH0jtÞ /Z 1

0kk0þNðTÞ�10 e�k0ð1=h0þTÞdk0 ð20Þ

Eqs. (19) and (20) each require a proportionality factor, and thisfactor is different for the two equations. Hence, in the calculationfor the Bayes factor, we multiply the ratio of likelihoods with aconstant term cðTÞ, to correctly convert the proportionality in thelikelihood calculations. When pðsÞ ¼ 1=T , and k1 ¼ k2; cðTÞ can becomputed by equating the Bayes factor to 1 for a boundary condi-tion of a single event occurring half-way through the observationperiod [1]. If the value of parameters for the gamma conjugate pri-ors are kj ¼ 0:5 and hj ! 1forj ¼ 0;1;2, then the Bayes factor canbe written as [1]

B01ðtÞ ¼ 4ffiffiffiffip

pT�nCðnþ 1=2ÞXT

s¼0

Cðr1ðsÞÞCðr2ðsÞÞS1ðsÞ�r1ðsÞS2ðsÞ�r2ðsÞð21Þ

2.2. Algorithm

Since all the unknown variables in the model described above,fs; k1; k2g, are continuous, they are discretized for algorithmicimplementation. Additionally, the algorithm is susceptible to arith-metic overflow (i.e., the condition when a calculation produces aresult that is greater in magnitude than which can be representedin computer memory), for instance when computing CðxÞ for largex (e.g., Cð200Þ ¼ 3:94� 10372). To prevent overflow, the computa-tions are performed in the log domain and reverted back at theend. We represent in our algorithm, the largest finite floating-point number on a computer as REAL_MAX (= 1.7977 � 10308 on

Page 4: Estimating spatially varying event rates with a change ...

(a)

(b)

Grid points

Grid cell iwith area ai

Events

Smaller circles missLarger circles overlap

4 A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11

a 64-bit machine). Algorithms 1 and 2 describe how to obtain theposterior densities of s and k, respectively.

Algorithm 1. Estimating the distribution of s using thechange point model

1: Discretize s uniformly into xifori ¼ 1; . . . ; p over its domain½0; T�.

2. At each xi, calculate log probi ¼ logðpðxijtÞÞ, using Eq. (11).3: To exponentiate the log probability, find the smallest scale

such thatP

ielog probi�scale 6 REAL_MAX. The scale ensures

that the final result can be represented in the computermemory.

4: Calculate probi ¼ elog probi�scale.

5: Normalize pdfi ¼ probiPprobi�ðxiþ1�xiÞ

to obtain the probability

density function value at each xi.

events in a grid cellresulting in rate smoothing

Fig. 2. a) A two-dimensional space divided into a uniform grid, showing grid pointsand grid cells, and b) circular regions of different sizes showing the influence ofradius r on rate smoothing.

3. Assessing spatially varying event rates with a change point

In this section, we present a methodology to estimate spatiallyvarying event rates using the model from the previous section. Wealso describe a likelihood comparison method based on theapproach described by [12], to optimize the model’s spatial averag-ing parameter.

Algorithm 2. Estimating the distribution of k using the changepoint model

1: Discretize the variable of interest, k1 or k2, intoxifori ¼ 1; . . . ; q over its domain. Since the domain is ½0;1Þ,select a large enough range such that probability ofobserving a rate less than the smallest value, and greaterthan the largest value, is negligible.

2: Discretize s uniformly into sjforj ¼ 1; . . . ; p over its domain½0; T�.

3: At each xi, calculate zij for all j ¼ 1; . . . ; p, using Eq. (14), orEq. (17).

4: At each xi, calculate sumi ¼P

jezij�scalei using the smallest

scalei such that sumi 6REAL_MAX. The scalei ensures thatthe final result can be represented in the computermemory.

5: At each xi, calculate log probi ¼ logðsumiÞ, using Eq. (13), orEq. (16).

6: Follow steps 3 through 5 in Algorithm 1 to obtain theprobability density function value at each xi.

3.1. Estimating event rates over a spatial grid

Given a two dimensional space where discrete event sourcescannot be identified, we divide the region into a uniform grid,and calculate event rates at each grid point (see Fig. 2). The spacingof the grid can be determined using prior knowledge about thephysics of the process under consideration, or optimized usingthe approach described in Section 3.2.

At each grid point, the change point model of Section 2 is imple-mented on the events observed in a circular region of radius raround the grid point. For the events observed in this circularregion, the inter-event times are calculated to be used as inputfor the model. If a change is not detected using the Bayes factorfor the sequence of events, then the event rate at the grid pointis estimated using the no-change model. If a change is detected,

then the post-change rate is used as the current event rate. Thisestimated rate is divided by pr2 to compute the event rate per unitarea. Based on the properties of the event process and the applica-tion of the rates, the post-change rate can be selected as the poste-rior mean, mode or median of the posterior distribution, or thecomplete distribution can be selected.

The size of the circular region affects the smoothing of thespatially-varying rates. As shown in Fig. 2, if the radius r of the cir-cular region is too small compared to the grid size, then someobserved events in the space will not be considered in estimatingthe event rates. When r is large, there will be some events that willbe included more than once in the rate calculations due to overlap-ping circular regions at adjacent grid points. It is not possible toweigh the events according to distance, since the above changepoint analysis uses inter-event times between events. However,this multi-counting of events does not artificially increase theevent rates over the entire space since the rates estimated at thegrid points are normalized to a rate per unit area, and are onlyapplicable over the corresponding grid cells. A larger value of rincreases the number of common events between adjacent gridpoints, and thus has the effect of smoothing the estimated rates.The desired smoothing of the event rates is difficult to determinea priori, so we use a likelihood comparison methodology describedbelow to select r.

3.2. Optimizing the parameters of the model

In this section, we determine the radius r described above bymaximizing the likelihood of the model associated with observingfuture events, for varying r. We use a modified version of the like-lihood comparison methodology described by [12].

We first formulate the likelihood of the model. Let there be mgrid cells, each associated with a grid point (a grid cell is the rectan-gle formed bymidpoints of grid intersections associated with a gridpoint, as shown in Fig. 2). Let the event rate per unit area per unittime at grid point i be represented by kifori ¼ 1; . . . ;m. Let Cf besome future catalog of events with a catalog duration tf . Let thenumber of events observed in the future catalog within grid cell ibe ni. Let the area of grid cell i be given by ai. Then using the fact thatevents belong to a Poisson process, the likelihood Li of the modelfor grid cell i associated with events ni is computed as -

Page 5: Estimating spatially varying event rates with a change ...

A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11 5

Li ¼ ðkiaitf Þnie�kiaitf

ni!ð22Þ

The likelihood over the entire space L is calculated by multi-plying the likelihood for all m grid cells.

L ¼Ymi¼1

Li ¼Ymi¼1

ðkiaitf Þnie�kiaitf

ni!ð23Þ

We compute the log-likelihood ‘ by taking the log.

‘ ¼Xmi¼1

ni logðkiaitf Þ � tfXmi¼1

kiai þ cf ð24Þ

where cf ¼ �Pi logni! is a constant term which depends on thefuture catalog, but not on the event rates. This term can be disre-garded when comparing the log-likelihoods of two models for thesame future catalog.

If there are two different models, M1 and M2 with correspond-ing log-likelihoods ‘1 and ‘2 respectively, they are compared bycalculating the probability gain G12 per event. If G12 > 1, it impliesthat M1 has a higher likelihood associated with the events in Cf ,and that M1 is a better estimator of events the larger the gain is.

G12 ¼ exp‘1 � ‘2X

i

ni

0BB@

1CCA ð25Þ

This probability gain calculation is similar to that of [12], exceptthat we do not normalize the event rates in a grid cell with thetotal number of events in the future catalog. Normalization ofevent rates is useful when examining the spatial distribution ofevents, and assuming that the cumulative event rate remains con-stant over time. When implementing the change point analysishowever, we expect that event rates may change for some regionsin the space. Hence, our calculations omit the normalization step.

The likelihood comparison approach will be used for compar-ison of models with different radii of the circular region. However,this approach is versatile and can be used to compare the perfor-mance of any two models that estimate rates over a spatial grid,for a given future catalog.

4. Application in Oklahoma

In this section, we implement the above calculations to detectand quantify changes in seismicity rates in Oklahoma due toinduced seismicity. We first consider a single location, then applythe model throughout the state, and finally optimize the spatialsmoothing parameter.

We use the Oklahoma Geological Survey earthquake catalog, formagnitudes M P 3 earthquakes from January 01, 1974 to Decem-ber 31, 2015 [13]. Earthquakes are typically assumed to behaveas a Poisson process when an earthquake catalog is declustered[e.g., 14,15]. We decluster the catalog using the Reasenbergapproach described by [16], using parameters developed for Cali-fornia since these parameters have not been determined for Okla-homa. The minimum magnitude for catalog completeness is set tomagnitude 3. The original catalog contains 1708M P 3 events, and1051 mainshocks remain after declustering. We note that there hasnot been a conclusive study identifying the best declusteringmethodology to use for regions of induced seismicity. Since declus-tering is done independently of the model implementation, otherdeclustering techniques like Gardner-Knopoff [14] may be utilizedwhile maintaining the model framework described in this paper.

4.1. Application at a single location

We first implement the Bayesian change point analysisdescribed in Section 2 for a site at 96.7�W and 35.6�N. We considera circular region of radius r ¼ 25 km around this site. The radiussize is optimized later. The earthquakes observed in this regionsince 1974 are shown in Fig. 3. This region includes the largestrecently recorded earthquake in Oklahoma of magnitude 5.6 atPrague on November 06, 2011. From the figure, it is visually appar-ent that a change in seismicity rate occurred around 2009, but wewould like to identify this change using our model.

We first determine whether the inter-event times betweenearthquakes support a change point model. We use the followinghyper-parameter values for the priors: kj ¼ 0:5 andhj ! 1forj ¼ 0;1;2. For our application, we reduce the Bayes factor

threshold to a value of less than 1� 10�3 to require a strong pref-erence for the change model before inferring that a changeoccurred. This is done for numerical stability, and to minimize acci-dental change detection when running multiple analyses at differ-ent grid points. A Bayes factor of 7� 10�32 is computed for thisdata, suggesting strongly that a change point model betterdescribes the data than a constant rate model.

The posterior distributions for the time of change s, and therates before the change k1 and after the change k2 are then com-puted. Fig. 4 shows a high probability density that a change in seis-micity rates occurred between December 20, 2008 and February24, 2010 with the highest density on June 13, 2009. This matchesthe expected range for time of change from a visual inspection.Fig. 5 shows the posterior distributions of seismicity rates beforeand after the change. The maximum a posteriori (MAP) estimatorsof the distributions indicate that the post-change seismicity rate isabout 300 times the pre-change rate. We also observe a narrowerprobability distribution for the post-change rate due to the occur-rence of more earthquakes, and hence more data, after 2009.

One advantage of a Bayesian model is that it provides posteriorprobability distributions for the parameters, like the time ofchange s and rate k2, as shown in Figs. 4 and 5. These distributionscan be utilized in risk estimation to account for uncertainties inparameter estimates.

4.2. Spatially varying seismicity rates

We now apply the model over the entire state to identify thoseregions where seismicity rates have changed, and to estimate thecurrent seismicity rates.

United States Geological Survey (USGS) divides a region withunmapped seismic faults into a 0.1� latitude by 0.1� longitude grid(approximately 10 km by 10 km) to estimate the rate at each gridpoint for their hazard maps generation [17], and for developingsmoothed seismicity models for induced earthquakes [8]. We usethe same uniform grid. At each grid point, we use the earthquakesobserved within a circular region of radius r ¼ 25 km to estimatethe seismicity rate at that grid point. The choice of a circular regionis made so that the earthquakes considered in the change pointmodel are within the same maximum distance from a grid point,however, the model can be implemented on any arbitrary shape.If the earthquakes support the change point model, we computethe MAP estimators for the time of change s and the post-changerate k2. Otherwise, we compute the MAP estimator for the constantrate model k0. We designate this rate as the current rate of seismic-ity at the grid point.

Figs. 6 and 7 show the MAP estimators for time of change in thestate, and the current seismicity rate, respectively. For clarity, onlythe regions with rates greater than 0:001 earthquakes per year perkm2 are shown in Fig. 7.

Page 6: Estimating spatially varying event rates with a change ...

2009 2010 2011 2012 2013 2014 2015 20160

20

40

60

80

100

120

140

Date

−100°τ

−100°

−98°

−98°

−96°

−96°

34°τ34°

36° 36°

Num

ber o

f ear

thqu

akes Full catalog

Declustered catalog

Fig. 3. The non-declustered (full) and declustered catalogs of events within 25 km of 96.7�W and 35.6� N. The white circle on the inset marks the circular region on the mapof Oklahoma. There were no observed earthquakes from 1974 to 2009, hence the date range has been shortened.

Num

ber o

f ear

thqu

akes

0

5

10

15

20

25

30

35

40

45

50

Date

2009

−06−

13

2008 2009 2010 2011 2012 2013 2014 2015 20160

0.001

0.002

0.003

0.004

0.005

0.006

0.007

0.008

0.009

0.01

Prob

abilit

y of

cha

nge

on d

ate

95% credible interval

Declustered catalog

Probability of change

Fig. 4. The probability of change on any given date s, along with the 95% credible interval between December 20, 2008 and February 24, 2010. The MAP estimator is at June13, 2009.

6 A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11

The regions of seismicity change generally agree with regionsidentified by others as having anomalously high earthquake activ-ity [18], and the dates of change agree with other general observa-tions of a statewide seismicity increase in 2009 [19,2].

4.3. Model optimization

The model optimization approach described in Section 3.2 usesfuture events to select the model with the maximum likelihood. Tosimulate future events, we extract two mutually exclusive subsetsfrom the earthquake catalog, estimate the rates for a model on onesubset, and calculate the likelihood of this model given the eventsin the other subset. The former subset is called the training catalog,and the latter the test catalog. This is similar to the cross-validationapproach used to develop machine learning models [20].

The training catalog consists of observations from 1974 up to avarying end date. Observed earthquakes in the training catalog areused to estimate the seismicity rates, and then these rates are used

to make predictions of seismicity in the next 0.5 year or 1 year.Hence, our test catalogs contain the earthquake observations over0.5 yearor1 yearduration following theendof each trainingcatalog.

The probability gain per event, described in Eq. (25), is com-puted with ‘2 corresponding to a reference uniform rate model thatestimates equal seismicity rates at all grid points in the state bydividing the observed number of earthquakes in the training cata-log by the number of grid points. This reference model is comparedto the Bayesian change point models with different radii r of thecircular region. The model with radius that yields the highest prob-ability gain for the events in the test catalog is selected as the opti-mum model.

The probability gains G12 for 0.5 year and 1 year test catalogs forseveral choices of r and training catalog are shown in Figs. 8 and 9,respectively. It is observed that for most r and for all the recent testcatalogs, the G12 values are larger than 1, indicating that the likeli-hood of the Bayesian models is higher than the uniform rate mod-els for their respective test catalogs.

Page 7: Estimating spatially varying event rates with a change ...

Nor

mal

ized

pro

babi

lity

of ra

te

λ1,MAP = 6.2x10-5 λ2,MAP

= 1.9x10-2

Rate (earthquakes per day)10−7 10−6 10−5 10−4 10−3 10−2 10−1 1000

0.2

0.4

0.6

0.8

1Pre−change ratePost−change rate

Fig. 5. The normalized probability distribution of k1 and k2 for the selected location along with their MAP estimators.

Fig. 6. Time of change s for those parts of the state where change is detected usinga 25 km radius region.

Fig. 7. Current seismicity rates at grid points using a 25 km radius region. Forclarity, only regions with rates greater than 0.001 are shown.

A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11 7

The highest probability gain (G12) is typically observed for aradius r of 25–35 km across all training catalogs. The highest prob-ability gains across all catalogs are obtained for the two longest

training catalogs. This indicates that a radius of the circular regionin the range of 25–35 km is best suited for this application of esti-mating spatially-varying seismicity rates using the Bayesianchange point model for induced seismicity in Oklahoma. As aresult, our previous analysis using a radius of 25 km correspondsto a model that is expected to be effective in predicting futureearthquakes. This optimal radius may vary in other regions ofinduced seismicity. The optimal radius is 25–35 km in this casedue to the 0.1� by 0.1� grid size, and likely due to the uncertaintyin earthquake locations resulting from limited seismic recordings.

Comparing the probability gains per earthquake (G12) of Fig. 8with Fig. 9, the gain is generally higher for the 0.5 year test cata-logs, than for the 1 year test catalogs. Hence, the model indicatesbetter future predictions of earthquakes over shorter timespans,as expected for a dynamic phenomenon like this.

5. Model limitations

We discuss here two limitations associated with the modeldescribed in previous sections.

5.1. Choice of priors

The choice of hyper-parameters for the prior distribution affectsthe results obtained from a Bayesian model. However, we expectthat significantly different pre-change and post-change rates willlimit the impacts of the choice of hyper-parameters on the poste-rior distributions. Data-rich regions are also expected to be lessimpacted by the choice of priors since the posterior distributionsare controlled to a greater extent by the data as the sample sizeincreases [11]. The hyper-parameter values selected in this papersimulate an infinite variance prior distribution or an uninformativeprior, where the user imposes no prior beliefs about the process[11]. We study the impacts of alternate parameter choices belowthrough application on Oklahoma data.

We first utilize the previous example of a single locationdescribed in Section 4.1 to analyze the impact of choices on ourpriors. Fig. 10 shows the MAP estimators with 95% credible inter-vals for the time of change and the rates for different prior values.

We observe from the figure that different hyper-parameter val-ues yield slightly different posterior distributions. For the time ofchange s, the posterior distributions have little variation. This is

Page 8: Estimating spatially varying event rates with a change ...

15 25 35 50 750

20

40

60

80

100

Radius of circular region, km

Gm

Training 1974-2014.5

Training 1974-2013.5

Training 1974-2015.0

Training 1974-2014.0

Training 1974-2Pr

obab

ility

gain

G12

0.5

year

test

cat

alog

Fig. 8. Probability gain per earthquake with respect to a uniform rate model for different training catalogs and radii of circular region. The test catalog is 0.5 year durationpost end of training catalog. The highlighted region emphasizes the typically higher G12 for a radius r of 25–35 km.

15 25 35 50 750

20

40

60

80

100

Radius of circular region, km

Gm

Prob

abilit

y ga

in G

121

year

test

cat

alog

Training 1974-2014.0

Training 1974-2014.5

Training 1974-2015.0

Training 1974

Training 1974-2013.5

Fig. 9. Probability gain per earthquake with respect to a uniform rate model for different training catalogs and radii of circular region. The test catalog is 1 year duration postend of training catalog. The highlighted region emphasizes the typically higher G12 for a radius r of 25–35 km.

8 A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11

because of the significant change in seismicity rates around mid2009. The credible intervals for the pre-change rate k1 are generallylarge. This is because no events are observed before 2009 in ourdata. Hence, the posterior distribution of k1 has large varianceand is more sensitive to the choice of the prior distribution. Whenthis is contrasted with the data-rich post-change rate k2, it isobserved that the confidence intervals and the MAP estimatorsshow little variation with different prior values.

We also compare the previous statewide results with resultsobtained when using hyper-parameters kj ¼ 0:05; hj ¼ 0:1, inFigs. 11 and 12. The results are in good agreement, with differencesat the boundaries of regions with change, and in regions of lowpost-change rates. The boundaries are impacted because fewerearthquakes are observed in these regions. The regions of lowpost-change rates are impacted because the pre-change andpost-change rates are similar to each other.

Based on the comparisons, we observe that different choices ofprior distributions affect posteriors, but there is limited impact indata-rich locations. Due to a large increase in seismicity rates atmany locations, and many earthquakes being observed in thepost-change periods, there is little impact from choice of hyper-parameters on the posterior distributions for this application onthe parameters of interest, s and k2.

5.2. Assumption of a single change

The other limitation of the change point model is that itassumes a single change in the rate of events (Eq. (2)). A multiplechange point analysis is possible using Gibbs sampling [21], whichwe do not describe here. However, every change point adds twoindependent parameters to the model (additional time of change,and rate), which can introduce overfitting, and requires more data

Page 9: Estimating spatially varying event rates with a change ...

0

0.005

0.01

0.015

0.02

0.025

0.03

0

0.2

0.4

0.6

0.8

1

1.2x 10−3

2009

2010

2014

Dat

e of

cha

nge

τPr

e-ch

ange

rate

λ1

Post

-cha

nge

rate

λ2

θ j = In

f

θ j = 0.

01

θ j = 1.

0

kj = 0.05

θ j = 0.

1

2008

2011

2012

2013

2015

2016

θ j = In

f

θ j = 0.

01

θ j = 1.

0

kj = 0.5

θ j = 0.

1θ j =

Inf

θ j = 0.

01

θ j = 1.

0

kj = 5.0

θ j = 0.

1

Fig. 10. MAP estimators and 95% credible intevals for s; k1 and k2 for different hyper-parameter values. The circles are the MAP estimators at each hyper-parameter value, andthe wings represent the lower and upper limits for the 95% credible interval. The red line marks the results for the default hyper-parameters used in this paper. (Forinterpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11 9

to reduce the variance of the posterior distributions for uninforma-tive priors. One possible approximation to the multiple changepoint analysis is to sequentially bisect the catalog at the maximum

a posteriori (MAP) estimator of the previous change point, until theBayes factor indicates a support for a no change model on all thebranches. This process of sequential bisection is not the same as

Page 10: Estimating spatially varying event rates with a change ...

Fig. 11. Regions of seismicity change for kj ¼ 0:5; hj ! 1, overlain with those forkj ¼ 0:05; hj ¼ 0:1. The light-shaded regions are common to both hyper-parameters,while the dark-shaded regions are noted only for the former hyper-parameters.

10 A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11

a complete multiple change point analysis since each subsequentbranch is conditioned on the location of the previous change point.However, this method could serve as a rudimentary check to deter-mine whether the process should be instead modeled with amultiple-change point model.

As an example, we evaluate whether a multiple change pointmodel might be better applicable to the same single location fromSection 4.1. We use the method of sequential bisection at the MAPestimator of change point s. Here, the Bayes factor for the post-change branch is 9:1� 10�2. The pre-change branch has no events,hence has no observable change. Since both branches have Bayesfactors larger than our selected threshold of 1� 10�3, we state thata single change point model is an acceptable model for thisexample.

Fig. 12. Current seismicity rates for kj ¼ 0:5; hj ! 1

6. Conclusions

We presented a Bayesian change point methodology to detect achange in event rates for a non-homogeneous Poisson process, andevaluated spatially-varying event rates for this process. The Baye-sian methodology enables us to develop probability distributionsfor the time of change, and for the event rates before and afterthe change.

We evaluated the spatially varying event rates for a process bydividing the space into a grid and evaluating the rate at each gridpoint. Rates were evaluated based on the events observed in a cir-cular region of radius r around each point. We also presented alikelihood comparison methodology to optimize the radius r forbest future predictions of event probabilities.

We demonstrated the application of the Bayesian change pointmethodology on the spatially varying earthquake rates associatedwith induced seismicity in Oklahoma. We optimized the radius rand concluded that a radius between 25 and 35 km yields the high-est probability of observing future earthquakes in Oklahoma.

The model implementation in Oklahoma identified the regionsin the state where seismicity rates have changed. We also esti-mated the current seismicity rates using the model. Our resultswere in general agreement with other studies on time of seismicitychange [19,2], and regions of seismicity change [18]. The currentseismicity rates can be used to make short-term future predictionsof earthquakes in the state. We observed that there is better pre-diction over the next 0.5 year duration compared to the next 1 yearduration. In a future publication, we will compare the performanceof our model for future earthquake predictions with other rate esti-mation models, using the Collaboratory for the Study of Earth-quake Predictability (CSEP) tests [22].

The occurrence of seismicity change combined with estimatedseismicity rates can serve as a risk mitigation tool for operationsthat affect seismicity, for example, to prepare prioritization plansfor infrastructure inspections [23]. This information can be usedin seismic hazard and risk assessments for the region [24].

One of the possible extensions to this model for its applicationon induced seismicity could be the combination of the Bayesianchange point methodology with an earthquake catalog decluster-

(left), and with for kj ¼ 0:05; hj ¼ 0:1 (right).

Page 11: Estimating spatially varying event rates with a change ...

A. Gupta, J.W. Baker / Structural Safety 65 (2017) 1–11 11

ing approach like the epidemic type aftershock sequence (ETAS)model [25]. Combining the declustering model with the changepoint model would allow estimation of declustering parameters,in addition to seismicity rates, for the local conditions. In thispaper, the earthquake catalog declustering was done indepen-dently of the change point model implementation. This allowedfor the development of numerical algorithms to solve the changepoint model. Solving the combined declustering and change pointmodel would require random state generation algorithms like Mar-kov Chain Monte Carlo (MCMC) methods.

The Bayesian change point model presented here, along withthe methodology to assess spatially varying event rates, is a versa-tile model that can be used to estimate current event rates for anyspatially varying non-homogeneous Poisson process. Change pointmodels have been used to study DNA sequence segmentation [26],species extinction [27], financial markets [28], and software relia-bility [29]. Some of the other applications where this spatio-temporal change point model can be used are assessing spread ofdiseases, and identifying changes in climate patterns. This modelenables stakeholders to make real-time decisions about the impactof changes in event rates.

7. Resources

Earthquake catalog declustering is performed using the code by[30]. Matlab source code to perform change point calculations forOklahoma is available at https://github.com/abhineetgupta/BayesianChangePoint.

Acknowledgements

Funding for this work came from the Stanford Center forInduced and Triggered Seismicity. We would like to thank MaxWerner and Morgan Moschetti for providing feedback on the spa-tial rate assessment methodology.

References

[1] Raftery A, Akman V. Bayesian analysis of a Poisson process with a change-point. Biometrika 1986;73(1):85–9.

[2] Wang P, Pozzi M, Small MJ, Harbert W, Statistical method for early detection ofchanges in seismic rate associated with wastewater injections, Bull SeismolSoc Am. doi:http://dx.doi.org/10.1785/0120150038.<http://www.bssaonline.org/content/early/2015/10/05/0120150038>.

[3] Ellsworth WL. Injection-induced earthquakes. Science 2013;341(6142):1225942. doi: http://dx.doi.org/10.1126/science.1225942. <http://www.sciencemag.org/content/341/6142/1225942>.

[4] Keranen KM, Weingarten M, Abers GA, Bekins BA, Ge S. Sharp increase incentral Oklahoma seismicity since 2008 induced by massive wastewaterinjection. Science 2014;345(6195):448–51. doi: http://dx.doi.org/10.1126/science.1255802. <http://www.sciencemag.org/content/345/6195/448>.

[5] Weingarten M, Ge S, Godt JW, Bekins BA, Rubinstein JL. High-rate injection isassociated with the increase in U.S. mid-continent seismicity. Science2015;348(6241):1336–40. doi: http://dx.doi.org/10.1126/science.aab1345.<http://www.sciencemag.org/content/348/6241/1336>.

[6] Council NR. Induced seismicity potential in energy technologies. Washington,DC: The National Academies Press; 2013. <http://www.nap.edu/catalog/13355/induced-seismicity-potential-in-energy-technologies>.

[7] Walters RJ, Zoback MD, Baker JW, Beroza GC, Characterizing and responding toseismic risk associated with earthquakes potentially triggered by saltwaterdisposal and hydraulic fracturing. <http://www.gwpc.org/sites/default/files/files/Walters>.

[8] Petersen MD, Mueller CS, Moschetti MP, Hoover SM, Rubinstein JL, Llenos AL,Michael AJ, Ellsworth WL, McGarr AF, Holland AA. Incorporating inducedseismicity in the 2014 United States national seismic hazard model – results of2014 workshop and sensitivity studies, open-file report 20151070. Reston,Virginia: U.S. Geological Survey; 2015.

[9] Gupta A, Baker JW. A Bayesian change point model to detect changes in eventoccurrence rates, with application to induced seismicity. In: 12th InternationalConference on Applications of Statistics and Probability in Civil Engineering(ICASP12). <https://circle.ubc.ca/handle/2429/53235>.

[10] Kass RE, Raftery AE. Bayes Factors. J Am Stat Assoc 1995;90(430):773–95. doi:http://dx.doi.org/10.1080/01621459.1995.10476572. <http://amstat.tandfonline.com/doi/abs/10.1080/01621459.1995.10476572>.

[11] Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian data analysis. Taylor &Francis; 2014. <http://www.tandfonline.com/doi/full/10.1080/01621459.2014.963405>.

[12] Werner MJ, Helmstetter A, Jackson DD, Kagan YY. High-resolution long-termand short-term earthquake forecasts for California. Bull Seismol Soc Am2011;101(4):1630–48. doi: http://dx.doi.org/10.1785/0120090340. <http://www.bssaonline.org/content/101/4/1630>.

[13] Oklahoma Geological Survey, University of Oklahoma, Earthquake catalogavailable on the World Wide Web. <http://www.ou.edu/content/ogs/research/earthquakes/catalogs.html>.

[14] Gardner JK, Knopoff L. Is the sequence of earthquakes in Southern California,with aftershocks removed, Poissonian? Bull Seismol Soc Am 1974;64(5):1363–7. <http://bssa.geoscienceworld.org/content/64/5/1363>.

[15] T. van Stiphout, J. Zhuang, D. Marsan, Seismicity declustering, CommunityOnline Resource for Statistical Seismicity Analysis 10.http://www.corssa.org/articles/themev/vanstiphoutetal

[16] Reasenberg P. Second-order moment of central California seismicity. JGeophys Res 1985;90(B7):5479. doi: http://dx.doi.org/10.1029/JB090iB07p05479. <http://doi.wiley.com/10.1029/JB090iB07p05479>.

[17] Petersen MD, Moschetti MP, Powers PM, Mueller CS, Haller KM, Frankel AD,Zeng Y, Rezaeian S, Harmsen SC, Boyd OS, Field N, Chen R, Rukstales KS, Luco N,Wheeler RL, Williams RA, Olsen AH, Documentation for the 2014 update of theUnited States national seismic hazard maps, Tech. rep., U.S. Geological SurveyOpen-File Report 2014–1091 (Jul. 2014). <http://pubs.usgs.gov/of/2014/1091/>.

[18] Ellsworth W, Llenos A, McGarr A, Michael A, Rubinstein J, Mueller C, PetersenM, Calais E. Increasing seismicity in the U. S. midcontinent: Implications forearthquake hazard. Lead Edge 2015;34(6):618–26. doi: http://dx.doi.org/10.1190/tle34060618.1. <http://library.seg.org/doi/full/10.1190/tle34060618.1>.

[19] Llenos AL, Michael AJ. Modeling earthquake rate changes in Oklahoma andarkansas: possible signatures of induced seismicity. Bull Seismol Soc Am2013;103(5):2850–61. doi: http://dx.doi.org/10.1785/0120130017. <http://www.bssaonline.org/content/103/5/2850>.

[20] Mitchell TM. Machine learning. 1st ed. New York, NY, USA: McGraw-Hill Inc;1997.

[21] Stephens DA. Bayesian retrospective multiple-changepoint identification. J RStat Soc Ser C (Appl Stat) 1994;43(1):159–78. doi: http://dx.doi.org/10.2307/2986119. <http://www.jstor.org/stable/2986119>.

[22] Zechar JD, Schorlemmer D, Liukis M, Yu J, Euchner F, Maechling PJ, Jordan TH.The collaboratory for the study of earthquake predictability perspective oncomputational earthquake science. Concurrency Comput: Pract Experience2010;22(12):1836–47. doi: http://dx.doi.org/10.1002/cpe.1519. <http://onlinelibrary.wiley.com/doi/10.1002/cpe.1519/abstract>.

[23] Hill C, Oklahoma DOT hires engineering firm to create earthquake responseprotocol for bridge inspections (Apr. 2015). <http://www.equipmentworld.com/oklahoma-dot-hires-engineering-firm-to-create-earthquake-response-protocol-for-bridge-inspections/>.

[24] Baker JW, Gupta A. Bayesian treatment of induced seismicity in probabilisticseismic-hazard analysis. Bull Seismol Soc Am 2016. doi: http://dx.doi.org/10.1785/0120150258. <http://www.bssaonline.org/content/early/2016/05/04/0120150258>.

[25] Ogata Y, Zhuang J. Spacetime ETAS models and an improved extension.Tectonophysics 2006;413(12):13–23. doi: http://dx.doi.org/10.1016/j.tecto.2005.10.016. <http://www.sciencedirect.com/science/article/pii/S0040195105004889>.

[26] Braun JV, Muller H-G. Statistical methods for DNA sequence segmentation.Stat Sci 1998;13(2):142–62. <http://www.jstor.org/stable/2676755>.

[27] Solow AR. Inferring extinction from sighting data. Ecology 1993;74(3):962–4.doi: http://dx.doi.org/10.2307/1940821. <http://www.jstor.org/stable/1940821>.

[28] Liu X, Wu X, Wang H, Zhang R, Bailey J, Ramamohanarao K. Mining distributionchange in stock order streams. In: 2010 IEEE 26th international conference ondata engineering (ICDE). p. 105–8. doi: http://dx.doi.org/10.1109/ICDE.2010.5447901.

[29] Zou F-Z. A change-point perspective on the software failure process. SoftwareTest Verification Reliab 2003;13(2):85–93. doi: http://dx.doi.org/10.1002/stvr.268. <http://onlinelibrary.wiley.com/doi/10.1002/stvr.268/abstract>.

[30] Wiemer S. A software package to analyze seismicity: ZMAP. Seismol Res Lett2001;72(3):373–82. doi: http://dx.doi.org/10.1785/gssrl.72.3.373. <http://srl.geoscienceworld.org/content/72/3/373>.