Global Earthquake Data : Local e.q. catalogs tend to have problems, esp. missing data.


1

Some Current Problems in Point Process Research:

1. Prototype point processes

2. Non-simple point processes

3. Voronoi diagrams

2

Global Earthquake Data:

Local e.q. catalogs tend to have problems, esp. missing data.

1977: Harvard (global) catalog created. Considered the most complete. Errors best understood.

A collection of aftershock sequences:

• Harvard Catalog, 1/1/77 to 3/1/03

• Shallow events only (depth < 70km)

• Mainshocks: Mw 7.5 to 8.0

• Aftershocks: Mw > 5.5, within 100km, 0.133 days - 2 yrs.

• No Mw ≥ 7.5 within 200km in previous 2 yrs.

• No Mw ≥ 8.0 w/in 400km within 4 yrs (Molchan et al., 1997)

• 49 mainshocks, avg. 5.47 aftershocks, SD = 4.3.
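The aftershock criteria above can be sketched as a simple filter. This is only an illustration: the record layout (dicts with keys `t` in days, `lat`, `lon`, `mw`), the function names, the great-circle distance and the 365-day year are all assumptions, not details from the slides.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (assumed distance measure, Earth radius 6371 km)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def select_aftershocks(mainshock, events, min_mw=5.5, max_km=100.0,
                       min_days=0.133, max_days=2 * 365.0):
    """Keep events with Mw > min_mw, within max_km, and 0.133 days to 2 yrs after the mainshock."""
    selected = []
    for ev in events:
        lag = ev["t"] - mainshock["t"]  # days since the mainshock
        dist = haversine_km(mainshock["lat"], mainshock["lon"], ev["lat"], ev["lon"])
        if ev["mw"] > min_mw and min_days <= lag <= max_days and dist <= max_km:
            selected.append(ev)
    return selected
```

The mainshock exclusion rules (no earlier Mw ≥ 7.5 within 200km, etc.) would be applied separately when choosing mainshocks and are not covered by this sketch.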

3

4

1. Prototypes.

Some motivating questions:

A) What does a typical aftershock sequence look like?

B) How can we tell if a particular sequence is an outlier?

C) How can we group aftershock sequences into clusters based on the similarity of their features?

5

A) What does the typical aftershock sequence look like?

e.g. What is typically observed after an eq of Mw 7.5 - 8.0?

• Modified Omori: K/(t+c)^p

6

A) What does the typical aftershock sequence look like?

e.g. What is typically observed after an eq of Mw 7.5 - 8.0?

• Modified Omori: K/(t+c)^p

• May desire a prototype: a point pattern of min. distance to those observed.

Requires distance between point patterns.

7

8

Victor-Purpura (1997) distance

Given two point patterns, A and B:

• Match each point in A to the nearest point in B and record the horizontal distance moved (penalty pm = 1 per unit moved)

• Delete excess points (with penalty pa)

9

Considerations

10

Calculating the distance between two point patterns:

• Reduces to which points are kept and which are removed.

• Mutual nearest neighbors within 2pa/pm are automatically kept.

• A point > 2pa/pm from its nearest neighbor is automatically removed.
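One common way to compute a distance of this kind for one-dimensional patterns is an edit-distance style dynamic program. The sketch below assumes that formulation (the slides do not show an algorithm); the function name and defaults are illustrative. Moving a point costs pm per unit, and adding or deleting a point costs pa, so pairing two points is only worthwhile when they are within 2pa/pm of each other.

```python
def point_pattern_distance(a, b, pa=1.0, pm=1.0):
    """Minimum total penalty to turn 1-D pattern a into pattern b.

    Moving a point costs pm per unit moved; deleting (or adding) a point costs pa.
    """
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # cost[i][j]: distance between the first i points of a and the first j of b
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * pa
    for j in range(1, m + 1):
        cost[0][j] = j * pa
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j] + pa,                                # delete a[i-1]
                cost[i][j - 1] + pa,                                # delete b[j-1]
                cost[i - 1][j - 1] + pm * abs(a[i - 1] - b[j - 1]),  # pair and move
            )
    return cost[n][m]
```

For example, with pa = pm = 1 the patterns [1.0] and [1.5] are 0.5 apart (one move), while [1.0] and [4.0] are 2.0 apart, because removing both points is cheaper than moving one of them 3 units.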

11

Prototype Point Pattern

• Defined such that the sum of distances from the prototype to all observed point patterns in the data set is minimized.

• Represents a “typical” observation.

12

Some properties of the prototype

• Prototype is not necessarily unique.

• There exists a prototype pattern composed entirely of points in the dataset.

• In fact, a prototype can be found such that each point it contains is the median of its associated points in distance calculations.
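As a rough illustration of the minimization, the sketch below restricts the candidate prototypes to the observed patterns themselves, i.e. it finds a medoid. This is a simplification and not the prototype construction described on the slide, which may combine points from different patterns. The distance is an assumed dynamic-programming version of the move/delete penalties, and all names are hypothetical.

```python
def vp_distance(a, b, pa=1.0, pm=1.0):
    """Move/delete edit distance between two 1-D point patterns (assumed formulation)."""
    a, b = sorted(a), sorted(b)
    prev = [j * pa for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * pa]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + pa,                    # delete x
                            curr[j - 1] + pa,                # delete y
                            prev[j - 1] + pm * abs(x - y)))  # pair x with y
        prev = curr
    return prev[-1]

def medoid_prototype(patterns, pa=1.0, pm=1.0):
    """Return the observed pattern with the smallest summed distance to all patterns,
    together with the summed distance of every pattern."""
    totals = [sum(vp_distance(p, q, pa, pm) for q in patterns) for p in patterns]
    best = min(range(len(patterns)), key=totals.__getitem__)
    return patterns[best], totals
```

The returned totals also give a crude outlier score: a pattern whose summed distance is much larger than the others is far from everything else in the data set.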

13

Uses: Data summary, outlier identification, clustering, …

14

15

16

17

Clusters of aftershock sequences

• Distance of each aftershock sequence to the prototypes for time and magnitude

18

Cluster Map

19

With multidimensional point processes (time, Mw, location):

• No simple sequential pairing.

• Mutual nearest neighbors are kept.

• There exists a prototype consisting only of points whose coordinates are medians of coordinates of associated pts.

20

21

22

23

2. Non-simple point processes.

Simple point processes are characterized by the conditional intensity, λ(t). But what about non-simple point processes?

24

Two types of simplicity, for multi-dimensional point processes:

1) Completely simple: No two points overlap exactly, i.e. share the same triple (t,x,a).

2) Simple ground process: No two points at exactly the same time.

Multi-dimensional & marked point processes are only uniquely characterized by the conditional intensity λ(t,x,m) if they have simple ground process.

Poisson process with intensity λ = 2:

Poisson process with intensity λ = 1, but with each point doubled:

Both have the same conditional intensity! (λ = 2)
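A small simulation makes the comparison concrete. This is a sketch only: the function names, the time horizon and the seed are arbitrary choices made for illustration. Both processes produce about 2 points per unit time, but only the first has distinct event times.

```python
import random

def simulate_poisson(rate, horizon, rng):
    """Event times of a homogeneous Poisson process on [0, horizon]."""
    times = []
    t = rng.expovariate(rate)  # exponential waiting time to the first event
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def simulate_doubled_poisson(rate, horizon, rng):
    """Poisson process in which every point occurs twice at the same time."""
    return [t for t in simulate_poisson(rate, horizon, rng) for _ in range(2)]

rng = random.Random(0)
horizon = 5000.0
simple = simulate_poisson(2.0, horizon, rng)
doubled = simulate_doubled_poisson(1.0, horizon, rng)

rate_simple = len(simple) / horizon    # empirical rate, close to 2
rate_doubled = len(doubled) / horizon  # empirical rate, also close to 2
```

The empirical rates agree, so the rate alone cannot distinguish the two processes; the doubled process differs only in that its points arrive in simultaneous pairs.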

25

Multi-dimensional & marked point processes are only uniquely characterized by the conditional intensity λ(t,x,m) if they have simple ground process.

How can one model non-simple point processes?

[Figure: two panels, each showing marks m1 and m2 against time t.]

• 2 independent Poisson processes, each with intensity λ = 1: λ(t, m1) = λ(t, m2) = 1.

• A Poisson process with λ = 1, and an exact copy: λ(t, m1) = λ(t, m2) = 1.

26

How can one model non-simple marked point processes?

[Figure: marks m1, m2, m3 against time t, and the extended marks m1, m2, m3, m12, m13, m23 against time t.]

Consider an extended mark space, consisting of pairs (and triplets, quadruplets, etc.) of marks:

Z = {m1, m2, m3, m12 = {m1, m2}, m13, m23}.

• The resulting point process will have simple ground process. (Daley & Vere-Jones 1988, p208)

• The conditional intensity λ’(t, m) of the resulting process can be written in terms of the original conditional intensity λ(t, m). For instance, λ(t, m2) = λ’(t, m2) + λ’(t, m12) + λ’(t, m23).

• Can have models where λ’(t, mij) = λ’(t, mi) λ’(t, mj). (Schoenberg 2005)
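The relation between the two intensities can be sketched as a sum over the extended marks that contain a given original mark. Here λ denotes the original conditional intensity and λ’ the intensity on the extended mark space (symbols assumed); the numeric rates are made up and held constant in t purely for illustration.

```python
def marginal_intensity(extended, mark):
    """Original intensity of `mark`: the sum of the extended-space intensities
    over every extended mark (a set of original marks) that contains it."""
    return sum(rate for marks, rate in extended.items() if mark in marks)

# Hypothetical extended-space intensities, keyed by the set of original marks.
extended = {
    frozenset({"m1"}): 0.5,
    frozenset({"m2"}): 0.3,
    frozenset({"m3"}): 0.2,
    frozenset({"m1", "m2"}): 0.1,   # m12
    frozenset({"m1", "m3"}): 0.05,  # m13
    frozenset({"m2", "m3"}): 0.05,  # m23
}

lambda_m2 = marginal_intensity(extended, "m2")  # 0.3 + 0.1 + 0.05
```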


27

3. Voronoi Tessellations.

Given a collection of points p1, p2, …, divide the space into cells C1, C2, …, such that cell Ci consists of all locations closer to pi than to any of the other points pj.

Ci = {x : ||x - pi|| < ||x - pj|| for all j ≠ i}.
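The definition translates directly into a nearest-point rule. The sketch below is a brute-force illustration, not an efficient tessellation algorithm: it assigns a location to its cell by finding the nearest point, and approximates cell areas inside a rectangle by counting grid cell centers. The function names, the rectangular window and the grid size are assumptions.

```python
import math

def voronoi_cell_index(x, points):
    """Index i of the cell Ci containing location x, i.e. the nearest point pi."""
    return min(range(len(points)), key=lambda i: math.dist(x, points[i]))

def approximate_cell_areas(points, width=1.0, height=1.0, n=100):
    """Approximate each cell's area within [0, width] x [0, height]
    by assigning the centers of an n-by-n grid to their nearest point."""
    counts = [0] * len(points)
    for i in range(n):
        for j in range(n):
            center = ((i + 0.5) * width / n, (j + 0.5) * height / n)
            counts[voronoi_cell_index(center, points)] += 1
    pixel_area = width * height / (n * n)
    return [c * pixel_area for c in counts]
```

Cells on the boundary of the window are clipped by it, so their areas depend on the choice of window.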

28

29

30

31

32

33

Southern California Earthquake Center (SCEC) data

Lat: 32-37 (733 km)

Lon: (-114, -112) (556 km)

Time: (1/1/1984 - 17/6/2004)

Mag: Mo > 2.

n = 6796.

Errors in the catalog:

• Missing earthquakes, esp. in clusters.

• Discrimination problems.

• Location & projection errors.

34

Many models for the cell characteristics were fitted: Frechet, gamma, lognormal, exponential, Pareto, tapered Pareto.

Pareto: F(x) = 1 - (a/x)^β.

Tapered Pareto: F(x) = 1 - (a/x)^β exp((a-x)/θ).
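The two distribution functions can be written out as follows. The symbols are assumptions taken from the usual parameterization of these distributions: a is the lower cutoff, β the power-law exponent and θ the taper parameter, giving F(x) = 1 - (a/x)^β for the Pareto and F(x) = 1 - (a/x)^β exp((a-x)/θ) for the tapered Pareto.

```python
import math

def pareto_cdf(x, a, beta):
    """Pareto CDF: F(x) = 1 - (a/x)^beta for x >= a, and 0 below the cutoff a."""
    if x < a:
        return 0.0
    return 1.0 - (a / x) ** beta

def tapered_pareto_cdf(x, a, beta, theta):
    """Tapered Pareto CDF: F(x) = 1 - (a/x)^beta * exp((a - x)/theta) for x >= a."""
    if x < a:
        return 0.0
    return 1.0 - (a / x) ** beta * math.exp((a - x) / theta)
```

The exponential factor makes the upper tail decay faster than the pure power law, and as θ grows large the tapered form approaches the ordinary Pareto.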

35

Q-Q plots for the tapered Pareto: F(x) = 1 - (a/x)^β exp((a-x)/θ).

[Panels: Cell area; Cell perimeter]

36

Summary and Open Questions:

1) Prototypes may be useful data summaries for point processes, and for clustering, identifying outliers, etc. Prototypes for particular models? Applications?

2) Non-simple point processes can be viewed as simple on an extended mark space. More non-simple models? Applications?

3) Cell sizes in Voronoi tessellations of earthquake data seem to follow a tapered Pareto distribution (like many other features of earthquakes). Why? What is the theoretical cell size distribution for a particular model? Other applications?