Implementation of Strauss Point Process Model to Earthquake Data

Salma Anwar May, 2009

Implementation of Strauss Point Process Model to Earthquake Data

by

Salma Anwar

Thesis submitted to the International Institute for Geo-information Science and Earth Observation in partial fulfilment of the requirements for the degree of Master of Science in Geo-information Science and Earth Observation, Specialisation: (fill in the name of the specialisation)

Thesis Assessment Board

Chair: Prof. Dr. J.L. van Genderen
External examiner: Dr.Ir. G.B.M. Heuvelink
Supervisor: Prof. Dr. A. Stein (RTL)
Second supervisor: Prof. Dr. J.L. van Genderen

INTERNATIONAL INSTITUTE FOR GEO-INFORMATION SCIENCE AND EARTH OBSERVATION

ENSCHEDE, THE NETHERLANDS

To My Mother

Disclaimer

This document describes work undertaken as part of a programme of study at the International Institute for Geo-information Science and Earth Observation. All views and opinions expressed therein remain the sole responsibility of the author, and do not necessarily represent those of the institute.

Abstract

Environmental spatial processes are usually determined by a range of ambient factors that are sometimes hard to quantify. As a consequence, their size and significance are difficult to determine numerically. Finding a suitable model for spatial point processes has thus been a major challenge in spatial statistics. This study considers earthquakes as a marked point process. For earthquakes, large and complex data sets exist, including many possibly relevant covariates that may influence their occurrence, which poses practical problems and introduces complex statistical issues. The use of spatial statistics for earthquakes thus requires investigation of the potential and flexibility of the available models. In the present study, different techniques offered by spatial statistics were explored to analyze earthquake data recorded in Pakistan since 1973, with a major event occurring in 2005. The Strauss point process model was investigated for its flexibility to incorporate available geological information, such as the presence of faults and plate boundaries, as explanatory variables, and for its appropriateness to model this marked and clustered pattern. The results showed that the model, despite some limitations, can be applied rigorously to such a marked point pattern and represents well the clustering behaviour as determined by a number of environmental factors.

Acknowledgements

Praise be to Allah, the most gracious and the most merciful. For being able to accomplish the present study, first of all I am thankful to my country for providing us with funds for studying abroad; it makes me highly determined to pay back my country in the form of dedicated service after completing my studies. I am grateful to Prof. Dr. Atta-ur-Rehman, ex-chairman of the Higher Education Commission of Pakistan, for taking the initiative to introduce this overseas scholarship program. I am highly obliged to my supervisors, Prof. Alfred Stein and Prof. John van Genderen, for their kind supervision, great ideas and thorough guidance throughout the study. I am especially thankful to Prof. Alfred Stein for always being patient and understanding of my situation, and to Prof. John van Genderen for always being so encouraging. I extend my deep gratitude to Prof. Adrian Baddeley for helping me with spatstat problems and to Dr. Nicholas Humm and Dr. Valentine Tolpekin for their precious time whenever I needed it for the study. I find no words to thank uncle Muhammad Ishaaq and aunty Sabine Ishaaq for their help and parental affection, without which it would not have been possible for me to continue studying. I wish I could ever be able to show my gratitude to them. I am also thankful to my sister Shahida Faisal for always being there to help and encourage me through all circumstances, and to Amna for her friendship, which has always been a source of relaxation and strength during tough times. Finally I am thankful to my husband Muhammad Yaseen and my kids, Asma and Muhammad Taha Yaseen, for bringing life back to my life.

Table of contents

1. Introduction
1.1. Introduction
1.2. Objectives
1.2.1. Sub-objectives
1.3. Research questions
2. Study Area
2.1. Exploratory Data Analysis and Selection of the Study Area
2.2. Geology of the Study Area
2.3. Data Description
2.3.1. Depth of hypocentre
2.3.2. Orientation of earthquake epicentres
2.3.2.1. With respect to previous earthquake
2.3.2.2. With respect to the major earthquake
2.3.2.3. With respect to plate boundary
3. Spatial Point Patterns
3.1. Introduction
3.2. Spatial point process
3.3. Conditional Intensity
3.4. Poisson Point Process
3.5. Nearest-Neighbor G-Function
3.6. Gibbs Models
3.7. Strauss Point Process
3.8. Multi-type Marked Point Pattern and Strauss Model
3.9. Model Selection
4. Methodology
5. Results
6. Discussion
7. Conclusions
References

List of figures

Figure 2.1: Earthquakes since 1973
Figure 2.2: Annual distribution of Earthquakes
Figure 2.3: Aftershocks of Kashmir earthquakes within one month
Figure 2.4: Study Area and aftershocks Kashmir earthquake (M>= 4) within one month
Figure 2.5: Geo-referenced Tectonic map of Pakistan
Figure 2.6: Geo-referenced geological map of the northern Pakistan
Figure 2.7: Plate boundary location
Figure 2.8: Study area, locations of plate boundary and tectonic faults along with epicentres
Figure 2.9: Pixel image showing distances from plate boundary
Figure 2.10: Subset of pixel image for the study area
Figure 2.11: Pixel image of the study area showing distances from the Geological faults
Figure 2.12: Depths of hypocenters
Figure 2.13: Angle of occurrence for each earthquake w.r.t. previous
Figure 2.14: Angle of each earthquake epicentre w.r.t. the previous
Figure 2.15: Circular data plot and Rose diagram
Figure 2.16: Circular data plot and Rose diagram
Figure 3.1: Simulations of Strauss model
Figure 4.1: Methodology adopted for analysis and model fitting of earthquake data
Figure 5.1: Distribution of small and large earthquakes
Figure 5.2: Intensity of small and large earthquakes
Figure 5.3: Nearest neighbour G function for pairs of small earthquakes
Figure 5.4: Nearest neighbour G function for pairs of small and large earthquakes
Figure 5.5: Nearest neighbour G function for pairs of large and small earthquakes
Figure 5.6: Nearest neighbour G function for pairs of large earthquakes
Figure 5.7: Nearest neighbour G functions for pairs of points of all types
Figure 5.8: Trends fitted by the model to small and large earthquakes
Figure 5.9: A simulated realization of the fitted model
Figure 5.10: Simulated envelopes for the fitted model
Figure 5.11: Trends fitted by the model for small and large earthquakes
Figure 5.12: Trends fitted by the model to small and large earthquakes
Figure 5.13: Trends fitted by the model to small and large earthquakes

List of tables

Table 2.1: Month wise Distribution of Kashmir Earthquake aftershocks of M>= 4

IMPLEMENTATION OF STRAUSS POINT PROCESS MODEL TO EARTHQUAKE DATA

1. Introduction

1.1. Introduction

Point pattern statistics has proved useful in a variety of scientific fields, from epidemiology (Gatrell et al., 1996; Siqueira et al., 2004) to ecology (Mateu et al., 1998; Barot et al., 1999; Perry et al., 2006; Turner, 2007), forestry (Stoyan and Penttinen, 2000), seismology (Vere-Jones and Li, 1997; Holden et al., 2003), economics (Mateu et al.), environmental science (Walter et al., 2005) and astrophysics (Kerscher et al., 1999). Stein and Georgiadis (2008) is a very good example of a combination of geographic information tools and spatial statistics.

In daily life we come across many examples of spatial point patterns, such as the locations of trees in a forest, nests in a breeding colony of birds, positions of human settlements, nuclei in a microscopic section of tissue, cases of a disease in a geographical region, or a collection of forest fire locations. Analysis of a point pattern is essentially a first step towards understanding the mechanism responsible for its formation. For example, by inspecting the spatial distribution of disease cases in a region we might get useful hints about the factors responsible for the spread of the disease (Besag and Newell, 1991; Selvin and Merrill, 2002), as Dr John Snow did in the wake of the cholera outbreak in London in the mid 1850s. He plotted the distribution of deaths in London on a map, which revealed an unusually high number of deaths near a water pump on Broad Street. After his findings the pump's handle was removed and the number of cholera deaths was dramatically reduced. Although Dr Snow plotted bars instead of points to represent the number of deaths at the specified households, his work can be considered one of the earliest works on point pattern analysis in epidemiology.

Similarly, analysis of the human population distribution in a large country can indicate the economically active zones of the region, as population distribution pattern and economy are closely interrelated (Albert et al., 2002). Point pattern analysis of earthquake data can also be very helpful in understanding the distinctive geological and geophysical characteristics of the seismic locations, which in turn can give insight into the whole mechanism responsible for the earthquake distribution pattern (Zheng and Vere-Jones, 1991).

Usually a point pattern analysis, apart from examining other characteristics of the pattern, consists of determining possible dependence or interaction between the points. Finding the dependence and suggesting a suitable model for the data that could explain it is necessarily a subtle task, since for point patterns “theoretical models are fundamentally intractable from an analysis point of view” (Albert et al., 2002; Turner, 2007). Investigating the effects of covariates is an essential part of point pattern analysis and requires careful consideration of the techniques for realistic model fitting. In the absence of dependence among points the task is much simplified by Poisson point process models, in which covariate effects are investigated using likelihood ratio tests. If there is dependence or interaction among the points, however, the likelihood ratio technique fails to fit a realistic model to the data. In that case the Poisson process model, also termed the model of complete spatial randomness, serves as a reference model or as a null hypothesis when testing the point pattern data for the presence of interaction. As an alternative to the Poisson point process model, a model that can best describe the interaction between the points in the pattern should be considered.
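As a hedged illustration of the likelihood ratio approach for Poisson models mentioned above, the standard formulation can be written out as follows. The notation is assumed here for illustration and is not taken from the thesis: $W$ is the observation window, $x_1, \dots, x_n$ are the observed points, $Z(u)$ is a single spatial covariate and $\lambda_\theta(u)$ is a log-linear intensity.

```latex
% Log-likelihood of a Poisson point process with intensity \lambda_\theta
\log L(\theta) = \sum_{i=1}^{n} \log \lambda_\theta(x_i) - \int_W \lambda_\theta(u)\,\mathrm{d}u,
\qquad
\lambda_\theta(u) = \exp\{\theta_0 + \theta_1 Z(u)\}

% Likelihood ratio statistic for H_0: \theta_1 = 0 (no covariate effect),
% compared with a \chi^2_1 distribution under H_0
\Lambda = 2\,\bigl\{\log L(\hat\theta_0, \hat\theta_1) - \log L(\tilde\theta_0, 0)\bigr\}
```

Here $(\hat\theta_0, \hat\theta_1)$ maximises the full model and $\tilde\theta_0$ maximises the model without the covariate. The chi-squared reference distribution relies on the Poisson (independence) assumption, which is why the test is not appropriate when the points interact.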

A general class of models known as Markov point processes deals with modelling inter-point interactions. Pairwise interaction models are a special case of Markov point processes, introduced by Ripley (1977). The Strauss point process (Strauss, 1975; Kelly and Ripley, 1976) is an example of a pairwise interaction model; it was originally proposed as a model of clustering but later turned out to represent only repulsive behaviour in a point pattern. An extension known as the multi-type Strauss model is, however, available, which can take clustering effects of a point process into account when modelling a point pattern.

If the distribution of points in a point pattern is considered to be affected by a number of environmental, geological and geophysical factors, incorporating all available additional information about the process can help in accurate analysis and improved modelling of the underlying point process. Modelling and extracting information from a point pattern that has multiple types of marks and is also suspected of being influenced by covariate effects, however, remains a challenge in point process statistics (Illian, 2008).

Earthquake epicentre locations serve as a good example of a spatial point pattern, since the epicentre locations can be regarded as two-dimensional points (X, Y) in geographical space. If the earthquake data consist of the distribution of hypocentres, the data can be considered a spatial point pattern in three dimensions, as a hypocentre is the point beneath the earth’s surface, directly below the epicentre, where the rupture of an earthquake begins. As the magnitude of each earthquake is available along with its epicentre location, the earthquake data can further be represented as a marked point pattern.
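To make the pairwise interaction structure of the Strauss process mentioned above more concrete, the following minimal Python sketch evaluates the unnormalised Strauss density, which is proportional to beta^n * gamma^s, where n is the number of points and s is the number of point pairs closer than an interaction radius r. The function name, the toy pattern and the parameter values are hypothetical and chosen only for illustration; the thesis itself fits the model with spatstat. With gamma = 1 the expression reduces to a Poisson process, gamma < 1 penalises close pairs (inhibition), and gamma = 0 gives a hard-core process.

```python
import math
from itertools import combinations


def strauss_unnormalised_density(points, beta, gamma, r):
    """Return (beta**n * gamma**s, s) for a planar point pattern.

    n is the number of points and s the number of unordered pairs whose
    distance is below the interaction radius r (strict inequality is an
    assumption of this sketch).
    """
    n = len(points)
    s = sum(1 for p, q in combinations(points, 2) if math.dist(p, q) < r)
    return beta ** n * gamma ** s, s


# Hypothetical toy pattern: only the first two points are closer than r = 1.
pattern = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0)]
density, close_pairs = strauss_unnormalised_density(
    pattern, beta=2.0, gamma=0.5, r=1.0
)
print(close_pairs, density)
```

The normalising constant is deliberately left out, since it is intractable in general; this is one reason why such models are usually fitted by pseudolikelihood rather than by full maximum likelihood.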
Earthquake locations are found in clusters on the globe, and their distribution is determined by many observed and unobserved geological and geophysical characteristics of the epicentre region. Modelling data consisting of earthquake epicentre locations requires a model that can best represent the clustering behaviour of the earthquakes by taking into account the effects of external factors. The Strauss point process model can be applied to the earthquake data to investigate its potential to model clustered data.

Concepts of circular statistics have been used by researchers to analyze different data dealing with orientations or cyclic patterns (Brunsdon and Corcoran, 2006; Willson, 2005; Aguilar, 2002). Directions of epicentre locations with respect to some prominent geological features of the study area can constitute additional information about the seismic pattern of the region. In this case circular statistics may play a role as a tool to analyze the earthquake data for significant trends in the directions of occurrence of the earthquakes and their aftershocks. Circular data are data with two- or three-dimensional orientations, measured in the form of angles. The key point distinguishing circular data from linear data is their cyclic nature. Because of this property, the statistical analysis of circular data differs from that of linear data in that the calculation of even simple statistics, such as the arithmetic mean, requires a different approach.

Modelling earthquake data has long been a focus of research by seismologists and statisticians (Schoenberg, 2003; Vere-Jones and Li, 1997; Zhuang, 2000), in an attempt to predict the behaviour of earthquakes and the occurrence of their aftershocks so as to safeguard humanity against the vast destruction and panic caused by earthquakes.
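The remark above that even the arithmetic mean needs a different approach for circular data can be illustrated with a small Python sketch. It averages the unit vectors of the angles instead of the angles themselves, which is the usual definition of the circular mean direction. The function name and the two example bearings are hypothetical and are not taken from the earthquake data.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction in degrees, in [0, 360), via summed unit vectors."""
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# Two hypothetical bearings on either side of north.
bearings = [350.0, 20.0]

arithmetic_mean = sum(bearings) / len(bearings)  # points roughly south
circular_mean = circular_mean_deg(bearings)      # points roughly north

print(arithmetic_mean, round(circular_mean, 6))
```

The arithmetic mean of these two bearings is 185 degrees, pointing almost opposite to both observations, whereas the circular mean is about 5 degrees, which lies between them as expected.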
An earthquake of magnitude 7.6 struck the northern part of Pakistan on October 8, 2005, followed by a series of aftershocks with magnitudes ranging between 2.9 and 6.2. The area has a history of intense seismic activity, with large destructive earthquakes that have caused great devastation and misery in the region.

There is a need to analyze the earthquake data in order to better understand the seismicity of the area. The current analysis aims at finding and analyzing the prominent features of the available data set, treating it both as a marked point pattern and as directional data, and then applying the Strauss point process model to represent the earthquake distribution in geographical space.

1.2. Objectives

The main objective of the study is to investigate the potential of the Strauss point process model for modelling the spatial distribution of earthquake epicentre locations.

1.2.1. Sub-objectives

1. To make an exploratory analysis of the earthquake data, considering it simultaneously as a point pattern and as directional data.

2. To characterize the earthquake data utilizing the techniques presented by point pattern theory.

3. To apply the Strauss point process model to earthquake data.

4. To extend the Strauss point process modelling of the earthquake epicentre locations data with consideration of available geological and geographical information.

5. To determine how specific spatial covariates contribute to the distribution of aftershocks of a major earthquake.

1.3. Research questions

1. Is point pattern theory adequately applicable to earthquake epicentre location data?

2. How can the earthquake data be treated as directional data? Do the orientations of the earthquake epicentres contain useful information that can help in modelling the pattern?

3. Do the earthquake data serve as a good example of a Strauss process?

4. How can the Strauss point process model be fitted to earthquake data?

5. How can the effects of covariates on the earthquake point pattern be measured using the Strauss point process model?

6. Does inclusion of covariate data improve the Strauss point process modelling?

2. Study Area

2.1. Exploratory Data Analysis and Selection of the Study Area

The point pattern chosen for the analysis and model fitting is the earthquake data of the northern part of Pakistan. This region is considered an active seismic zone. Data were taken from the USGS website (http://earthquake.usgs.gov/). They comprise the 1403 earthquakes that occurred in the region between January 1973 and August 2008, shown in figure 2.1.

Figure 2.1: Earthquakes since 1973

The temporal distribution of the data in figure 2.2 shows that the year 2005 is marked by much greater seismic activity in the region than the previous years. This is due to a major shock, the Kashmir earthquake of magnitude 7.6, which struck the region on Oct 8, 2005 and was followed by a range of aftershocks, causing great devastation and misery by killing more than 80000 people and damaging the whole infrastructure of the region. This event clearly changed the whole seismic history of the area.

Figure 2.2: Annual distribution of Earthquakes

Further exploratory analysis of the basic features of the data revealed that there were 22 earthquakes of magnitude ≥ 5.5, of which 15 occurred within 15 days after the Kashmir earthquake, 12 of them on the same day as the major earthquake. Only 7 other earthquakes of magnitude ≥ 5.5 occurred during the past 35 years. This contrast between the seismic year 2005 and the past 35 years convinced us to focus our attention on the analysis of the Kashmir earthquake and its aftershocks.

Table 2.1: Month wise Distribution of Kashmir Earthquake aftershocks of M>= 4

Month            Frequency
8 Oct - 7 Nov    292
8 Nov - 7 Dec    16
8 Dec - 7 Jan    17
8 Jan - 7 Feb    7
8 Feb - 7 Mar    7
8 Mar - 7 Apr    10
8 Apr - 7 May    5
8 May - 7 Jun    2
8 Jun - 7 Jul    3
8 Jul - 7 Aug    3
8 Aug - 7 Sep    0
8 Sep - 7 Oct    3
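As a quick arithmetic check on the counts in Table 2.1, the short Python sketch below sums the monthly frequencies and computes the share of events that fall in the first month after the main shock. The variable names are illustrative; the counts are copied from the table.

```python
# Monthly counts of M >= 4 events from Table 2.1 (8 Oct 2005 onwards).
monthly_counts = [292, 16, 17, 7, 7, 10, 5, 2, 3, 3, 0, 3]

total = sum(monthly_counts)
first_month_share = monthly_counts[0] / total
later_months_max = max(monthly_counts[1:])

print(total, round(100 * first_month_share, 1), later_months_max)
```

About four fifths of the events recorded in the twelve months fall in the first month, and no later month has more than 17 events.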

Table 2.1 shows the month-wise seismic record of the area for magnitude ≥ 4 since Oct 8, 2005 (smaller magnitudes were neglected, as such earthquakes can hardly be felt or cause damage). The table shows that the seismicity of the area decreases sharply after the first month and that the number of events in the subsequent months is almost negligible compared with the first month after the main shock. We thus narrowed the analysis down to the earthquake data of the first month after the main shock of the Kashmir earthquake. Figure 2.3 shows the epicentre locations of the Kashmir earthquake and its aftershocks (M ≥ 4) within one month of the main shock.

Figure 2.3: Aftershocks of Kashmir earthquakes within one month

Figure 2.3 shows that all earthquakes within one month after the Kashmir earthquake are likely to be aftershocks of the major event, because their locations are in the close vicinity of the Kashmir earthquake. Only four earthquake locations lie more than 50 km from the aftershock region and hence do not seem to be part of the seismicity resulting from the Kashmir earthquake. These earthquakes were investigated further and found to have magnitudes 4, 4.1, 4.5 and 4.8 respectively, all less than 5.5, which is taken as the lower threshold defining a large earthquake later in the study. It therefore seems reasonable to ignore these data points in the study of aftershocks and to limit the study area to the part where the aftershocks of the Kashmir earthquake occurred within one month. The resulting study area and the positions of the 288 earthquakes are shown in figure 2.4. The geographical bounding coordinates of the study area are (274963, 3849631), (348760, 3722997), (445264, 3779328), (371467, 3906398) in meters (the coordinate system used throughout the present study is WGS-84 with projection system UTM zone 42 North). Further analysis of the data as a point pattern is based on this study region and the earthquakes located within it.
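To give a feel for the extent of the study region defined by the four bounding coordinates above, the following Python sketch computes the area of the quadrilateral with the shoelace formula. It assumes that the corners are listed in order around the boundary, as they are in the text; the function name is hypothetical.

```python
def polygon_area(vertices):
    """Area of a simple polygon (shoelace formula), in squared input units."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Bounding coordinates of the study area in metres (UTM), as given in the text.
corners = [
    (274963, 3849631),
    (348760, 3722997),
    (445264, 3779328),
    (371467, 3906398),
]

area_km2 = polygon_area(corners) / 1e6  # convert m^2 to km^2
print(round(area_km2))
```

The result is on the order of 16,000 square kilometres, which is useful background when judging what pixel size is practical for covariate images of the region.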

Figure 2.4: Study Area and aftershocks Kashmir earthquake (M>= 4) within one month

2.2. Geology of the Study Area

Geological features are often direct manifestations of the subsurface forces of the dynamic earth; they are therefore important for understanding a variety of problems related to physical processes happening deep below the earth’s surface. Earthquakes are associated with tectonic activity beneath the earth’s surface, and several geological features are therefore associated with a high probability of earthquake occurrence (Yaseen, 2009).

“The epicentre region lies on the western edge of the Himalayan Arc, which denotes the area of continental convergence between the Indian and Eurasian tectonic plates. The Indian plate moves northwards at a rate of about 40mm/year and subducts below the Eurasian plate. The resulting compression and uplift at this plate boundary over millions of years has resulted in the formation of the Tibetan Plateau (average elevation of 4600m above sea level), the Himalayan mountain ranges (which have peaks reaching up to 8,854 m above sea level), as well as the Karakoram, Pamir and Hindu Kush. Compression motion between the two tectonic plates results in a series of large thrust (or reverse) faults in the foothills of these mountains. These include the Main Karakorum thrust (MKT), the Main boundary thrust (MBT), and the Main Mantle thrust (MMT), which tend to dip NE below the mountain ranges. The structure on which the main shock occurred is called the Hazara Syntaxis” (Arup et al., 2005).

The MBT is the main tectonic fault, which serves as the boundary between the colliding continental crusts of the Eurasian plate on the northern side and the Indian plate on the southern side. The Kashmir earthquake (October 8, 2005) is associated with fault rupture near the western end of the MBT in the Kashmir region of Northern Pakistan. The location of the tectonic plate boundaries plays a significant role in determining the seismicity of the study area, as it makes the region “the planet’s most active earthquake hotspot; as the plates collide, stress builds up in the fault zones where the plates meet”, and ultimately a sudden and rapid release of seismic stress causes large earthquakes (Naranjo, 2008).

Earthquakes occur along faults. A fault is the area ruptured by an earthquake, and it may or may not show up on the surface of the earth. To assess the influence of geological faults located in the study region on the earthquake distribution pattern, the distance of earthquake locations to faults could serve as additional information (covariates) in modelling the point pattern. For that purpose a published map of active geological faults was obtained from the Geological Survey of Pakistan; it was, however, not geo-referenced. A geo-referenced tectonic map of Pakistan (figure 2.5) was used to geo-reference the geological faults map (figure 2.6). This was done in ERDAS Imagine software by carefully selecting 10 Ground Control Points (GCPs) from both images, keeping the RMSE as low as possible. Using the selected GCPs the raw image was re-sampled to obtain the geo-referenced geological map.
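The RMSE used above as the quality criterion for the Ground Control Points can be sketched in Python as follows. This uses one common definition, the square root of the mean squared distance between the transformed GCP positions and their reference positions; the exact formula applied by ERDAS Imagine is not stated in the text, and the function name and coordinates below are hypothetical.

```python
import math


def gcp_rmse(predicted, reference):
    """Root mean square of the distances between matched point pairs."""
    squared = [
        (px - rx) ** 2 + (py - ry) ** 2
        for (px, py), (rx, ry) in zip(predicted, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical GCPs: reference map coordinates and the positions predicted
# by the fitted transformation (only the first point is off, by 5 units).
reference = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
predicted = [(3.0, 4.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]

rmse = gcp_rmse(predicted, reference)
print(rmse)
```

A single poorly placed control point dominates the statistic, which is why GCPs are usually re-selected or refined until the RMSE stops improving.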

Figure 2.5: Geo-referenced Tectonic map of Pakistan

Figure 2.6: Geo-referenced geological map of the northern Pakistan

From the geological map (figure 2.6), the study area of the earthquake data was extracted using its bounding coordinates, and the faults within the study area were digitized using ArcGIS software. To determine the location of the plate boundaries, the map in figure 2.7 was downloaded from the USGS website (http://www.usgs.gov) and was also geo-referenced using ERDAS Imagine software.

Figure 2.7: Plate boundary location

The location of the plate boundary was also digitized in ArcGIS. Figure 2.8 shows the digitized locations of the active faults and the tectonic plate boundary along with the locations of the earthquake point pattern.

Figure 2.8: Study area, locations of plate boundary and tectonic faults along with epicentres

From figure 2.8 we see a clear indication of the spread of earthquake aftershocks along the plate boundaries, with a dense cluster of aftershocks near the point where the two boundaries converge. Thus the location of the plate boundaries may be an important factor contributing to the distribution pattern of the earthquakes. To evaluate this contribution, a pixel image giving the shortest distance of each pixel from the plate boundary was obtained using ILWIS software. The pixel size (resolution) of the image was set to 20 m; this was the finest resolution that could be computed, since a pixel size below 20 m required a large computation time and storage space. Given the extent of the study area, this pixel size is reasonable and a higher resolution was not necessary. The distance image was imported into the spatstat package (Baddeley and Turner, 2005) and is shown in figure 2.9.
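The distance image described above was produced in ILWIS; the underlying computation can be illustrated with a minimal Python sketch. It assumes the digitized boundary is available as a polyline of vertices in metres, and the function names, grid layout and example coordinates are hypothetical, not taken from the thesis.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        # Degenerate segment: both end points coincide.
        return math.hypot(px - ax, py - ay)
    # Position of the orthogonal projection along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_image(polyline, xmin, ymin, ncols, nrows, pixel):
    """Grid of distances from each pixel centre to the nearest polyline segment."""
    segments = list(zip(polyline, polyline[1:]))
    image = []
    for row in range(nrows):
        y = ymin + (row + 0.5) * pixel
        image.append([
            min(point_segment_distance((xmin + (col + 0.5) * pixel, y), a, b)
                for a, b in segments)
            for col in range(ncols)
        ])
    return image

# Hypothetical boundary (metres) and a tiny grid with 20 m pixels.
boundary = [(0.0, 0.0), (100.0, 0.0)]
image = distance_image(boundary, 0.0, 0.0, ncols=5, nrows=3, pixel=20.0)
```

The brute-force loop over all segments is only practical for small grids; it is meant to show what each pixel value represents, not how a GIS package computes it.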

Figure 2.9: Pixel image showing distances from plate boundary

A subset (figure 2.10) of the above image was obtained using the boundary coordinates of the study region in R.

Figure 2.10: Subset of pixel image for the study area

Similarly, to test the effect of the active faults in the study area on the earthquake point pattern, the distance of each earthquake location from the nearest fault was calculated and is shown as a pixel image in figure 2.11.

Figure 2.11: Pixel image of the study area showing distances from the Geological faults

2.3. Data Description

The data within the study area were re-examined to discover their main characteristics, which could be helpful for further study and modelling. For this purpose, the variables recorded for each earthquake along with the coordinates of its epicentre were investigated.

2.3.1. Depth of hypocentre

An earthquake hypocenter is the three-dimensional point in the earth where the rupture of an earthquake begins. For large earthquakes, the rupture may extend for several kilometres, and the hypocenter may be anywhere along the rupture. The epicentre of an earthquake is the point on the surface of the globe that represents the projection of the hypocenter onto that surface. To assess the overall distribution of the earthquake hypocentres below the surface, a graph of hypocentre depths (figure 2.12) was plotted. It shows that all but 9 of the earthquakes occurred at a depth of 10 km below the earth’s surface; hence the depth measurement is constant for almost all the earthquakes.

[Plot: depth of hypocenters]

Figure 2.12: Depths of hypocenters

2.3.2. Orientation of earthquake epicentres

2.3.2.1. With respect to previous earthquake

To detect the direction of a possible spatial path taken by the earthquakes in order of their occurrence time, the angle of the location of each earthquake with respect to the previously occurring earthquake was calculated and plotted in figure 2.13.

[Plot: orientation of each aftershock w.r.t. the previous aftershock; x-axis: aftershock number (1 to 281); y-axis: angle (-180 to 180 degrees)]

Figure 2.13: Angle of occurrence for each earthquake w.r.t. previous
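The angle of each event relative to the preceding one can be sketched as follows. This is a minimal Python illustration (the thesis reports using MATLAB functions for this step); it assumes planar coordinates in metres, events ordered by occurrence time, and angles measured counter-clockwise from the positive x-axis. The function name and sample coordinates are hypothetical.

```python
import math

def angles_to_previous(points):
    """Angle in degrees (between -180 and 180) of each event seen from the previous one."""
    angles = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # atan2 takes the y-difference first and handles all four quadrants.
        angles.append(math.degrees(math.atan2(y1 - y0, x1 - x0)))
    return angles

# Hypothetical epicentre coordinates (metres), sorted by occurrence time.
epicentres = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 1000.0), (0.0, 0.0)]
angles = angles_to_previous(epicentres)
```

A sequence of n events yields n - 1 angles, since the first event has no predecessor.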

Figure 2.13 shows that the pattern of angles looks quite random, and hence no specific direction could be detected with respect to the time of occurrence of the aftershocks. To further explore the major characteristics of the orientation data of the aftershocks, tools and techniques from circular statistics were utilized. For this purpose, the orientation angles of all the aftershock locations with respect to the epicentre of the previous aftershock were considered first. To get a clearer indication of the main trend of the angles calculated for each earthquake with respect to the previous earthquake, a Rose diagram was plotted (figure 2.14). The Rose diagram, a circular histogram, is the most commonly used method for displaying circular data: the circumference of a circle is split into groups and each group is represented as a sector of the circle in such a way that the area of a sector is proportional to the number of observations (frequency) in that group. The Rose diagram is an effective tool for displaying circular data in graphical form to get an initial idea of their basic characteristics. Visual inspection of the Rose diagram can indicate whether the data distribution is uniform/isotropic, unimodal, or bimodal/multimodal, and can suggest suitable models, such as the von Mises model, for the data. For the mathematical details of circular distributions, e.g. the uniform and von Mises distributions, see Fisher (1995).

Figure 2.14: Angle of each earthquake epicentre w.r.t. the previous

The above plotted Rose diagram for the orientation data does not show any specific trend taken by the aftershocks.

Summary Characteristics

The two basic summary characteristics for circular data are their mean direction and mean resultant length. The mean resultant length is a measure of dispersion for circular data. For calculating the mean resultant length, each observation is treated as a unit vector, or a point on the unit circle. The resultant vector of the observations is found and the length of the resultant vector is divided by the sample size. The value of the mean resultant length thus ranges from 0 to 1. A value close to 0 indicates that the observations are widely distributed in all directions, whereas a value close to 1 indicates that the points are clustered close together. For the circular data of the earthquakes plotted above, the mean direction and the mean resultant length were calculated as follows:

Mean direction = $\tan^{-1}\left( \dfrac{\sum \sin\theta}{\sum \cos\theta} \right) = -1.34\ \text{rad} = 283^{\circ}$

Mean resultant length = 0.02

The mean resultant length is very close to 0, indicating that the directions taken by each earthquake w.r.t. the previous earthquake are distributed randomly in all directions; hence no significant trend in space could be found for the aftershocks when they are considered in order of occurrence, with angles measured with respect to the previous aftershock. To test the data for a trend against the null hypothesis of a uniform distribution, Watson’s test of uniformity was applied, which did not reject the null hypothesis with P-value < 0.025. Hence we concluded that the earthquake data tend to follow the circular uniform distribution.
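The two summary characteristics described in this section can be computed as in the following minimal Python sketch (the thesis itself used the CircStats package in R). Each angle is treated as a unit vector, the vectors are summed, and the direction and scaled length of the resultant are returned. The function name and the sample angles are hypothetical.

```python
import math

def circular_summary(angles_deg):
    """Return (mean direction in degrees within [0, 360), mean resultant length)."""
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    # atan2 resolves the quadrant that a plain arctangent of s/c would lose.
    mean_direction = math.degrees(math.atan2(s, c)) % 360.0
    mean_resultant_length = math.hypot(c, s) / n
    return mean_direction, mean_resultant_length

# Tightly clustered angles give a length near 1 ...
clustered = circular_summary([80.0, 90.0, 100.0])
# ... while angles spread evenly around the circle give a length near 0.
spread = circular_summary([0.0, 90.0, 180.0, 270.0])
```

When the mean resultant length is near 0 the mean direction is numerically unstable and carries little meaning, which is why both quantities are reported together.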

2.3.2.2. With respect to the major earthquake

Next, the orientations of the earthquake epicentre locations with respect to the epicentre of the main shock of the Kashmir earthquake were determined. Figure 2.15 shows the data plotted on the circumference of a unit circle and the Rose diagram for the angles of the aftershock locations with respect to the main earthquake.

Figure 2.15: Circular data plot and Rose diagram

Figure 2.15 suggests that the data are not distributed uniformly along the circumference of the circle: it shows a high concentration between 130° and 190° (northwest of the epicentre of the main shock), and another cluster of points occurs between 260° and 310°. The diagram also shows that the data are most likely bimodal, and therefore the von Mises distribution (a symmetric unimodal distribution analogous to the normal distribution for linear data) may not be suitable for these data.

Summary Characteristics

For the above data, we find that

Mean direction = 2.96 rad = 170°

The mean direction shows that most of the earthquakes are spread around an angle of 170 degrees. The variance of the circular data for the earthquakes is calculated as

Mean resultant length = 0.65

The above calculated mean and variance values of the earthquakes indicate a clustering behaviour of the aftershocks in the northwest direction of the main shock of the Kashmir earthquake. To test the goodness of fit of the uniform distribution and the von Mises distribution, Watson’s tests were applied, which strongly rejected the null hypotheses with P-values < 0.01 in both cases.

2.3.2.3. With respect to plate boundary

From figure 2.8 we observe that most of the earthquakes are closely concentrated around the point where the two plate boundaries converge; hence this convergence point seems to be highly determinant for the earthquake point pattern. To examine the main characteristics of the earthquake data relative to this point, the angles of all earthquakes were calculated with respect to it and plotted as below:

Figure 2.16: Circular data plot and Rose diagram

The above Rose diagram also suggests that the data are multimodal and hence do not come from a uniform or von Mises distribution; Watson’s tests also provided evidence against these distributions. The mean direction and the mean resultant length for the data were calculated to be 270 degrees and 0.70 respectively. The mean direction and variance suggest a strong clustering of earthquakes to the south of the point of convergence of the two plate boundaries.

3. Spatial Point Patterns

3.1. Introduction

By spatial point pattern we mean “data in the form of a set of points, irregularly distributed within a region of space” (Diggle, 1979). By ‘region’ we usually refer to a two-dimensional space, although the idea is not confined to two dimensions. The pattern is presumed to be formed as a result of some form of stochastic mechanism (a spatial point process). It is the property of spatial randomness that distinguishes spatial point pattern data from geo-statistical and lattice data (Schabenberger and Pierce, 2002). Often the point pattern contains additional information attached to each point along with its coordinates. Such additional information is called the marks of the points, and the point pattern is termed a marked point pattern. Analysis of a marked point pattern provides a better and deeper understanding of the phenomena causing the specific pattern than analysis of an unmarked point pattern. When the marks associated with each point of the pattern can be classified as one type or another, the pattern is called a multi-type marked point pattern. Multi-type point pattern data pose a higher degree of complexity and hence are more difficult to analyze than non-labelled data (Illian, 2008). Two important concepts in point pattern theory are stationarity and isotropy, meaning that the point process is invariant under translation and under rotation, respectively.

3.2. Spatial point process

A spatial point process is a stochastic mechanism that generates a spatial point pattern which we may observe (Yang et al., 2007). A point process and a point pattern are two different concepts, in the sense that the point process is a stochastic model and the point pattern is a realization of the process (Isham, 1984) (Perry et al., 2006). Various researchers have contributed extensively to defining different types of point processes and to the theoretical understanding of these processes (Baddeley and Silverman, 1984) (Cox and Isham, 1980) (Diggle, 2003) (Hogmander and Sarkka, 1999) (Mateu and Gregori) (Mateu and Montenegro) (Moller and Waagepetersen, 2007) (Vere-Jones, 2006). Spatial point patterns are the result of a mixture of first order effects and second order effects (Yang et al., 2007) of a point process. First order and second order effects of a point pattern are analogous to the mean and variance of a regular probability distribution (Perry et al., 2006). Thus a first order effect refers to the average number of events per unit area of the point pattern, whereas the second order effects refer to the variability in the number of events per unit area over the whole point pattern. First order effects in a point pattern are described by the intensity, which can be homogeneous or may vary from location to location (inhomogeneous) in the study area. Mathematically the intensity is defined as:

\[ \lambda(u) = \lim_{du \to 0} \left\{ \frac{E\big(n(d(u))\big)}{du} \right\} \qquad (3.1) \]

where $d(u)$ is a small region around the point $u$, $n(d(u))$ is the number of points in the region $d(u)$, $E(n(d(u)))$ is the expected number of events in this small region and $du$ is the area of the small region $d(u)$.

Investigation of the intensity gives us a first insight into the distribution of the earthquake data of a point pattern. There are different parametric and non-parametric techniques used for the purpose of estimating inhomogeneous intensity (Baddeley, 2008). The second order effect of a point pattern involves the covariance (correlation) between number of events in pairs of sub-regions within R (Diggle, 2003). Mathematically the second order effect is defined as:

\[ \gamma(u_i, u_j) = \lim_{du_i,\, du_j \to 0} \left\{ \frac{E\big(n(d(u_i))\, n(d(u_j))\big)}{du_i\, du_j} \right\} \qquad (3.2) \]

with notation similar to that described above. The second order properties of a spatial point process describe variation in the relative frequency of pairs of earthquakes as a function of their positions; under the assumption of stationarity (homogeneous intensity), however, this function depends only on the relative positions of the two earthquakes, and under the further assumption of isotropy it reduces to a function of distance only (Mateu et al., 1998). Thus the second order properties specified by the pair correlation function (the distribution of inter-earthquake location distances) may be assessed using the classical summary statistics based on the inter-earthquake location distances, such as the K-, F-, G- and L-functions (see (Baddeley et al., 2008) (Diggle, 2003) (Moller and Waagepetersen, 2007) for a detailed description of these functions).

3.3. Conditional Intensity

Most point processes are defined in terms of their conditional intensity, also called the Papangelou conditional intensity $\lambda(u, \mathbf{x})$, which is defined as the conditional probability of the occurrence of an event of the process $\mathbf{x}$ at the point $u$, given the rest of the point process. It is related to the probability density $f$ by eq. 3.3:

\[ \lambda(u, \mathbf{x}) = \frac{f(\mathbf{x} \cup \{u\})}{f(\mathbf{x})}, \quad \text{for } u \notin \mathbf{x} \qquad (3.3) \]

In other words, the conditional intensity is the ratio of the probability densities for the point process $\mathbf{x}$ with and without the point $u$ added (Baddeley et al., 2008).

For pair-wise interaction processes, the conditional intensity is of the form

\[ \lambda(u, \mathbf{x}) = b(u) \prod_{i=1}^{n(\mathbf{x})} c(u, x_i) \qquad (3.4) \]

In practice the conditional intensity is often specified through a log-linear regression model with two components:

\[ \log \lambda_{\theta}(u, \mathbf{x}) = \theta_1 B(u) + \theta_2 C(u, \mathbf{x}) \qquad (3.5) \]

where $\theta = (\theta_1, \theta_2)$ are the parameters to be estimated. The term $B(u)$ depends only on the spatial location $u$ of the earthquake, so it represents spatial trend or spatial covariate effects. The interaction term $C(u, \mathbf{x})$ depends not only on the earthquake location $u$ but also on the configuration (spatial arrangement) of the rest of the process $\mathbf{x}$; hence it represents stochastic interactions between the points. The term $C(u, \mathbf{x})$ reduces to zero for the Poisson process.

3.4. Poisson Point Process

The homogeneous Poisson point process, also termed complete spatial randomness (CSR), is distinguished from other point processes by two important properties:

1. The number of events (points) in a pre-defined region W (with area |W|) follows the Poisson distribution with a mean of λ|W|.
2. The locations of all the points inside W are independently distributed according to the uniform distribution (Diggle, 2003).

The first property states that under CSR the intensity of events does not vary across the region, and the second implies that events do not exhibit any form of interaction, meaning that the occurrence of a point within the region does not in any way encourage or inhibit the occurrence of another event in its neighbourhood (Diggle, 2003). Thus “there is neither in-homogeneity of the underlying process nor interaction (attraction or repulsion) between the points, both of which lead to clustering or regularity in the realizations” (Isham, 1984).

The homogeneous Poisson process has constant conditional intensity $\lambda(u, \mathbf{x}) = \lambda$, whereas an inhomogeneous Poisson process has conditional intensity $\lambda(u, \mathbf{x}) = \lambda(u)$. Since the points in a Poisson process are independent, the conditional intensity depends only on the location of an event and not on the configuration of the rest of the point pattern.
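The two defining properties of CSR translate directly into a simulation recipe: draw a Poisson number of points with mean λ|W|, then place each one uniformly and independently in W. The following is a minimal Python sketch for a rectangular window; the function names are hypothetical, and the Poisson count is drawn with Knuth's multiplication method, which is only suitable for moderate means.

```python
import math
import random

def poisson_count(mean, rng):
    """Draw a Poisson variate by multiplying uniforms until the product drops below exp(-mean)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_csr(lam, width, height, rng):
    """Simulate a homogeneous Poisson process of intensity lam on [0, width] x [0, height]."""
    # Property 1: the number of points is Poisson with mean lam * |W|.
    n = poisson_count(lam * width * height, rng)
    # Property 2: given n, locations are independent and uniform in W.
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]

rng = random.Random(42)
pattern = simulate_csr(0.5, 10.0, 10.0, rng)
```

Repeated simulations of this kind are also what lies behind simulation envelopes for summary functions under the CSR hypothesis.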

3.5. Nearest-Neighbor G-Function

As discussed earlier, in order to investigate the second order effects several methods based on distance distributions have been described (Baddeley et al., 2008) (Diggle, 2003), which can be helpful in testing an observed pattern against the benchmark hypothesis of Complete Spatial Randomness (CSR). These functions indicate the nature of the departure from CSR and are thus useful in determining the kind of interaction and the interaction distances between points of a point pattern. The distance methods explore the pattern in the data by graphically comparing the observed nearest neighbour distribution function with that of some hypothesized model such as CSR. The nearest neighbour distance function, known as the G-function, is one instance of the distance functions; it examines the distance of each point from its nearest neighbouring point. The G-function is estimated as:

\[ \hat{G}(r) = \sum_i e(x_i, r)\, \mathbf{1}\{t_i \le r\} \qquad (3.6) \]

where $e(x_i, r)$ is an edge correction weight introduced so that $\hat{G}(r)$ is unbiased, $t_i = \min\{\lVert x_i - x_j \rVert : x_j \in \mathbf{x},\ j \ne i\}$ is the distance from each earthquake location to its nearest neighbour and $r$ is the radius of the disk centred at $x_i$.

For a homogeneous Poisson point process of intensity $\lambda$, the nearest-neighbour distance distribution function is given as:

\[ G_{pois}(r) = 1 - \exp(-\lambda \pi r^2) \qquad (3.7) \]

Values of $\hat{G}(r) > G_{pois}(r)$ mean that the observed nearest neighbour distances are shorter than those expected from a completely random point pattern, which indicates clustering in the pattern of earthquake locations, whereas $\hat{G}(r) < G_{pois}(r)$ indicates inhibition among earthquakes, since the observed distances are larger than those expected from a Poisson process. There are extensions of the distance functions to multi-type point patterns, which are used to detect the interactions between points of different types.
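The comparison between the empirical and the Poisson G-function can be illustrated with a minimal Python sketch. It assumes the simplest possible choice of edge weights, $e(x_i, r) = 1/n$ (that is, no edge correction), which is not what spatstat does by default; the function names and the sample points are hypothetical.

```python
import math

def nearest_neighbour_distances(points):
    """Distance t_i from each point to its nearest other point."""
    distances = []
    for i, (xi, yi) in enumerate(points):
        distances.append(min(math.hypot(xi - xj, yi - yj)
                             for j, (xj, yj) in enumerate(points) if j != i))
    return distances

def g_hat(points, r):
    """Uncorrected empirical G-function: fraction of points with t_i <= r."""
    t = nearest_neighbour_distances(points)
    return sum(1 for ti in t if ti <= r) / len(t)

def g_pois(r, lam):
    """Theoretical G-function of a homogeneous Poisson process with intensity lam."""
    return 1.0 - math.exp(-lam * math.pi * r * r)

# Hypothetical pattern: two close points and one isolated point.
pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
observed = g_hat(pts, 1.0)
```

Plotting `g_hat` against `g_pois` over a range of r, with λ estimated as the number of points divided by the window area, gives the graphical comparison described above; an empirical curve lying above the Poisson curve points towards clustering.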

3.6. Gibbs Models

If the earthquakes tend to show any kind of interaction, they can be thought of as having a Markov property, and the earthquake data can be considered as a realization of a Markov point process or, more generally, a Gibbs point process, which assumes symmetric interactions. The Markov property states that the conditional rate of the process at a spatial location $u$, given the complete realization of the process, depends only on those locations in the realization which are neighbours of $u$ (Isham, 1984) (Moller and Waagepetersen, 2007). Thus Markov point processes are characterized by having a finite number of neighbouring points in a bounded region with symmetric interactions. Points are neighbours if they are within some critical distance of each other. Since the analysis of interacting earthquake location data is aimed at modelling not only the spatial locations of the earthquakes but also the information attached to each location as marks, marked Gibbs point processes are a suitable choice. Furthermore, when the earthquake data consist of several types of interacting earthquakes, multi-type Gibbs models are the extension of the marked Gibbs model. Pair-wise interaction models are a special case of Markov point processes introduced by Ripley (1977). In fact, most Gibbs models are pair-wise interaction models. The multi-type Gibbs models with pair-wise interactions have probability densities of the form:

\[ f(\mathbf{x}) = \alpha \prod_{i=1}^{n(\mathbf{x})} b_{m_i}(x_i) \prod_{i<j} c_{m_i, m_j}(x_i, x_j) \qquad (3.8) \]

where $\alpha$ is a normalizing constant; $b_m(u)$, $u \in W$, is the first order term for earthquakes of each type; and $c_{m_i, m_j}(u, v)$, $u, v \in W$, is the second order or pair-wise interaction term for the earthquakes of types $m_i$ and $m_j$.

The major problem with Gibbs models is that the density contains an unknown normalizing constant α which is intractable, and hence the parameters cannot be estimated using an optimization technique such as the maximum likelihood method; an alternative estimation strategy known as maximum pseudo-likelihood estimation has been developed, which does not involve the unknown constant and hence makes it easier to obtain estimates of the parameters.
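To illustrate why the pseudo-likelihood avoids the normalizing constant, the following Python sketch evaluates a log pseudo-likelihood that involves only the conditional intensity: the sum of log conditional intensities at the data points minus the integral of the conditional intensity over the window. As assumptions for illustration, it uses a conditional intensity of the form β γ^t, where t counts the points within a distance r (the Strauss form), a rectangular window, and a simple midpoint-grid approximation of the integral; the function names and the quadrature scheme are hypothetical and do not reproduce the scheme used in spatstat.

```python
import math

def t_count(u, points, r):
    """Number of points within distance r of location u."""
    return sum(1 for p in points if math.hypot(u[0] - p[0], u[1] - p[1]) <= r)

def log_pseudolikelihood(points, beta, gamma, r, width, height, ngrid=50):
    """Approximate log pseudo-likelihood for beta > 0 and gamma > 0."""
    # Data term: log conditional intensity at each point given all the others.
    data_term = 0.0
    for i, p in enumerate(points):
        others = points[:i] + points[i + 1:]
        data_term += math.log(beta) + t_count(p, others, r) * math.log(gamma)
    # Integral term: midpoint rule over an ngrid x ngrid grid of cells.
    cell_w, cell_h = width / ngrid, height / ngrid
    integral = 0.0
    for a in range(ngrid):
        for b in range(ngrid):
            u = ((a + 0.5) * cell_w, (b + 0.5) * cell_h)
            integral += beta * gamma ** t_count(u, points, r) * cell_w * cell_h
    return data_term - integral

# With gamma = 1 the expression reduces to the Poisson log-likelihood n*log(beta) - beta*|W|.
pts = [(2.0, 2.0), (5.0, 5.0), (8.0, 8.0)]
value = log_pseudolikelihood(pts, beta=2.0, gamma=1.0, r=1.0, width=10.0, height=10.0, ngrid=20)
```

Maximizing this quantity over β and γ for a fixed r would give pseudo-likelihood estimates; nothing in the expression requires α.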

3.7. Strauss Point Process

The Strauss point process (Strauss, 1975) (Kelly and Ripley, 1976) is an example of the pair-wise interaction models in which each single point contributes the same interaction function to the density (3.8) irrespective of its position, and each pair of distinct points which are not more than $r$ apart, and are thus defined to be neighbours, contributes a constant interaction $\gamma$. Based on the above definition, the Strauss process is defined by the density given as:

\[ f(\mathbf{x}) = \alpha\, \beta^{n(\mathbf{x})}\, \gamma^{s(\mathbf{x})} \qquad (3.9) \]

where $\alpha$ is the normalizing constant, $\beta$ is the intensity of the process, $n(\mathbf{x})$ is the number of points in $\mathbf{x}$ and $s(\mathbf{x})$ is the number of pairs of distinct points in $\mathbf{x}$ which are not more than $r$ units apart. The parameter $\gamma$ controls the strength of interaction between points. If $\gamma = 0$ the model is a hard core process. For $0 < \gamma < 1$, the process exhibits inhibition (repulsion) between points. If $\gamma = 1$ the model reduces to a Poisson process with intensity $\beta$. For $\gamma > 1$, the density is not integrable. Originally the Strauss model was proposed as a model for clustering with $\gamma > 1$, but later it turned out to be a model for inhibition and is defined only for $0 \le \gamma \le 1$ (Turner, 2007).

The conditional intensity of the Strauss process is given by eq. 3.10, obtained from eq. 3.5 by setting $\theta_1 = \log\beta$, $\theta_2 = \log\gamma$, $B(u) = 1$ and $C(u, \mathbf{x}) = t(u, \mathbf{x})$:

\[ \lambda(u, \mathbf{x}) = \beta\, \gamma^{t(u, \mathbf{x})} \qquad (3.10) \]

where $t(u, \mathbf{x}) = s(\mathbf{x} \cup \{u\}) - s(\mathbf{x})$ is the number of points of $\mathbf{x}$ that lie within distance $r$ of the location $u$.

Figure 3.1 shows some simulated realizations of a simple Strauss model in a square region of 10 units for different values of the intensity, interaction parameter and interaction radius, denoted Strauss$(\beta, \gamma, r)$.

[Panels: effect of different intensity values; effect of different interaction parameters; effect of different interaction radii]

Figure 3.1: Simulations of Strauss model
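The Strauss conditional intensity of eq. 3.10 is simple to evaluate directly. The following minimal Python sketch computes the pair count s(x), the neighbour count t(u, x) and the resulting conditional intensity; the function names and the sample configuration are hypothetical and serve only to show how the three quantities relate.

```python
import math

def close_pairs(points, r):
    """s(x): number of unordered pairs of distinct points at most r apart."""
    return sum(1
               for i, p in enumerate(points)
               for q in points[i + 1:]
               if math.hypot(p[0] - q[0], p[1] - q[1]) <= r)

def t_count(u, points, r):
    """t(u, x): number of points of x within distance r of u."""
    return sum(1 for p in points if math.hypot(u[0] - p[0], u[1] - p[1]) <= r)

def strauss_conditional_intensity(u, points, beta, gamma, r):
    """Conditional intensity beta * gamma ** t(u, x) of the Strauss process."""
    return beta * gamma ** t_count(u, points, r)

# Hypothetical configuration and candidate location.
x = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
u = (2.0, 0.0)
lam = strauss_conditional_intensity(u, x, beta=2.0, gamma=0.5, r=1.5)
```

With γ = 1 the value equals β regardless of the configuration (the Poisson case), and with γ = 0 it is zero whenever u has at least one neighbour within r (the hard core case).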

3.8. Multi-type Marked Point Pattern and Strauss Model

An extension of the Strauss model takes into account the clustering behaviour of a multi-type marked point pattern. This extended model is known as the multi-type Strauss model and assumes pair-wise interactions in which the interaction depends not only on the spatial locations but also on the types of the points (Isham, 1984). “In contrast to the unmarked Strauss point process, the multi-type Strauss model allows some of the interaction parameters to exceed 1, provided one of the relevant types is a hard core” (Baddeley et al., 2008). The hard core distance between the earthquakes determines the level of clustering in the data. “The weaker the inhibitions are between points of the same type, the weaker are the allowable attractions between points of different types” (Isham, 1984). For the multi-type Strauss model, the second order interaction term in eq. 3.8 is given as:

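\[ c_{m,m'}(u, v) = \begin{cases} 1 & \text{if } \lVert u - v \rVert > r_{m,m'} \\ \gamma_{m,m'} & \text{if } \lVert u - v \rVert \le r_{m,m'} \end{cases} \qquad (3.11) \]

where $r_{m,m'} > 0$ are the interaction radii and $\gamma_{m,m'} \ge 0$ are the interaction parameters.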

Thus the conditional intensity of a multi-type Strauss process, whose density has the form of eq. 3.8, is given by

\[ \lambda\big((u, i), \mathbf{x}\big) = \beta_i \prod_j \gamma_{i,j}^{\,t_{i,j}(u, \mathbf{x})} \qquad (3.12) \]

where $t_{i,j}(u, \mathbf{x})$ is the number of earthquakes in the marked point pattern of earthquake locations with mark equal to $j$ lying within a distance $r_{i,j}$ of the location $u$, and $\gamma_{i,j}$ are the interaction parameters for the pairs of points of different types (Baddeley et al., 2008).
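The multi-type conditional intensity of eq. 3.12 can be evaluated as in the following minimal Python sketch. It assumes that the interaction parameters and radii are symmetric in the two types, so they are stored under a sorted pair of marks; the function name, the data layout and the parameter values are hypothetical and are not fitted values from the thesis.

```python
import math

def multitype_strauss_intensity(u, mark, pattern, beta, gamma, radius):
    """Conditional intensity for adding a point of type `mark` at location u.

    pattern: list of (x, y, mark) tuples
    beta:    dict mapping mark -> first order term
    gamma:   dict mapping sorted (mark, mark) pair -> interaction parameter
    radius:  dict mapping sorted (mark, mark) pair -> interaction radius
    """
    lam = beta[mark]
    for x, y, m in pattern:
        key = tuple(sorted((mark, m)))  # symmetric interaction (assumption)
        if math.hypot(u[0] - x, u[1] - y) <= radius[key]:
            # Each neighbour of type m within r contributes one factor gamma.
            lam *= gamma[key]
    return lam

# Hypothetical two-type pattern and parameters.
pattern = [(0.0, 0.0, "Small"), (1.0, 0.0, "Large"), (10.0, 10.0, "Small")]
beta = {"Small": 2.0, "Large": 0.5}
gamma = {("Large", "Large"): 0.2, ("Large", "Small"): 0.5, ("Small", "Small"): 0.8}
radius = {key: 2.0 for key in gamma}
lam = multitype_strauss_intensity((0.5, 0.0), "Small", pattern, beta, gamma, radius)
```

Multiplying one factor per neighbour is equivalent to raising each γ to the power of the corresponding neighbour count, as written in eq. 3.12.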

3.9. Model Selection

The Akaike Information Criterion (AIC) is widely used as a measure to quantify the relative performance of different models fitted to a fixed data set. It is defined as

AIC = -2 max(log-likelihood) + 2 (number of parameters)

The model with the smaller AIC is considered to be a better fit to the data (Akaike, 1974).
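As a worked illustration of the criterion, the following Python sketch uses the standard form of Akaike (1974), in which the number of parameters is multiplied by two. The log-likelihood values and parameter counts below are made-up numbers for two hypothetical candidate models, not results from the thesis.

```python
def aic(max_log_likelihood, n_params):
    """Akaike Information Criterion: -2 * maximised log-likelihood + 2 * number of parameters."""
    return -2.0 * max_log_likelihood + 2.0 * n_params

# Hypothetical fits: a baseline model and one with an additional covariate.
candidates = {
    "baseline": aic(max_log_likelihood=-100.0, n_params=3),
    "with_covariate": aic(max_log_likelihood=-96.0, n_params=4),
}
best = min(candidates, key=candidates.get)
```

In this made-up case the extra parameter is worth keeping, because the gain in log-likelihood (4 units) exceeds the penalty of one additional parameter.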

4. Methodology

The methodology adopted for the analysis and modelling of the earthquake data is shown in figure 4.1:

[Flowchart: Earthquake Data (1973 – 2008); Exploratory Data Analysis; Study Area Selection; Data Description; Characterization of Earthquake Point Pattern (First Order Characteristics, Second Order Characteristics); Strauss Model Fitting; AIC; Inclusion of Covariates (Geological Information); Final Model; Interpretation]

Figure 4.1: Methodology adopted for analysis and model fitting of earthquake data

For the basic exploratory analysis of the earthquake data, the SSLib package of the R software was used; the CircStats package was utilized for the analysis of the data converted into circular data; and for the further analysis and modelling of the earthquake data as a point pattern, spatstat functionalities were utilized.

The data set was obtained from the USGS website (www.USGS.gov) and consisted of all the earthquakes which occurred in the northern area of Pakistan since 1973. The data comprised different variables associated with the earthquake occurrences, such as the epicentre location, magnitude, depth and occurrence time of each earthquake. Coordinates of the data were converted from polar coordinates to meters for ease of computation. Various aspects of the data were investigated to discover their prominent characteristics. Based on the exploratory and visual data analysis, the selection of the study area was made and the earthquakes of magnitude ≥ 4 were selected for further study.

Geological maps of the study area were obtained from different sources to get deeper insight into the earthquake data. The maps were geo-referenced, and the plate boundaries and tectonic faults located within the study region were digitized using GIS software.

The selected study area and data set were again analyzed from different aspects to discover prominent features of the earthquakes occurring in the study area. The analysis included the study of the depth of the earthquake hypocentres, the occurrence time of the events, and the angle of occurrence of each earthquake with respect to some geographically important locations in the study area; the angles of each event with respect to the previous event were also calculated and plotted to see if they showed any specific trend in space. For calculating the angles of each event with respect to a specific location, MATLAB functions were utilized, and for the analysis of the circular data thus obtained, the CircStats package of the R software was used.

The data set was converted into a multi-type marked point pattern by defining threshold values to categorize the earthquakes into different types, and its first order and second order effects were investigated in order to characterize the point pattern. For estimating the first order characteristics a kernel smoothing technique was applied, whereas for testing the second order effects nearest neighbour G-functions were calculated. Based on the characterization, a Strauss model was fitted using the information about the spatial coordinates and marks of the points. Goodness of fit of the model was tested using simulation envelopes. A simulated realization was also plotted to give a general impression of the result of the fitted model. The AIC value was calculated to check the relative suitability of the model against the models to be fitted later involving more explanatory variables.

To improve the model fit so that the data set can be best described in terms of covariate effects, different prominent geological features, including the locations of plate boundaries and faults of the study area, were also considered as possible covariates, and their effects were incorporated in the Strauss model one by one. At each stage the AIC of the fitted model was calculated to determine the improvement in modelling and to quantify the effects of the covariates on the earthquake point pattern.

5. Results

For the purpose of point pattern analysis, the earthquake data were converted to a multi-type point pattern by classifying the earthquakes as ‘Small’ or ‘Large’ according to their magnitude. Earthquakes with magnitudes between 4 and 5.4 (both inclusive) were classified as ‘Small’ and those with magnitudes greater than 5.4 as ‘Large’. Figure 5.1 shows the spatial locations of the two types of earthquakes:
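The classification rule above can be sketched in base R. The magnitudes below are hypothetical values chosen to exercise both class boundaries; they are not taken from the thesis data, and the variable names are illustrative only.

```r
# Hypothetical magnitudes (not the thesis data).
magnitude <- c(4.0, 4.7, 5.4, 5.5, 6.1, 7.6)

# 'Small': 4 <= M <= 5.4; 'Large': M > 5.4.
# right = TRUE closes each interval on the right, and
# include.lowest = TRUE makes the first interval include M = 4.
type <- cut(magnitude,
            breaks = c(4, 5.4, Inf),
            labels = c("Small", "Large"),
            right = TRUE, include.lowest = TRUE)

table(type)
```

The resulting factor can serve as the marks of a multi-type point pattern.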

Figure 5.1: Distribution of small and large earthquakes

To get an impression of local variability in the spread of earthquakes, their intensity was estimated using a kernel smoothing technique (Baddeley, 2008) (Turner, 2007) and plotted in figure 5.2. The bandwidth of the kernel was selected to be 10% of the diameter of a square with area equal to the area of the study region.

Figure 5.2: Intensity of small and large earthquakes

Figure 5.2 shows that the intensity of earthquake locations is not homogeneous throughout the study region: there is a high concentration of earthquake occurrences in the north-western part of the study area, making it a hot spot for epicentre locations. For further characterization of the earthquake data, the second order effects were investigated using distance-based G-functions (Gcross in spatstat) for multi-type marked patterns.

Figure 5.3: Nearest neighbour G function for pairs of small earthquakes

Figure 5.3 shows the cumulative distribution of the nearest-neighbour distance from each ‘Small’ earthquake to an earthquake of the same type. Since the entire estimated curve for nearest neighbour distances lies far above the theoretical curve of the Poisson process, the ‘Small’ earthquakes clearly show a clustered pattern. The maximum interaction radius is shown as 4 km, which is evidence of very dense clusters of ‘Small’ earthquakes within a short distance. Another notable feature is the nugget effect of the curve (more than one point at the same location, i.e. at zero distance), which is due to many coincident points in the data. According to the curve, the clustering pattern of the data starts after 1 km.
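The comparison between an estimated G function and the Poisson reference curve can be sketched in base R. This is a simplified illustration under stated assumptions: the pattern is simulated (uniform points in a hypothetical 10 km square window), all points are of one type, and no edge correction is applied, whereas the spatstat estimators used in the thesis do apply edge corrections.

```r
set.seed(1)

# Hypothetical pattern: n uniform points in a 10 km x 10 km window (meters).
n <- 200
side <- 10000
x <- runif(n, 0, side)
y <- runif(n, 0, side)

# Nearest-neighbour distance of every point (no edge correction).
d <- as.matrix(dist(cbind(x, y)))
diag(d) <- Inf
nn <- apply(d, 1, min)

# Empirical G: proportion of nearest-neighbour distances <= r.
r <- seq(0, 1000, by = 50)
G_hat <- ecdf(nn)(r)

# Theoretical G for a homogeneous Poisson process with intensity lambda.
lambda <- n / side^2
G_pois <- 1 - exp(-lambda * pi * r^2)

# G_hat above G_pois points to clustering, G_hat below to regularity.
round(cbind(r, G_hat, G_pois), 3)
```

A value G_hat(0) > 0 would correspond to the nugget effect mentioned above, since it can only arise from coincident points.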

Figure 5.4: Nearest neighbour G function for pairs of small and large earthquakes

Figure 5.4 suggests clustering of ‘Large’ earthquakes around the ‘Small’ earthquakes within a distance of 20 km. A close inspection of the curve reveals that $\hat{G}(r) = 0$ for about $r \le 1$ km, indicating that the smallest nearest neighbour distance of ‘Large’ earthquakes from a ‘Small’ one is 1 km; thus there is no ‘Large’ earthquake within 1 km of a ‘Small’ earthquake.

Figure 5.5: Nearest neighbour G function for pairs of large and small earthquakes

Figure 5.5 shows the observed cumulative distribution of the distance from each ‘Large’ earthquake to its nearest neighbour of type ‘Small’. The figure again shows a nugget effect of the curve within a distance of 1 km, which is due to the many duplicated points in the point pattern, and shows that ‘Small’ earthquakes are highly clustered around the ‘Large’ earthquakes within a relatively short distance of about 3 km.

Figure 5.6: Nearest neighbour G function for pairs of large earthquakes

Figure 5.6 suggests a clustered pattern of ‘Large’ earthquakes, as the nearest neighbour distances of ‘Large’ earthquakes are shorter than for the Poisson process, i.e. $\hat{G}(r) > G_{pois}(r)$. However, careful investigation of the plot reveals that the ‘Large’ earthquakes are not very tightly clustered together, since the shortest nearest neighbour distance of a ‘Large’ earthquake is more than about 3 km and there is no ‘Large’ earthquake within 3 km of another earthquake of the same type:

$\hat{G}(r) = 0 \quad \forall\, r < 3\ \text{km}$   (5.1)

From the above figures we get a clear indication of clustering in our earthquake data. The plots also reveal that there are some identical earthquakes in our data (having the same geographical location, i.e. the same x and y coordinates). The data were investigated and it was found that there were in total 21 coincident earthquakes at 10 different locations. The presence of coincident earthquake locations violates the basic assumption of point pattern analysis that all points should be distinct.

There are two possible ways to circumvent this violation. One is to eliminate the points of relatively less importance (earthquakes of smaller magnitude in our case) from the coincident locations and keep only the one with the highest mark (magnitude) at each location. The second is to wiggle the data. By ‘wiggling’ we mean introducing random disturbances to the coincident locations according to the uniform density. Both solutions involve data manipulation, which is bad practice in research, as stated by Turner (2007) about the same situation: “This is of course a bad statistical practice - one should not modify the data to fit one’s theory! On the other hand it’s either that or develop a whole new theory, a rather Herculean task.”

In the current situation there is no better solution, and only a few earthquakes need to be manipulated rather than the whole data set. Additionally, from the G-functions we find that there are no other earthquakes within about 1 km of the coincident points. Given that the total area of the study region is about 16000 km², moving a few earthquake locations within 0.5 kilometres of their original positions is not expected to affect the data characteristics significantly. We therefore chose to keep the points with the highest magnitudes at their original positions and to wiggle the points identical to them within 0.5 km of their locations. This was done by introducing random disturbances to both coordinates of the identical points according to the uniform distribution. The G functions were calculated again for the modified data and the results are shown in figure 5.7.
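The wiggling strategy can be sketched in base R as follows. The events, column names and seed are hypothetical. One detail is an assumption of this sketch rather than something stated in the text: each coordinate offset is drawn uniformly from ±0.5/√2 km, so that the total displacement cannot exceed 0.5 km.

```r
set.seed(42)

# Hypothetical events (meters): rows 1-3 share a location, as do rows 4-5.
eq <- data.frame(x   = c(1000, 1000, 1000, 5000, 5000, 9000),
                 y   = c(2000, 2000, 2000, 7000, 7000, 3000),
                 mag = c(4.2, 6.0, 4.8, 5.1, 4.4, 4.9))

radius <- 500                    # maximum displacement: 0.5 km
key <- paste(eq$x, eq$y)         # identifies coincident locations

# At each location, the event with the largest magnitude keeps its position.
keep <- logical(nrow(eq))
for (k in unique(key)) {
  idx <- which(key == k)
  keep[idx[which.max(eq$mag[idx])]] <- TRUE
}
move <- !keep

# Uniform offsets in both coordinates; dividing by sqrt(2) keeps the
# total displacement within 'radius' (an assumption of this sketch).
h <- radius / sqrt(2)
eq$x[move] <- eq$x[move] + runif(sum(move), -h, h)
eq$y[move] <- eq$y[move] + runif(sum(move), -h, h)

# Returns 0 when all locations are distinct.
anyDuplicated(eq[, c("x", "y")])
```

spatstat also offers its own tools for detecting and jittering duplicated points, which may be preferable when the data are already stored as a point pattern object.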

Figure 5.7: Nearest neighbour G functions for pairs of points of all types

The plotted G-functions of the observed point pattern against their expected values show that the strategy described above did not influence the results or the overall pattern, except that the nugget effect is now removed from the data.

As the G functions calculated above reveal clustering behaviour in the earthquake data, and Gcross (Large, Large) shows a hard core distance of about 3 km, the multi-type Strauss model seems appropriate for the data, as it takes into account the type of interaction that results in aggregation of the multi-type point pattern. How to estimate the interaction radii between earthquakes of different types in order to fit the multi-type Strauss model, however, remains unclear. According to Baddeley: “There is no ‘correct’ way to determine the interaction radii for the multi-type Strauss model, but the most popular way is to guess an initial value by inspecting the plots of the G functions between each pair of types….The plots can give an impression of what would be a ‘reasonable’ interaction distance (and also an impression of whether a multi-type Strauss model is appropriate for the data)”. From visual inspection of the G functions plotted above, and several iterations of the multi-type Strauss model for different values of r, the following matrix of interaction radii was determined:

$r_{i,j} = \begin{pmatrix} 1000 & 10000 \\ 10000 & 20000 \end{pmatrix}$   (5.2)

Note that the matrix of interaction radii must be symmetric, as Gibbs models assume symmetric interactions between points of two different types: if a point ‘x’ interacts with a point ‘y’ within a fixed distance ‘r’, then ‘y’ also interacts with ‘x’ within the same distance. Although the G functions were analyzed visually to get some idea of the interaction distances, the radii for the point pairs had to be adjusted in order to bring the matrix to symmetric form.

Plotting a fitted model is a necessary step in evaluating how adequately the model fits the data. However, when we tried to plot the different fitted Strauss models specified with MultiStrauss(c("Small", "Large")), the code “plot (fitin (model))” provided by the spatstat package for this purpose produced an error (“data and model do not have the same possible levels of marks”). The bug was reported to the software developer and was fixed; it was later notified that spatstat reads the levels of the marks in alphabetical order, and the above code therefore had to be modified as follows to make it compatible with the software:

MultiStrauss(c("aSmall", "bLarge"))

After trying different Strauss models with different interaction terms, the model in which the intensity is a log-quadratic function of the Cartesian coordinates was selected as the most appropriate fit, because visually it gave the best results of all.

Model <- ppm(X, ~ polynom(x, y, 2), MultiStrauss(c("aSmall", "bLarge"), r))   (5.3)

with AIC equal to 9116.
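A call of this form can be tried on simulated data. The sketch below assumes the spatstat package is installed; the pattern, the unit-square window and the radii are hypothetical and unrelated to the thesis data. The arguments `radii` and `types` are named explicitly because the positional order of the MultiStrauss arguments may differ between spatstat versions.

```r
library(spatstat)
set.seed(1)

# Hypothetical multi-type pattern in the unit square (not the thesis data).
X <- rpoispp(200)
marks(X) <- factor(sample(c("aSmall", "bLarge"), npoints(X), replace = TRUE))

# Symmetric matrix of interaction radii; rows and columns follow the
# order of the mark levels ("aSmall", "bLarge").
r <- matrix(c(0.02, 0.05,
              0.05, 0.08), nrow = 2, byrow = TRUE)
stopifnot(isSymmetric(r))

# Log-quadratic trend combined with a multi-type Strauss interaction.
fit <- ppm(X, ~ polynom(x, y, 2),
           MultiStrauss(radii = r, types = levels(marks(X))))

fit        # prints the trend coefficients and interaction parameters
AIC(fit)   # information criterion reported for the fitted model
```

Because the simulated pattern is Poisson, the fitted interaction parameters should be close to 1 here; for a Gibbs model the fit is a pseudolikelihood fit, so spatstat may issue a warning when computing the AIC.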

This model fits the Strauss model to the earthquake data with estimated intensity $b(u)$, which is log-quadratic in the Cartesian coordinates. Mathematically stated,

$\log b(x, y) = \beta_0 + \beta_1 x + \beta_2 y + \beta_3 x^2 + \beta_4 xy + \beta_5 y^2$

The estimated values of the different parameters are given as:

$\hat{\beta}_0$ = -3.07572e+04

$\hat{\beta}_1$ = 1.27755e-02

$\hat{\beta}_2$ = 1.49033e-02

$\hat{\beta}_3$ = -1.69290e-09

$\hat{\beta}_4$ = -3.03734e-09

$\hat{\beta}_5$ = -1.80869e-09

The estimated values of the interaction parameters $\hat{\gamma}_{i,j}$ between the earthquakes of type ‘small’ and ‘large’ were obtained as

γ̂ (small, small) = 1.3698

γ̂ (small, large) = 1.1564

γ̂ (large, large) = 0.2823

Values of $\gamma_{i,j}$ greater than 1 suggest clustering between the two types of earthquakes, while values close to zero suggest inhibition between them. Figure 5.8 shows the trends fitted to the ‘Small’ and ‘Large’ earthquakes.

Figure 5.8: Trends fitted by the model to small and large earthquakes
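The role of the interaction parameters can be illustrated with a small base R calculation. In a multi-type Strauss model, every pair of points of types i and j lying closer than the radius for that pair multiplies the unnormalised density by γ for that pair. The sketch uses the estimates reported above and takes the radii as 1 km, 10 km and 20 km, which is an assumed reading of the matrix in eq. 5.2; the four events are hypothetical.

```r
types <- c("Small", "Large")

# Estimated interaction parameters from the text.
gamma <- matrix(c(1.3698, 1.1564,
                  1.1564, 0.2823), 2, 2, byrow = TRUE,
                dimnames = list(types, types))

# Interaction radii in meters (assumed reading of eq. 5.2).
radii <- matrix(c(1000, 10000,
                  10000, 20000), 2, 2, byrow = TRUE,
                dimnames = list(types, types))

# Hypothetical configuration of four events (meters).
pts <- data.frame(x = c(0, 500, 8000, 20000),
                  y = c(0, 0, 0, 0),
                  type = c("Small", "Small", "Large", "Large"),
                  stringsAsFactors = FALSE)

# Sum log(gamma_ij) over all pairs closer than their radius r_ij.
log_interaction <- 0
for (i in 1:(nrow(pts) - 1)) {
  for (j in (i + 1):nrow(pts)) {
    d <- sqrt((pts$x[i] - pts$x[j])^2 + (pts$y[i] - pts$y[j])^2)
    a <- pts$type[i]
    b <- pts$type[j]
    if (d <= radii[a, b]) {
      log_interaction <- log_interaction + log(gamma[a, b])
    }
  }
}
log_interaction
```

In this configuration one small-small pair and two small-large pairs raise the density (γ > 1), while the single large-large pair within 20 km lowers it (γ < 1).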

A simulated realization of the fitted point process model was generated using the Metropolis-Hastings algorithm, which gave the following point pattern:

Figure 5.9: A simulated realization of the fitted model

A goodness-of-fit test for any fitted point process model can be carried out as a Monte Carlo test using any of the G, F or K summary functions. The test involves generating realizations of the fitted model, which serves as the null hypothesis, and computing the summary statistic for each realization (Baddeley et al., 2008) (Diggle, 2003). Thus we calculate the lower and upper envelopes of the statistic over 99 simulated realizations of the point pattern under the fitted model and then calculate the same summary function for the data. If the function calculated for the data lies between the lower and upper envelopes, we can conclude that the model fits the point data adequately.
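The envelope construction can be sketched in base R. Note the substitution made here: the thesis simulates from the fitted Strauss model, whereas this sketch uses complete spatial randomness as the null model so that it stays self-contained. The window, the clustered "observed" pattern and the function name G_fun are hypothetical, and no edge correction is applied.

```r
set.seed(7)
side <- 10000                      # hypothetical square window (meters)
r <- seq(0, 1500, by = 100)

# Empirical nearest-neighbour G function on the grid r (no edge correction).
G_fun <- function(x, y) {
  d <- as.matrix(dist(cbind(x, y)))
  diag(d) <- Inf
  ecdf(apply(d, 1, min))(r)
}

# Hypothetical "observed" pattern: 20 parents with 5 offspring each.
px <- runif(20, 0, side)
py <- runif(20, 0, side)
obs_x <- rep(px, each = 5) + rnorm(100, sd = 150)
obs_y <- rep(py, each = 5) + rnorm(100, sd = 150)
G_obs <- G_fun(obs_x, obs_y)

# 99 simulations from the null model (here: uniform points, same n).
sims <- replicate(99, G_fun(runif(100, 0, side), runif(100, 0, side)))
lower <- apply(sims, 1, min)
upper <- apply(sims, 1, max)

# Distances at which the observed curve leaves the envelope.
r[G_obs < lower | G_obs > upper]
```

With 99 simulations, an observed curve outside the pointwise envelope at a fixed distance corresponds to a two-sided Monte Carlo test at the 2% level for that distance.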

Figure 5.10: Simulated envelopes for the fitted model

Figure 5.10 shows the computed envelopes for our fitted multi-type Strauss model and the nearest neighbour distance functions calculated for the point pattern. From the figure we find that the curves of the nearest neighbour summary functions stray a little beyond the upper simulated envelopes, which indicates that there is still some variability in the spatial intensity of the data that our fitted model was unable to capture. The uncaptured variability may be due to covariate effects involved in the formation of the point pattern, which need to be considered in modelling the pattern.

A further attempt to improve the point pattern modelling used the distance-to-plate-boundaries pixel image (figure 2.10) as an explanatory variable. Since the image contained zero distance values for the pixels lying on the plate boundaries, it caused numerical problems in calculating the intensity, which is always modelled through a log function of the variables involved. To overcome the problem, a minimum threshold value equal to the spatial resolution of the pixel image was imposed on the image. Several different functions were checked to model the intensity in terms of spatial location and the distance-to-plate-boundaries variable (DP), out of which the following model was selected based on the AIC value obtained.

Model <- ppm(X, ~ DP + polynom(x, y, 2), covariates = z, MultiStrauss(c("aSmall", "bLarge"), r))   (5.4)

with AIC equal to 9113. The above model states that the intensity of the fitted model is a function of both the distance and the Cartesian coordinates; more specifically, the intensity is a log-quadratic function of the Cartesian coordinates, multiplied by a factor depending on the distance to the plate boundaries, i.e.,

$\log b((x, y), DP) = \beta_0 + \beta_1 DP + \beta_2 x + \beta_3 y + \beta_4 x^2 + \beta_5 xy + \beta_6 y^2$   (5.5)

Below are the estimated values of the parameters of the fitted intensity

$\hat{\beta}_0$ = -2.76427e+04

$\hat{\beta}_1$ = -3.02522e-05

$\hat{\beta}_2$ = 1.84341e-02

$\hat{\beta}_3$ = 1.11913e-02

$\hat{\beta}_4$ = -1.65524e-09

$\hat{\beta}_5$ = -2.83837e-09

$\hat{\beta}_6$ = -1.95895e-09

Following are the estimated interaction parameters $\hat{\gamma}_{i,j}$ for earthquakes of different types

γ̂ (small, small) = 1.3486

γ̂ (small, large) = 1.1390

γ̂ (large, large) = 0.3098

Figure 5.11 shows the trends fitted by the model (eq. 5.4) for ‘small’ and ‘large’ earthquakes.

Figure 5.11: Trends fitted by the model for small and large earthquakes

We next considered the effect of the tectonic faults in the study region by incorporating the distance-to-nearest-fault image as another explanatory variable in the model, along with the effect of the Cartesian coordinates. Again, to avoid numerical problems caused by taking the logarithm of the zero values contained in the image for pixels lying on the fault lines, a minimum threshold value equal to the spatial resolution of the pixel image was imposed on the image. Below is the fitted Strauss model incorporating the distance-to-faults pixel image (DF):

Model <- ppm(X, ~ polynom(x, y, 2) + DF, covariates = list(DP = Dplate, DF = Dfault), MultiStrauss(c("aSmall", "bLarge"), r))   (5.6)

The model gave an AIC value of 9093. This model specifies the intensity as a function of both the distance to fault locations and the Cartesian coordinates, i.e.

$\log b((x, y), DF) = \beta_0 + \beta_1 DF + \beta_2 x + \beta_3 y + \beta_4 x^2 + \beta_5 xy + \beta_6 y^2$   (5.7)

The estimated values of the parameters of the intensity function and of the interaction parameters were calculated as below:

$\hat{\beta}_0$ = -3.39585e+04

$\hat{\beta}_1$ = 7.97228e-05

$\hat{\beta}_2$ = 1.68838e-02

$\hat{\beta}_3$ = 1.61834e-02

$\hat{\beta}_4$ = -2.49510e-09

$\hat{\beta}_5$ = -3.95884e-09

$\hat{\beta}_6$ = -1.93184e-09

γ̂ (small, small) = 1.2371

γ̂ (small, large) = 1.1265

γ̂ (large, large) = 0.3262

Figure 5.12 shows the fitted trend to ‘small’ and ‘large’ earthquakes

Figure 5.12: Trends fitted by the model to small and large earthquakes

The next step in model improvement incorporated both the distance-to-nearest-fault image (DF) and the distance-to-plate-boundary image (DP), to test the combined effect of both as explanatory variables along with the Cartesian coordinates. Several different functions were checked to model the intensity in terms of spatial location and the explanatory variables DF and DP, out of which the following model was selected based on the AIC value obtained.

Model <- ppm(X, ~ DP + polynom(x, y, 2) + DF, covariates = list(DP = Dplate, DF = Dfault), MultiStrauss(c("aSmall", "bLarge"), r))   (5.8)

with an AIC equal to 9090. The model specifies the intensity as a function of the Cartesian coordinates, the distance of the earthquakes from the plate boundaries and their distance from the fault locations. Thus the intensity is the combined effect of all the explanatory variables. Mathematically stated,

$\log b((x, y), DP, DF) = \beta_0 + \beta_1 DP + \beta_2 DF + \beta_3 x + \beta_4 y + \beta_5 x^2 + \beta_6 xy + \beta_7 y^2$   (5.9)

Estimated parameters of the fitted intensity are given below

$\hat{\beta}_0$ = -2.78649e+04

$\hat{\beta}_1$ = -3.62160e-05

$\hat{\beta}_2$ = 7.80148e-05

$\hat{\beta}_3$ = 1.78515e-02

$\hat{\beta}_4$ = 1.15993e-02

$\hat{\beta}_5$ = -2.16381e-09

$\hat{\beta}_6$ = -3.56565e-09

$\hat{\beta}_7$ = -1.29366e-09

And the estimated values for the interaction parameters are as given below

γ̂ (small, small) = 1.2297

γ̂ (small, large) = 1.1128

γ̂ (large, large) = 0.3516

Model (eq. 5.8) was selected as the best-fitting model, as it reduces the AIC value by 26 with the inclusion of only two more parameters compared with the model (eq. 5.3).
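The AIC comparison across the four fitted models can be tabulated in base R. The AIC values are those reported in the text; the row labels and the count of extra trend parameters relative to model 5.3 are written out by hand for illustration.

```r
# AIC values reported for the fitted models (labels give the trend formula).
aic <- c("5.3: polynom(x, y, 2)"           = 9116,
         "5.4: DP + polynom(x, y, 2)"      = 9113,
         "5.6: polynom(x, y, 2) + DF"      = 9093,
         "5.8: DP + polynom(x, y, 2) + DF" = 9090)

# Trend parameters added relative to model 5.3.
extra_par <- c(0, 1, 1, 2)

# Change in AIC relative to the baseline model 5.3.
comparison <- data.frame(AIC = aic,
                         delta = aic - aic[1],
                         extra_par = extra_par)
comparison

# Model with the smallest AIC.
names(which.min(aic))
```

Most of the reduction (23 of the 26 units) is already obtained by adding the distance-to-faults covariate alone, which matches the emphasis placed on fault locations in the discussion.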

Figure 5.13 shows the fitted trend to ‘small’ and ‘large’ earthquakes

Figure 5.13: Trends fitted by the model to small and large earthquakes

6. Discussion

In the present study the potential of the Strauss point process was investigated for capturing the variability in a point pattern consisting of the locations and magnitudes of the Kashmir earthquake 2005, since the model allows the use of covariates for explaining spatial variation in a marked point pattern where the marks can be categorized as factors. The Strauss model proved to be very flexible and rigorous in modelling the clustered pattern of the earthquake data, and it explains most of the variability in the distribution of earthquake locations. With the inclusion of additional explanatory variables, the results of the Strauss model fitting became progressively more refined; the model thus has the capacity to incorporate a variety of additional variables to describe a marked point pattern such as earthquake epicentre locations.

The earthquake data serve as an example of a marked point pattern that is difficult to explain fully, since the variability or inhomogeneity in the pattern is caused by many observed and unobserved geological and geophysical factors. Large phenomena such as earthquakes relate to subsurface physical processes of the dynamic earth that do not evolve on short time scales. The generating mechanism spans a very long period and depends on a number of forces interacting in the earth’s crust. For realistic model fitting, more geological and geophysical information about the mechanism is required to explain the process fully. In fact, for earthquake data there are not many observable factors determining the earthquake locations, as the earthquake mechanism is generated many kilometres below the earth’s surface. Phenomena happening deep below the earth’s surface may or may not manifest themselves as clear indications above the surface, which makes it difficult to find possible indications or covariates that could be associated with the spread of earthquake locations.

Some of the geological factors considered to be highly associated with the earthquake locations were incorporated in the Strauss model, and the results improved. The modelling approach suggested that places close to the plate boundaries are associated with higher earthquake probability; as can be observed from figure 2.8, earthquakes are located along the plate boundaries and most of them are clustered near the place where two boundaries converge. The large cluster of earthquakes at this location might be due to the fact that it is under the effect of two close-by plate boundaries. The geological reason for the spread of earthquake epicentres along the plate boundaries has already been discussed in section 2.2. Locations of active tectonic faults also proved to be a very significant determinant of the distribution of earthquake epicentre locations, as incorporating the fault locations reduced the AIC from 9116 to 9093. Finally, when the effects of both the plate boundary locations and the geological fault locations were considered as explanatory variables in the model fitting, the results improved further, both in terms of the fitted intensity, as can be seen from figure 5.13, and in terms of the AIC when compared with the previously fitted forms of the intensity function. If information about more geological or geophysical factors were available and incorporated in the Strauss model fitting, the model would be expected to represent the distribution of earthquake locations in the study region more adequately.

Besides the limitations and complexity of the data, some limitations of the Strauss model were also encountered during the study. Basically, the model serves as a model of inhibition between the points of a point pattern, and it accounts for clustering only when the data are converted into a multi-type point pattern. For earthquake data it can sometimes be difficult to split the marks (magnitudes) into a number of categories, as defining and justifying thresholds for the categories can be vague. The other condition for the Strauss model to take clustering into account is that one of the categories of the multi-type pattern keeps a hard core distance; this may not always be observed in reality, and in that case the Strauss model cannot be applied as a model of clustering.

Another assumption of point process theory, as discussed earlier, is that all points in the data are non-overlapping, i.e. that they have distinct locations in the study area. This leaves one in a quandary about what to do if the data set contains overlapping points, since several of the formulae and interpretations of point process theory then become invalid and one has no choice but to manipulate the data to remove the duplication. The problem of identical points was encountered in the current study, and the duplicated points were distributed randomly within a reasonable distance to overcome it. Spatial statistics, however, lacks modelling techniques for a data set consisting of many overlapping points.

The Strauss model, like Gibbs models in general, assumes symmetric interactions between the points of a pattern, which might not be the case in reality. For earthquake data, the locations of major earthquakes affect the locations of small earthquakes but not vice versa; in other words, large earthquakes determine the locations of small earthquakes in the form of aftershocks, but small earthquakes do not have the same effect on the occurrence of large ones. For realistic model fitting, geological reasoning cannot be ignored and the model should be capable of taking non-symmetric interactions into account.

The estimation of interaction radii is another tricky issue in spatial point process modelling for multi-type marked data, as there are no hard and fast rules for determining the interaction radii from the data set. Several iterations of model fitting with different values of the interaction radii are needed to arrive at values that give a model best representing the point locations. For the earthquake data analysis, an initial rough estimate of the interaction radii for the multi-type Strauss model was drawn from the calculated G-functions; several iterations of the Strauss model were then run and the results compared to select the final matrix of interaction radii. This practice is time consuming and may not be fully reliable, but there is no technique to estimate the values directly from the data points.

Generally, the point pattern approach requires the spatial covariates to be spatially continuous, i.e. the values of the explanatory variables must be available at each and every point in the study region (pixel image) or at least at some other locations apart from the data points (dummy points). Point process methodology needs to be extended so that explanatory variables available only at the data locations can also be used in model fitting.

For the present study it was assumed that the earthquake data were complete and recorded without errors or omissions. This may not be realistic, especially for earthquake epicentre locations, which may not be determined with complete accuracy. Possible errors and omissions in the data set could bias the results, and hence it is scientifically crucial to take all sources of error into account. The results of the study are subject to possible bias caused by incomplete or inadequately recorded data.

Availability of user-friendly software is crucial for more widespread application of spatial point process methodology. Although ‘spatstat’ is a very flexible software package, many problems were encountered while experimenting with different forms of intensity functions for the multi-type Strauss model: diagnostic plots such as the lurking variable plot, the smoothed residuals plot and the Q-Q plot are not available for multi-type marked point process models, and simulations and envelopes could not be plotted for all the fitted models.

The Strauss model has proved to be effective in modelling the clustering behaviour of the earthquake data to a great extent, which gives us confidence in its further application to the study of point processes in other scientific fields. Some problems detected during the study, however, could serve as important research themes for the future.

7. Conclusions

The application of Strauss point process model proved satisfactory in explaining the spatial trends and capturing the sources of variability introduced by the explanatory variables. The explanatory variables, apart from the Cartesian coordinates, consisted of the information about the spatial location of the plate boundaries and geological faults in the study area given as pixel images showing shortest distance to the nearest plate boundary and nearest fault location for each pixel. The study showed that the locations of plate boundaries and geological faults are significant determinants for the earthquake epicentre locations. When the effects of both these variables were combined along with the magnitudes and geographic locations of the earthquake epicentres, the modelling was significantly improved. The effects of the explanatory variables were quantified by improvement in AIC values. The improvement in the modelling of earthquake location can also be assessed visually by the plots of fitted trends for different types of earthquakes.

References

Aguilar, A. M., 2002, Integrating GIS, circular statistics and KDSD for modelling spatial data: A case study: Geographical & Environmental Modelling v. 6, p. 21.

Akaike, H., 1974, A new look at the statistical model identification: IEEE.

Albert, J. M., J. Mateu, and J. C. Pernias, 2002, Modelling of spatial point processes derived from a sequence of auto-Poisson lattice schemes: Environmental Modelling & Software, v. 17, p. 107-125.

Arup, N. P., T. Rossetto, P. Burton, and S. Mahmood, 2005, EEFIT Mission: October 8, 2005 Kashmir Earthquake.

Baddeley, A., 2008, Analysing spatial point patterns in 'R', Australia and New Zealand, CSIRO Australia, p. 199.

Baddeley, A., J. Moller, and A. G. Pakes, 2008, Properties of residuals for spatial point processes: Annals of the Institute of Statistical Mathematics, v. 60, p. 627-649.

Baddeley, A., and B. W. Silverman, 1984, A cautionary example on the use of second-order methods for analyzing point patterns: Biometrics, v. 40, p. 5.

Baddeley, A., and R. Turner, 2005, Spatstat: an R package for analyzing spatial point patterns: Journal of Statistical Software, v. 12, p. 42.

Barot, S., J. Gignoux, and J. C. Menaut, 1999, Demography of a savanna palm tree: Predictions from comprehensive spatial pattern analyses: Ecology, v. 80, p. 1987-2005.

Besag, J., and J. Newell, 1991, The detection of clusters in rare diseases: Journal of the Royal Statistical Society, v. 154.

Brunsdon, C., and J. Corcoran, 2006, Using circular statistics to analyse time patterns in crime incidence: Computers Environment and Urban Systems, v. 30, p. 300-319.

Cox, D. R., and V. Isham, 1980, Point processes: Monographs on statistics and applied probability 12, CRC Press.

Diggle, P. J., 1979, On parameter estimation and goodness-of-fit testing for spatial point patterns: Biometrics, v. 35, p. 114.

Diggle, P. J., 2003, Statistical analysis of spatial point patterns: Missing pages available online: London etc., Oxford University Press, 159 p.

Fisher, N. I., 1995, Statistical analysis of circular data, Press Syndicate of the University of Cambridge, 296 p.

Gatrell, A. C., T. C. Bailey, P. J. Diggle, and B. S. Rowlingson, 1996, Spatial point pattern analysis and its application in geographical epidemiology: Transactions of the Institute of British Geographers, v. 21, p. 256-274.

Högmander, H., and A. Särkkä, 2009, Multitype spatial point patterns with hierarchical interactions: Biometrics, v. 55, p. 59.

Hogmander, H., and A. Sarkka, 1999, Multitype spatial point patterns with hierarchical interactions: Biometrics, v. 55, p. 1051-1058.

Holden, L., S. Sannan, and H. Bungum, 2003, A stochastic marked point process model for earthquakes: Natural Hazards and Earth System Sciences, v. 3, p. 7.

Illian, J., 2008, Statistical analysis and modelling of spatial point patterns : e-book, Wiley & Sons, 556 p.

Isham, V., 1984, Multitype Markov point processes: some approximations: Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences, p. 15.

Kelly, F. P., and B. D. Ripley, 1976, A note on Strauss's model for clustering: Biometrika, v. 63, p. 4.

Kerscher, M., M. J. Pons-Borderia, J. Schmalzing, R. Trasarti-Battistoni, T. Buchert, V. J. Martinez, and R. Valdarnini, 1999, A global descriptor of spatial pattern interaction in the galaxy distribution: Astrophysical Journal, v. 513, p. 543-548.

Mateu, J., M. Albert, and V. Orts, Statistical tools for spatial economics.

Mateu, J., and P. Gregori, Spatial point processes: an overview.

Mateu, J., and M. Montenegro, On kernel estimators of second-order measures for spatial point processes.

Mateu, J., J. L. Uso, and F. Montes, 1998, The spatial pattern of a forest ecosystem, p. 163-174.

Moller, J., and R. P. Waagepetersen, 2007, Modern statistics for spatial point processes: Scandinavian Journal of Statistics, v. 34, p. 643-684.

Naranjo, L., 2008, When the earth moved Kashmir.

Perry, G. L. W., B. P. Miller, and N. J. Enright, 2006, A comparison of methods for the statistical analysis of spatial point patterns in plant ecology: Plant Ecology, v. 187, p. 59-82.

Ripley, B. D., 1977, Modelling spatial patterns: Journal of the Royal Statistical Society, v. 39, p. 40.

Schabenberger, O., and F. J. Pierce, 2002, Contemporary statistical models for the plant and soil sciences: Boca Raton etc., CRC Press, 738 p.

Schoenberg, F. P., 2003, Multidimensional residual analysis of point process models for earthquake occurrences: Journal of the American Statistical Association, v. 98, p. 789-795.

Selvin, S., and D. W. Merrill, 2002, Adult leukemia: A spatial analysis: Epidemiology, v. 13.

Siqueira, J. B., C. M. T. Martelli, I. J. Maciel, R. M. Oliveira, M. G. Ribeiro, F. P. Amorim, B. C. Moreira, D. D. P. Cardoso, W. V. Souza, and A. Andrade, 2004, Household survey of dengue infection in central Brazil: Spatial point pattern analysis and risk factors assessment: American Journal of Tropical Medicine and Hygiene, v. 71, p. 646-651.

Stein, A., and N. J. Georgiadis, 2008, Spatial statistics to quantify patterns of herd dispersion in a Savanna herbivore community: In: Resource ecology : spatial and temporal dynamics of foraging / ed. by H.H.T. Prins and F. van Langevelde. Dordrecht : Springer, 2008. ISBN 978-1-4020-6849-2 (Wageningen UR Frontis Series ; 23) pp. 33-51.

Stoyan, D., and A. Penttinen, 2000, Recent applications of point process methods in forestry statistics: Statistical Science, v. 15, p. 61-78.

Strauss, D. J., 1975, A model for clustering: Biometrika, v. 62, p. 8.

Turner, R., 2007, Point patterns of forest fire locations: Environmental and Ecological Statistics.

Vere-Jones, D., 2006, Some models and procedures for space-time point processes: Environmental and Ecological Statistics.

Vere-Jones, D., and M. Li, 1997, Application of M8 and Lin-Lin algorithms to New Zealand earthquake data: New Zealand Journal of Geology and Geophysics, v. 40, p. 12.

Walter, C., A. B. McBratney, R. A. V. Rossel, and J. A. Markus, 2005, Spatial point-process statistics: concepts and application to the analysis of lead contamination in urban soil: Environmetrics, v. 16, p. 339-355.

Willson, V. L., 2005, Application of circular statistics to psychological functioning: Educational research exchange, Texas A & M University.

Yang, J., H. S. He, S. R. Shifley, and E. J. Gustafson, 2007, Spatial patterns of modern period human-caused fire occurrence in the Missouri Ozark Highlands: Forest Science, v. 53, p. 1-15.

Yaseen, M., 2009, Evaluation of optical images sub-pixel correlation for estimating ground deformation, ITC, Enschede, 84 p.

Zheng, X. G., and D. Vere-Jones, 1991, Application of stress release models to historical earthquakes from North China: Pure and Applied Geophysics, v. 135.

Zhuang, J. C., 2000, Statistical modelling of seismicity patterns before and after the 1990 Oct 5 Cape Palliser earthquake, New Zealand: New Zealand Journal of Geology and Geophysics, v. 43, p. 447-460.