Exploratory spatial data analysis using Stata -...

91
Introduction Spatial data Visualizing spatial data Exploring spatial point patterns Detecting spatial autocorrelation References Exploratory spatial data analysis using Stata Maurizio Pisati Department of Sociology and Social Research University of Milano-Bicocca (Italy) [email protected] 2012 German Stata Users Group meeting WZB Social Science Research Center, Berlin June 1, 2012 Maurizio Pisati Exploratory spatial data analysis using Stata 1/91

Transcript of Exploratory spatial data analysis using Stata -...

Page 1: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Exploratory spatial data analysis using Stata

Maurizio Pisati

Department of Sociology and Social ResearchUniversity of Milano-Bicocca (Italy)

[email protected]

2012 German Stata Users Group meetingWZB Social Science Research Center, Berlin

June 1, 2012

Maurizio Pisati Exploratory spatial data analysis using Stata 1/91

Page 2: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Outline

1 Introduction

2 Spatial data

3 Visualizing spatial dataOverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Maurizio Pisati Exploratory spatial data analysis using Stata 2/91

Page 3: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Outline

4 Exploring spatial point patternsOverviewKernel density estimation

5 Detecting spatial autocorrelationOverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

6 References

Maurizio Pisati Exploratory spatial data analysis using Stata 3/91

Page 4: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Introduction

Maurizio Pisati Exploratory spatial data analysis using Stata 4/91

Page 5: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Exploratory spatial data analysis

• Exploratory spatial data analysis (Esda) is theextension of exploratory data analysis (Eda) to theproblem of detecting patterns in spatial data (Haining etal. 1998: 457)

• Esda involves seeking good descriptions of spatial data, soas to help the analyst to develop hypotheses and models forsuch data (Bailey and Gatrell 1995: 23)

• Esda emphasizes graphical views of the data, designed tohighlight meaningful clusters of observations, unusualobservations, or relationships between variables. Theseviews often take the form of maps

Maurizio Pisati Exploratory spatial data analysis using Stata 5/91

Page 6: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Esda in Stata

• Stata users can perform Esda using a variety ofuser-written commands published in the Stata TechnicalBulletin, the Stata Journal, or the SSC Archive

• In this talk, I will briefly illustrate the use of six suchcommands: spmap, spgrid, spkde, spatwmat, spatgsa,and spatlsa

Maurizio Pisati Exploratory spatial data analysis using Stata 6/91

Page 7: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Esda in Stata

• spmap is a general command aimed at visualizing severalkinds of spatial data

• spgrid generates two-dimensional grids coveringrectangular or irregular study areas

• spkde implements a variety of nonparametric kernel-basedestimators of the probability density function and theintensity function of two-dimensional spatial point patterns

• spatwmat imports or generates several kinds of spatialweights matrices

• spatgsa computes global indices of spatial autocorrelation

• spatlsa computes local indices of spatial autocorrelation

Maurizio Pisati Exploratory spatial data analysis using Stata 7/91

Page 8: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Spatial data

Maurizio Pisati Exploratory spatial data analysis using Stata 8/91

Page 9: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Spatial data: a discrete view

• For simplicity, let us represent space as a plane, i.e., as aflat two-dimensional surface

• In spatial data analysis, we can distinguish two conceptionsof space (Bailey and Gatrell 1995: 18):

• Entity view : Space as an area filled with a set of discreteobjects

• Field view : Space as an area covered with essentiallycontinuous surfaces

• Here we take the former view and define spatial data asinformation regarding a given set of discrete spatial objectslocated within a study area A

Maurizio Pisati Exploratory spatial data analysis using Stata 9/91

Page 10: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Attributes of spatial objects

• Information about spatial objects can be classified into twocategories:

• Spatial attributes• Non-spatial attributes

• The spatial attributes of a spatial object consist of oneor more pairs of coordinates that represent its shapeand/or its location within the study area

• The non-spatial attributes of a spatial object consist ofits additional features that are relevant to the analysis athand

Maurizio Pisati Exploratory spatial data analysis using Stata 10/91

Page 11: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Types of spatial objects

• According to their spatial attributes, spatial objects canbe classified into several types

• Here, we focus on two basic types:• Points (point data)• Polygons (area data)

Maurizio Pisati Exploratory spatial data analysis using Stata 11/91

Page 12: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Points

• A point si is a zero-dimensionalspatial object located withinstudy area A at coordinates(si1, si2)

• Points can represent several kindsof real entities, e.g., dwellings,buildings, places where specificevents took place, pollutionsources, trees

Washington D.C. (2009)

Homicides

Maurizio Pisati Exploratory spatial data analysis using Stata 12/91

Page 13: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Polygons

• A polygon ri is a region of studyarea A bounded by a closedpolygonal chain whose M ≥ 4vertices are defined by thecoordinate set {(ri1(1), ri2(1)),(ri1(2), ri2(2)), . . . , (ri1(m), ri2(m)),. . . , (ri1(M), ri2(M))}, whereri1(1) = ri1(M) and ri2(1) = ri2(M)

• Polygons can represent severalkinds of real entities, e.g., states,provinces, counties, census tracts,electoral districts, parks, lakes

First Sixth

Fourth

Fifth

Seventh

SecondThird

Washington D.C.

Police Districts

Maurizio Pisati Exploratory spatial data analysis using Stata 13/91

Page 14: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Visualizing spatial data

Maurizio Pisati Exploratory spatial data analysis using Stata 14/91

Page 15: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Mapping

• Most exploratory analyses of spatial data have theirnatural starting point in displaying the information ofinterest by one or more maps

• If properly designed, maps can help the analyst to detectinteresting patterns in the data, spatial relationshipsbetween two or more phenomena, unusual observations,and so on

Maurizio Pisati Exploratory spatial data analysis using Stata 15/91

Page 16: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Thematic maps

• In this talk I consider only the kind of maps most useful toEsda: thematic maps

• Thematic maps represent the spatial distribution of aphenomenon of interest within a given study area (Slocumet al. 2005)

Maurizio Pisati Exploratory spatial data analysis using Stata 16/91

Page 17: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Thematic maps in Stata

• Stata users can generate thematic maps using spmap, auser-written command freely available from the SSCArchive (latest version: 1.2.0)

• spmap is a very flexible command that allows for creating alarge variety of thematic maps, from the simplest to themost complex

• While providing sensible defaults for most options andsupoptions, spmap gives the user full control over theformatting of almost every map element, thus allowing theproduction of highly customized maps

Maurizio Pisati Exploratory spatial data analysis using Stata 17/91

Page 18: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Thematic maps in Stata

• In the following, I will show how to use spmap for creatingthe types of thematic maps most commonly used in Esda:

• Dot maps• Proportional symbol maps• Diagram maps• Choropleth maps• Multivariate maps

Maurizio Pisati Exploratory spatial data analysis using Stata 18/91

Page 19: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Dot maps

• A dot map shows the spatial distribution of a set of pointspatial objects S ≡ {si; i = 1, . . . , N}, i.e., their locationwithin a given study area A

• If the point spatial objects have variable attributes, it ispossible to represent this information using symbols ofdifferent colors and/or of different shape

Maurizio Pisati Exploratory spatial data analysis using Stata 19/91

Page 20: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Dot maps: example 1

Spatial distribution of 359 cases of sexabuse, Washington D.C. (2009)

!"#$%&'()#*++,-./0%!$12#0'$3#4#'0/#$567$"$#$$"8)08$!"(43$%9:!4.0'(#"-./0%!$(.%567&$;1:2:'%#33"<#22&$$$$$$'''$$$$8:(4/%=%=51::'.&$>%>51::'.&$"#2#1/%?##8$(;$:;;#4"#""(&$$$'''$$$$$$"(@#%)*+,&$;1:2:'%'#.&$:1:2:'%A<(/#&$:"(@#%)-+.&&$$$$$$'''$$$$/(/2#%%B#=$0C!"#"%!$"(@#%)*+,&&$$$$$$$$$$$$$$$$$$$$$$$$$$'''$$$$"!C/(/2#%%D0"<(43/:4$7-&-$E*++,F%$%$%!$"(@#%)*+,&&!

Washington D.C. (2009)

Sex abuses

Maurizio Pisati Exploratory spatial data analysis using Stata 20/91

Page 21: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Dot maps: example 2

Spatial distribution of 359 cases of sexabuse, Washington D.C. (2009). Differentcolors are used to distinguish adultvictims from child victims

!"#$%&'()#*++,-./0%!$12#0'$3#4#'0/#$567$"$#$$3#4#'0/#$8(1/()$"$)#/9:.$'#1:.#$8(1/()$%&'(")*%)(')+",*%-".*$20;#2$.#<(4#$8(1/()$)$%=.!2/%$,$%&9(2.%$20;#2$802!#"$8(1/()$8(1/()$">)0>$!"(43$%?:!4.0'(#"-./0%!$(.%567*$<1:2:'%#33"9#22*$$$$$$'''$$$$>:(4/%@%@51::'.*$A%A51::'.*$"#2#1/%B##>$(<$:<<#4"#""/*$$$'''$$$$$$;A%8(1/()*$"(C#%-).,*$<1:2:'%'#.$408A*$$$$$$$$$$$$$$$$$'''$$$$$$:1:2:'%D9(/#$..*$:"(C#%-0.1$..*$2#3#4.0%:4*$$$$$$$$$$$$'''$$$$$$2#31:!4/*$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$'''$$$$2#3#4.%"(C#%-).+*$':D30>%).,**$$$$$$$$$$$$$$$$$$$$$$$$$$$'''$$$$/(/2#%%E#@$0;!"#"F$;A$8(1/()$03#%!$"(C#%-).,**$$$$$$$$$$$'''$$$$"!;/(/2#%%G0"9(43/:4$7-&-$H*++,I%$%$%!$"(C#%-).,**!

Adult (155)Child (204)

Washington D.C. (2009)

Sex abuses, by victim age

Maurizio Pisati Exploratory spatial data analysis using Stata 21/91

Page 22: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Dot maps: example 3

Spatial distribution of 359 cases of sexabuse, Washington D.C. (2009). Majorroads, watercourses and parks are addedto the map for reference

!"#$%&'()#*++,-./0%!$12#0'$3#4#'0/#$567$"$#$$"8)08$!"(43$%9:!4.0'(#"-./0%!$(.%567&$;1:2:'%#33"<#22&$$$$$$'''$$$$8:(4/%=%=51::'.&$>%>51::'.&$"#2#1/%?##8$(;$:;;#4"#""(&$$$'''$$$$$$"(@#%)*+,&$;1:2:'%'#.&$:1:2:'%A<(/#&$:"(@#%)-+.&&$$$$$$'''$$$$8:2>3:4%.0/0%%B0/#'CD0'?"-./0%&$E>%/>8#&$$$$$$$$$$$$$$$$$'''$$$$$$:1:2:'%4:4#$++&$;1:2:'%3'##4$E2!#&&$$$$$$$$$$$$$$$$$$$$'''$$$$2(4#%.0/0%%F0G:'H:0."-./0%&$1:2:'%E':A4&&$$$$$$$$$$$$$$$$'''$$$$/(/2#%%I#=$0E!"#"%!$"(@#%)*+,&&$$$$$$$$$$$$$$$$$$$$$$$$$$'''$$$$"!E/(/2#%%B0"<(43/:4$7-&-$J*++,K%$%$%!$"(@#%)*+,&&!

Washington D.C. (2009)

Sex abuses

Maurizio Pisati Exploratory spatial data analysis using Stata 22/91

Page 23: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Proportional symbol maps

• A proportional symbol map represents the values takenby a numeric variable of interest Y on a set of point spatialobjects S located within a given study area A

• Proportional symbol maps can be used with two types ofpoint data (Slocum et al. 2005: 310):

• True point data are measured at actual point locations• Conceptual point data are collected over a set of regions

R ≡ {ri; i = 1, . . . , N}, but are conceived as being locatedat representative points within the regions, typically attheir centroids

• The area of each point symbol is sized in direct proportionto the corresponding value of Y

Maurizio Pisati Exploratory spatial data analysis using Stata 23/91

Page 24: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Proportional symbol maps: example 1

Mean family income in the seven PoliceDistricts of Washington D.C. (2000)

!"#$%&'()*#+)",-)*,".+/,/01,/%!$*(#/-$2#3#-/,#$4$"$)3*'5#65/#$%%%$7'-5/,$4$&'($7$"85/8$!")32$%&'()*#+)",-)*,".9''-1)3/,#"01,/%!$)1))1*$$$$$$$###$$$$7*'('-)#22":#((*$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$8')3,););6*''-1*$<)<6*''-1*$8-'8'-,)'3/()4*$7*'('-)-#1*$$###$$$$$$'*'('-)=:),#*$")>#)+,(-**$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$(/?#(););6*''-1*$<*''-1)<6*''-1*$(/?#()4*$*'('-)=:),#*$$$###$$$$$$")>#)+$('**$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$,),(#)%@#/3$7/5)(<$)3*'5#$A)3$,:'!"/31"$'7$BC$1'((/-"D%*$###$$$$"!?,),(#)%E/":)32,'3$+090$AFGGGD%$%$%*!

26.1 12.7

19.6

14.0

8.1

51.522.7

Washington D.C. (2000)

Mean family income (in thousands of US dollars)

Maurizio Pisati Exploratory spatial data analysis using Stata 24/91

Page 25: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Proportional symbol maps: example 2

Mean family income in the 188 CensusTracts of Washington D.C. (2000). Solidcircles denote positive deviations fromthe overall mean income, hollow circlesdenote negative deviations. Circles aredrawn with size proportional to theabsolute value of the deviation

!"#$%&#'"!"()))*+,-,./-,%!$01#,2$3#'#2,-#$4$"$5'067#87,#$%%%$"97,9$!"5'3$%&#'"!"()))*&662/5',-#"./-,%!$5/&5/'$$$$$$$$$$$###$$$$:06162&#33";#11'$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$965'-&<&<80662/'$=&=80662/'$:06162&2#/'$606162&>;5-#'$$$###$$$$$$"5?#&(%)*'$/#@5,-56'&4'$2#:>#53;-&969-6-''$$$$$$$$$$$$###$$$$-5-1#&%A#,'$:,751=$5'067#%'$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$"!B-5-1#&%C,";5'3-6'$+.&.$D()))E%$%$%'!

Washington D.C. (2000)

Mean family income

Maurizio Pisati Exploratory spatial data analysis using Stata 25/91

Page 26: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Diagram maps

• A diagram map follows the same logic as a proportionalsymbol map, but represents the values of the variable ofinterest using bar charts, pie charts, or other types ofdiagram

• The use of pie charts allows to display the spatialdistribution of compositional data, i.e., of two or morenumeric variables that represent parts of a whole

Maurizio Pisati Exploratory spatial data analysis using Stata 26/91

Page 27: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Diagram maps: example 1

Mean family income in the seven PoliceDistricts of Washington D.C. (2000).Data are represented by framed-rectanglecharts, with the overall mean income asthe reference value

!"#$%&'()*#+)",-)*,".+/,/01,/%!$*(#/-$"23/2$!")45$%&'()*#+)",-)*,".6''-1)4/,#"01,/%!$)1")1#$$$$$$$$$$$$$$7*'('-"#55"8#((#$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$1)/5-/3"9/-")4*'3#:3/#$-#7;#)58,"2'2,',#$7*'('-"5-##4#$$$$$$$$$$$$$<"<:*''-1#$="=:*''-1#$")>#"%&'##$$$$$$$$$$$$$$$$$$$$$$$$$$$$$,),(#"%?#/4$7/3)(=$)4*'3#%#$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$"!@,),(#"%A/"8)45,'4$+060$BCDDDE%$%$%#!

Washington D.C. (2000)

Mean family income

Maurizio Pisati Exploratory spatial data analysis using Stata 27/91

Page 28: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Diagram maps: example 2

Race/Ethnic composition of thepopulation of the seven Police Districts ofWashington D.C. (2000). Data arerepresented by pie charts

!"#$%&'()*#+)",-)*,".+/,/01,/%!$*(#/-$2#3#-/,#$45),#67*,$"$7'7645),##7'7,',$%&&$2#3#-/,#$/8-'/967*,$"$7'76/8-'/9#7'7,',$%&&$2#3#-/,#$',5#-67*,$"$7'76',5#-#7'7,',$%&&$(/:#($;/-)/:(#$45),#67*,$%<5),#%$(/:#($;/-)/:(#$/8-'/967*,$%=8-)*/3$=9#-)*/3%$(/:#($;/-)/:(#$',5#-67*,$%>,5#-%$"79/7$!")32$%&'()*#+)",-)*,".?''-1)3/,#"01,/%!$)1')1($$$$$$$###$$$$8*'('-'",'3#($$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$1)/2-/9';/-'45),#67*,$/8-'/967*,$',5#-67*,($@'@6*''-1($$$###$$$$$$A'A6*''-1($8*'('-'#22"5#(($-#1$'-/32#($")B#'%)*($$$$$$$###$$$$$$(#2#31/''3(($$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$(#2#31'")B#'$%)+(($$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$,),(#'%C/*#DE,53)*$*'97'"),)'3$'8$,5#$7'7!(/,)'3%($$$$$$$###$$$$"!:,),(#'%</"5)32,'3$+0?0$FGHHHI%$%$%(!

WhiteAfrican AmericanOther

Washington D.C. (2000)

Race/Ethnic composition of the population

Maurizio Pisati Exploratory spatial data analysis using Stata 28/91

Page 29: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Choropleth maps

• A choropleth map displays the values taken by a variableof interest Y on a set of regions R within a given studyarea A

• When Y is numeric, each region is colored or shadedaccording to a discrete scale based on its value on Y

• The number of classes k that make up the discrete scale,and the corresponding class breaks, can be based on severaldifferent criteria – e.g., quantiles, equal intervals, boxplot,standard deviates

Maurizio Pisati Exploratory spatial data analysis using Stata 29/91

Page 30: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Choropleth maps: example 1

Mean family income in the 188 CensusTracts of Washington D.C. (2000).Income is divided into six classes basedon the quantiles method

!"#$%&#'"!"()))*+,-,./-,%!$01#,2$3#'#2,-#$4$"$5'067#87,#$%%%$9627,-$4$&'(%9$":7,:$4$!"5'3$%&#'"!"()))*&662/5',-#"./-,%!$5/)5/*$$$$$$$$$$###$$$$01'!7;#2)+*$017#-<6/)=!,'-51#*$906162)>!?/*$$$$$$$$$$$$$$###$$$$'/906162)3"@*$'/1,;)%A5""5'3%*$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$1#3#'/)"5B#),$(-**$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$-5-1#)%A#,'$9,751C$5'067#$D5'$-<6!",'/"$69$EF$/611,2"G%*$###$$$$"!;-5-1#)%H,"<5'3-6'$+.&.$D()))G%$%$%*!

(93,341](54,93](40,54](33,40](28,33][6,28]Missing

Washington D.C. (2000)

Mean family income (in thousands of US dollars)

Maurizio Pisati Exploratory spatial data analysis using Stata 30/91

Page 31: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Choropleth maps: example 2

Mean family income in the 188 CensusTracts of Washington D.C. (2000).Income is divided into six classes basedon the equal intervals method

!"#$%&#'"!"()))*+,-,./-,%!$01#,2$3#'#2,-#$4$"$5'067#87,#$%%%$9627,-$4$&'(%9$":7,:$4$!"5'3$%&#'"!"()))*&662/5',-#"./-,%!$5/)5/*$$$$$$$$$$###$$$$01'!7;#2)+*$017#-<6/)#=5'-*$906162)>!?/*$$$$$$$$$$$$$$$$$###$$$$'/906162)3"@*$'/1,;)%A5""5'3%*$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$1#3#'/)"5B#),$(-**$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$-5-1#)%A#,'$9,751C$5'067#$D5'$-<6!",'/"$69$EF$/611,2"G%*$###$$$$"!;-5-1#)%H,"<5'3-6'$+.&.$D()))G%$%$%*!

(285,341](230,285](174,230](118,174](62,118][6,62]Missing

Washington D.C. (2000)

Mean family income (in thousands of US dollars)

Maurizio Pisati Exploratory spatial data analysis using Stata 31/91

Page 32: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Choropleth maps: example 3

Mean family income in the 188 CensusTracts of Washington D.C. (2000).Income is divided into six classes basedon the boxplot method

!"#$%&#'"!"()))*+,-,./-,%!$01#,2$3#'#2,-#$4$"$5'067#87,#$%%%$9627,-$4$&'(%9$":7,:$4$!"5'3$%&#'"!"()))*&662/5',-#"./-,%!$5/)5/*$$$$$$$$$$###$$$$01'!7;#2)+*$017#-<6/);6=:16-*$906162)>!?/*$$$$$$$$$$$$$$$###$$$$'/906162)3"@*$'/1,;)%A5""5'3%*$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$1#3#'/)"5B#),$(-**$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$-5-1#)%A#,'$9,751C$5'067#$D5'$-<6!",'/"$69$EF$/611,2"G%*$###$$$$"!;-5-1#)%H,"<5'3-6'$+.&.$D()))G%$%$%*!

(132,341](71,132](40,71](31,40](6,31][6,6]Missing

Washington D.C. (2000)

Mean family income (in thousands of US dollars)

Maurizio Pisati Exploratory spatial data analysis using Stata 32/91

Page 33: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Choropleth maps: example 4

Mean family income in the 188 CensusTracts of Washington D.C. (2000).Income is divided into four classes basedon the standard deviates method

!"#$%&#'"!"()))*+,-,./-,%!$01#,2$3#'#2,-#$4$"$5'067#87,#$%%%$9627,-$4$&'(%9$":7,:$4$!"5'3$%&#'"!"()))*&662/5',-#"./-,%!$5/)5/*$$$$$$$$$$###$$$$01'!7;#2)+*$017#-<6/)"-/#=*$906162)>!?/*$$$$$$$$$$$$$$$$$###$$$$'/906162)3"@*$'/1,;)%A5""5'3%*$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$1#3#'/)"5B#),$(+**$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$-5-1#)%A#,'$9,751C$5'067#$D5'$-<6!",'/"$69$EF$/611,2"G%*$###$$$$"!;-5-1#)%H,"<5'3-6'$+.&.$D()))G%$%$%*!

(110,341](59,110](9,59][6,9]Missing

Washington D.C. (2000)

Mean family income (in thousands of US dollars)

Maurizio Pisati Exploratory spatial data analysis using Stata 33/91

Page 34: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Multivariate maps

• A multivariate map combines several types of thematicmapping to simultaneously display the spatial distributionof multiple phenomena within a given study area A

Maurizio Pisati Exploratory spatial data analysis using Stata 34/91

Page 35: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Multivariate maps: example 1

The map shows the relationship betweenmean family income (represented byframed-rectangle charts) and pct. whitepopulation (represented by shades ofcolor) across the seven Police Districts ofWashington D.C. (2000)

!"#$%&'()*#+)",-)*,".+/,/01,/%!$*(#/-$2#3#-/,#$4$"$5'5678),##5'5,',$%&&$9'-:/,$4$'()&9$"5:/5$4$!")32$%&'()*#+)",-)*,".;''-1)3/,#"01,/%!$)1*)1+$$$$$###$$$$*(:#,8'1**!",':+$*(<-#/="*&$(,$,&$-,$%&&+$9*'('-*4(>3+$$$###$$$$(#2,),*%&*,0$78),#$5'5!(/,)'3%+$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$1)/2-/:*?/-*)3*':#6:/+$-#97#)28,*5'5,',+$9*'('-*-#1+$$$$$###$$$$$$@*@6*''-1+$A*A6*''-1+$")B#*%).++$$$$$$$$$$$$$$$$$$$$$$$###$$$$(#2#31*")B#*$%)/++$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$,),(#*%C#/3$9/:)(A$)3*':#$/31$5*,0$78),#$5'5!(/,)'3%+$$$$###$$$$"!<,),(#*%D/"8)32,'3$+0;0$EFGGGH%$%$%+! Pct. white population

(75,100](50,75](25,50][0,25]

Washington D.C. (2000)

Mean family income and pct. white population

Maurizio Pisati Exploratory spatial data analysis using Stata 35/91

Page 36: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewDot mapsProportional symbol mapsDiagram mapsChoropleth mapsMultivariate maps

Multivariate maps: example 2

The map shows the relationship betweenpct. white population (represented byframed-rectangle charts), mean familyincome (represented by the width offramed-rectangle charts) and robberyrate (represented by shades of color)across the seven Police Districts ofWashington D.C. (2000/2009)

!"#$%&'()*#+)",-)*,".+/,/01,/%!$*(#/-$2#3#-/,#$4$"$5'5678),##5'5,',$%&&$9'-:/,$4$'()&9$"5:/5$4$!")32$%&'()*#+)",-)*,".;''-1)3/,#"01,/%!$)1*)1+$$$$$###$$$$*(:#,8'1**!",':+$*(<-#/="*&$(,$,&$-,$%&&+$9*'('-*4(>3+$$$###$$$$(#2,),*%&*,0$78),#$5'5!(/,)'3%+$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$1)/2-/:*?/-*)3*':#6:/+$-#97#)28,*5'5,',+$9*'('-*-#1+$$$$$###$$$$$$@*@6*''-1+$A*A6*''-1+$")B#*%).++$$$$$$$$$$$$$$$$$$$$$$$###$$$$(#2#31*")B#*$%)/++$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$###$$$$,),(#*%C#/3$9/:)(A$)3*':#$/31$5*,0$78),#$5'5!(/,)'3%+$$$$###$$$$"!<,),(#*%D/"8)32,'3$+0;0$EFGGGH%$%$%+!

Robberies per 1,000 pop.(10,12](8,10](6,8][2,6]

Washington D.C. (2000/2009)

Pct. white population, income and robberies

Maurizio Pisati Exploratory spatial data analysis using Stata 36/91

Page 37: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Exploring spatial point patterns

Maurizio Pisati Exploratory spatial data analysis using Stata 37/91

Page 38: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Two-dimensional spatial point patterns

• A two-dimensional spatial point pattern can bedefined as a set of N point spatial objects S located withina given study area A

• Usually, each point si ∈ S represents a real entity of somekind: people, events, sites, buildings, plants, cases of adisease, etc.

• Alternatively, each point si represents the centroid of aregion

• Points si are referred to as the data points

Maurizio Pisati Exploratory spatial data analysis using Stata 38/91

Page 39: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Two-dimensional spatial point patterns

• In the analysis of spatial point patterns, we are ofteninterested in determining whether the observed data pointsexhibit some form of clustering, as opposed to beingdistributed uniformly within A

• To explore the possibility of point clustering, it may beuseful to describe the spatial point pattern of interest bymeans of its probability density function p(s) and/or itsintensity function λ(s) (Waller and Gotway 2004)

Maurizio Pisati Exploratory spatial data analysis using Stata 39/91

Page 40: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Two-dimensional spatial point patterns

• The probability density function p(s) defines theprobability of observing an object per unit area at locations ∈ A

• The intensity function λ(s) defines the expected numberof objects per unit area at location s ∈ A

• The probability density function and the intensity functiondiffer only by a constant of proportionality

Maurizio Pisati Exploratory spatial data analysis using Stata 40/91

Page 41: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Kernel estimators

• Both the probability density function p(s) and the intensityfunction λ(s) of a two-dimensional spatial point patterncan be estimated by means of nonparametric estimators,e.g., kernel estimators (Waller and Gotway 2004)

• Kernel estimators are used to generate a spatiallysmooth estimate of p(s) and/or λ(s) at a fine grid of pointssg (g = 1, ..., G) covering the study area A

• In the context of spatial data analysis, a grid is a regulartessellation of the study area A that divides it into a set ofG contiguous cells whose centers are referred to as the gridpoints and denoted by sg

Maurizio Pisati Exploratory spatial data analysis using Stata 41/91

Page 42: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Kernel estimator of λ(s)

• The intensity λ(sg) at each grid point sg is estimated by:

λ(sg) =c

Ag

N∑i=1

k

(d(si, sg)

hi

)yi

where k(·) is the kernel function – usually a unimodalsymmetrical bivariate probability density function; hi is thekernel bandwidth, i.e., the radius of the kernel function;d(si, sg) is the Euclidean distance between data point siand grid point sg; yi is the value taken by an optionalvariable of interest Y at data point si; Ag is the area of theregion of A over which the kernel function is evaluated,possibly corrected for edge effects; and c is a constant ofproportionality

Maurizio Pisati Exploratory spatial data analysis using Stata 42/91

Page 43: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Kernel estimator of p(s)

• In turn, the probability density p(sg) at each grid point sgis estimated by:

p(sg) =λ(sg)

G∑j=1

λ(sj)

Maurizio Pisati Exploratory spatial data analysis using Stata 43/91

Page 44: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Kernel estimation in Stata

• Stata users can generate kernel estimates of the probabilitydensity function p(s) and the intensity function λ(s) usingtwo user-written commands freely available from the SSCArchive: spgrid and spkde

• spgrid (latest version: 1.0.1) generates several kinds oftwo-dimensional grids covering rectangular or irregularstudy areas

• spkde (latest version: 1.0.0) implements a variety of kernelestimators of p(s) and λ(s)

• spmap can then be used to visualize the kernel estimatesgenerated by spgrid and spkde

Maurizio Pisati Exploratory spatial data analysis using Stata 44/91

Page 45: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Kernel estimation: example

Our purpose is to estimate the probability density function ofa set of 139 points representing the homicides committed inWashington D.C. in 2009

Maurizio Pisati Exploratory spatial data analysis using Stata 45/91

Page 46: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Kernel estimation: example

Step 1

We use spgrid to generate a gridcovering the area of Washington D.C. Wechoose a relatively fine grid resolution(grid cell width = 200 meters). spmap isused to display the grid

!"#$%&'(!%)#'*+,()&-$%.!/&0-*!'$.!,1(0%,)"2344#''''$$$''''&,0!'5,6"$.!!'()%0"6.0.$!#'5.11!"*50.6"/&0-*#'''$$$''''",%)0!"*"0.6"/&0-*#'$."1-5.''(!.'*"0.6"/&0-*!'51.-$'!"6-"'(!%)#'*50.6"/&0-*!'%&"!"#$%&7%&#!

Maurizio Pisati Exploratory spatial data analysis using Stata 46/91

Page 47: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Kernel estimation: example

Step 2

We use spkde to generate kernelestimates of the probability distributionof homicides in Washington D.C. Wechoose a quartic kernel function withfixed bandwidth equal to 1,000 metersand edge correction. spmap is used todisplay the results

!"#$%&'()#*++,-./0%!$12#0'$3##4$(5$655#7"#""#$"43.#$!"(78$%4/#)4-./0%!$9$9:166'.%$;$;:166'.%$$$&&&$$$$3#'7#2$<!0'/(1%$=07.>(./?$5=>%$5=>$'(((%$$$$$$&&&$$$$#.8#16''#1/$.6/"$"0@(78$%3.#-./0%!$'#4201#%$$!"#$%3.#-./0%!$12#0'$"4)04$4$!"(78$%1/#)4-./0%!$(.$"48'(.:(.%$12)#/?6.$<!07/(2#%$&&&$$$$127!)=#'$)(%$51626'$A0(7=6>%$61626'$767#$**%$2#8#7.$655%$&&&$$$$/(/2#$%B6)(1(.#"%!$"(C#$+'*)%%$$$$$$$$$$$$$$$$$$$$$$$$$$$&&&$$$$"!=/(/2#$%D0"?(78/67$E-&-$F*++,G%$%$%!$"(C#$+'*)%%!

Washington D.C. (2009)

Homicides

Maurizio Pisati Exploratory spatial data analysis using Stata 47/91

Page 48: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewKernel density estimation

Kernel estimation: example

Step 2

We use spkde to generate kernelestimates of the probability distributionof homicides in Washington D.C. Wechoose a quartic kernel function withfixed bandwidth equal to 1,000 metersand edge correction. spmap is used todisplay the results

!"#$%&'()#*++,-./0%!$12#0'$3##4$(5$655#7"#""#$"43.#$!"(78$%4/#)4-./0%!$9$9:166'.%$;$;:166'.%$$$&&&$$$$3#'7#2$<!0'/(1%$=07.>(./?$5=>%$5=>$'(((%$$$$$$&&&$$$$#.8#16''#1/$.6/"$"0@(78$%3.#-./0%!$'#4201#%$$!"#$%3.#-./0%!$12#0'$"4)04$4$!"(78$%1/#)4-./0%!$(.$"48'(.:(.%$12)#/?6.$<!07/(2#%$&&&$$$$127!)=#'$)(%$51626'$A0(7=6>%$61626'$767#$**%$2#8#7.$655%$&&&$$$$/(/2#$%B6)(1(.#"%!$"(C#$+'*)%%$$$$$$$$$$$$$$$$$$$$$$$$$$$&&&$$$$"!=/(/2#$%D0"?(78/67$E-&-$F*++,G%$%$%!$"(C#$+'*)%%!

Washington D.C. (2009)

Homicides

Maurizio Pisati Exploratory spatial data analysis using Stata 48/91

Page 49: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Detecting spatial autocorrelation

Maurizio Pisati Exploratory spatial data analysis using Stata 49/91

Page 50: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Spatial autocorrelation

• Forty years ago, the geographer and statistician WaldoTobler formulated the first law of geography : “Everythingis related to everything else, but near things are morerelated than distant things” (Tobler 1970: 234)

• This “law” defines the statistical concept of (positive)spatial autocorrelation, according to which two or moreobjects that are spatially close tend to be more similar toeach other – with respect to a given attribute Y – than arespatially distant objects

• In general, spatial autocorrelation implies spatialcustering, i.e., the existence of sub-areas of the study areawhere the attribute of interest Y takes higher than averagevalues (hot spots) or lower than average values (cold spots)

Maurizio Pisati Exploratory spatial data analysis using Stata 50/91

Page 51: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Spatial weights matrix

• The analysis of spatial autocorrelation requires themeasurement of the degree of spatial proximity amongthe spatial objects of interest

• Typically, the degree of spatial proximity among a givenset of N spatial objects is represented by a N ×N matrixcalled spatial weights matrix and denoted by W

• Each element (i, j) of W – which we denote by wij –expresses the degree of spatial proximity between the pairof objects i and j

• Depending on the application, the N main diagonalelements of W are assigned value wii = 0 or value wii > 0

Maurizio Pisati Exploratory spatial data analysis using Stata 51/91

Page 52: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Spatial weights matrix

• A common variant of W is the row-standardized spatialweights matrix Wstd, whose elements are defined asfollows:

wstdij =

wij

N∑j=1

wij

Maurizio Pisati Exploratory spatial data analysis using Stata 52/91

Page 53: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Spatial weights matrices in Stata

• Stata users can generate several kinds of spatial weightsmatrices using spatwmat, a user-written commandpublished in the Stata Technical Bulletin (Pisati 2001)

• spatwmat (latest version: 1.0) imports or generates fromscratch the spatial weights matrices required by thecommands spatgsa and spatlsa described below

Maurizio Pisati Exploratory spatial data analysis using Stata 53/91

Page 54: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Indices of spatial autocorrelation

• We consider measures of spatial autocorrelation that applyto area data

• Measures of spatial autocorrelation can be classified intotwo broad categories:

• Global indices of spatial autocorrelation• Local indices of spatial autocorrelation

Maurizio Pisati Exploratory spatial data analysis using Stata 54/91

Page 55: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation

• A global index of spatial autocorrelation expressesthe overall degree of similarity between spatially closeregions observed in a given study area A with respect to anumeric variable Y (Pfeiffer et al. 2008)

• Since global indices of spatial autocorrelation summarizethe phenomenon of interest in a single value, they areintended not so much for identifying specific spatialclusters, as for detecting the presence of a general tendencyto clustering within the study area

Maurizio Pisati Exploratory spatial data analysis using Stata 55/91

Page 56: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation

• In general, the computation of a global index of spatialautocorrelation follows a three-step procedure:

• First, we compute the degree of similarity ρij between everypossible pair or regions ri and rj with respect to thenumeric variable of interest Y

• Second, we weight – i.e., multiply – each value ρij by thedegree of proximity wij between regions ri and rj

• Finally, we sum up all the products wijρij and divide thetotal by a constant of proportionality

• The greater the number of regions that are similar withrespect to Y and spatially close, the greater the valuetaken by the global index of spatial autocorrelation

Maurizio Pisati Exploratory spatial data analysis using Stata 56/91

Page 57: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Moran’s I

• Moran’s global index of spatial autocorrelation I (Moran1948) defines ρij as (yi − y)(yj − y), where yi is the valuetaken by Y in region ri, yj is the value taken by Y inregion rj , and y is the average value of Y :

I =

N∑i=1

N∑j=1

wij(yi − y)(yj − y)

1N

N∑i=1

(yi − y)2N∑i=1

N∑j=1

wij

where wii = 0

Maurizio Pisati Exploratory spatial data analysis using Stata 57/91

Page 58: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Moran’s I

• Under the null hypothesis of no global spatial auto-correlation, the expected value of I is:

E(I) = − 1

N − 1

• I > E(I) indicates positive spatial autocorrelation –nearby regions tend to exhibit similar values of Y

• I < E(I) indicates negative spatial autocorrelation –nearby regions tend to exhibit dissimilar values of Y

Maurizio Pisati Exploratory spatial data analysis using Stata 58/91

Page 59: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Getis and Ord’s G

• Getis and Ord’s global index of spatial autocorrelation G(Getis and Ord 1992) defines ρij as yiyj :

G =

N∑i=1

N∑j=1

wijyiyj

N∑i=1

N∑j=1

yiyj

(i 6= j)

where wij denotes the elements of a non-standardizedbinary spatial weights matrix, with wii = 0

• G is defined only for Y > 0

Maurizio Pisati Exploratory spatial data analysis using Stata 59/91

Page 60: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Getis and Ord’s G

• Under the null hypothesis of no global spatial auto-correlation, the expected value of G is:

E(G) =

N∑i=1

N∑j=1

wij

N(N − 1)

• G > E(G) indicates (a) positive spatial autocorrelation,and (b) prevalence of spatial clusters with relatively highvalues of Y

• G < E(G) indicates (a) positive spatial autocorrelation,and (b) prevalence of spatial clusters with relatively lowvalues of Y

Maurizio Pisati Exploratory spatial data analysis using Stata 60/91

Page 61: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation in Stata

• Stata users can compute global indices of spatial auto-correlation using spatgsa, a user-written commandpublished in the Stata Technical Bulletin (Pisati 2001)

• spatgsa (latest version: 1.0) computes three global indicesof spatial autocorrelation: Moran’s I, Getis and Ord’s G,and Geary’s c. For each index and each numeric variable ofinterest, spatgsa computes and displays in tabular formthe value of the index itself, the expected value of the indexunder the null hypothesis of no global spatial auto-correlation, the standard deviation of the index, thez -value, and the corresponding one- or two-tailed p-value

Maurizio Pisati Exploratory spatial data analysis using Stata 61/91

Page 62: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation: example

• Study area: Ohio

• Regions: 88 counties

• Variables of interest:• Pct. population aged 18+ with poor-to-fair health status

(pct poorhealth)• Pct. population aged 18+ currently smoking

(pct currsmoker)• Pct. population aged 18+ ever diagnosed with high blood

pressure (pct hibloodprs)• Pct. population aged 18+ obese (pct obese)

Maurizio Pisati Exploratory spatial data analysis using Stata 62/91

Page 63: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation: example

Step 1

We use spatwmat to import an existing binary spatial weights matrix– stored in the Stata dataset Counties-Contiguity.dta – andconvert it into a properly formatted row-standardized spatial weightsmatrix Ws

!"#$%&#$'(!)*+',-.(*$)/!0-.*$)+()$123$#,!'''"""''''*#&/#4!$'!$#*3#53)6/!

Maurizio Pisati Exploratory spatial data analysis using Stata 63/91

Page 64: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation: example

domenica 27 maggio 2012 20.48 Page 1

User: Maurizio

The following matrix has been created:

1. Imported binary weights matrix Ws (row-standardized)

Dimension: 88x88

Maurizio Pisati Exploratory spatial data analysis using Stata 64/91

Page 65: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation: example

Step 2

We use spatgsa with the spatial weights matrix Ws to computeMoran’s I on the variables of interest

!"#$%&'!()*#"+,-)-./)-%!$01#-2$"3-)4"-$30)53''26#-1)6$30)50!22"7'8#2$30)56*91''/32"$$$"""$$$$30)5'9#"#!$:#;"$$7'2-(!

Maurizio Pisati Exploratory spatial data analysis using Stata 65/91

Page 66: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation: example

domenica 27 maggio 2012 21.08 Page 1

User: Maurizio

Measures of global spatial autocorrelation

Weights matrix

Name: Ws

Type: Imported (binary)

Row-standardized: Yes

Moran's I

Variables I E(I) sd(I) z p-value*

pct_poorhealth 0.399 -0.011 0.065 6.337 0.000

pct_currsmoker 0.339 -0.011 0.065 5.367 0.000

pct_hibloodprs 0.126 -0.011 0.065 2.119 0.017

pct_obese 0.167 -0.011 0.065 2.730 0.003

*1-tail test

Maurizio Pisati Exploratory spatial data analysis using Stata 66/91

Page 67: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation: example

Step 3

We use again spatwmat to import the binary spatial weights matrixstored in the Stata dataset Counties-Contiguity.dta. This time,however, we convert it into a properly formatted non-standardizedspatial weights matrix W

!"#$%&#$'(!)*+',-.(*$)/!0-.*$)+()$123$#,!'''"""''''*#&/#4$!

Maurizio Pisati Exploratory spatial data analysis using Stata 67/91

Page 68: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation: example

domenica 27 maggio 2012 21.39 Page 1

User: Maurizio

The following matrix has been created:

1. Imported binary weights matrix W

Dimension: 88x88

Maurizio Pisati Exploratory spatial data analysis using Stata 68/91

Page 69: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation: example

Step 4

We use spatgsa with the spatial weights matrix W to compute Getisand Ord’s G on the variables of interest

!"#$%&'!()*#"+,-)-./)-%!$01#-2$"3-)4"-$30)53''26#-1)6$30)50!22"7'8#2$30)56*91''/32"$$$"""$$$$30)5'9#"#!$:#;$$4'!

Maurizio Pisati Exploratory spatial data analysis using Stata 69/91

Page 70: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Global indices of spatial autocorrelation: example

domenica 27 maggio 2012 21.40 Page 1

User: Maurizio

Measures of global spatial autocorrelation

Weights matrix

Name: WType: Imported (binary)Row-standardized: No

Getis & Ord's G

Variables G E(G) sd(G) z p-value*

pct_poorhealth 0.061 0.060 0.001 0.372 0.355

pct_currsmoker 0.060 0.060 0.001 -0.392 0.348

pct_hibloodprs 0.060 0.060 0.000 -1.585 0.056

pct_obese 0.061 0.060 0.000 0.458 0.323

*1-tail test

Maurizio Pisati Exploratory spatial data analysis using Stata 70/91

Page 71: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation

• A local index of spatial autocorrelation expresses, foreach region ri of a given study area A, the degree ofsimilarity between that region and its neighboring regionswith respect to a numeric variable Y (Pfeiffer et al. 2008)

• The local indices of spatial autocorrelation considered hereare derived from the two global indices described above –Moran’s I and Getis and Ord’s G – and, therefore, sharetheir fundamental properties

Maurizio Pisati Exploratory spatial data analysis using Stata 71/91

Page 72: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Moran’s Ii

• Moran’s local index of spatial autocorrelation Ii is definedas follows:

Ii =

N∑j=1

wstdij

(yi − yσy

)(yj − yσy

)where σy denotes the standard deviation of Y ; and wstd

ij

denotes the elements of a row-standardized spatial weightsmatrix, with wstd

ii = 0

Maurizio Pisati Exploratory spatial data analysis using Stata 72/91

Page 73: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Moran’s Ii

• Ii > E(Ii) indicates that region ri is surrounded by regionsthat, on average, are similar to ri with respect to Y(positive spatial autocorrelation). Moreover:

• If (yi − y) > 0, then ri represents a hot spot• If (yi − y) < 0, then ri represents a cold spot

• Ii < E(Ii) indicates that region ri is surrounded by regionsthat, on average, are different from ri with respect to Y(negative spatial autocorrelation)

Maurizio Pisati Exploratory spatial data analysis using Stata 73/91

Page 74: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Getis and Ord’s Gi

• Getis and Ord’s local index of spatial autocorrelation Gi

is defined as follows:

Gi =

N∑j=1

wijyj

N∑j=1

yj

(i 6= j)

where wij denotes the elements of a non-standardizedbinary spatial weights matrix, with wii = 0

Maurizio Pisati Exploratory spatial data analysis using Stata 74/91

Page 75: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Getis and Ord’s G?i

• Getis and Ord’s local index of spatial autocorrelationGi

* is defined as follows:

G?i =

N∑j=1

wijyj

N∑j=1

yj

where wij denotes the elements of a non-standardizedbinary spatial weights matrix, with wii = 1

Maurizio Pisati Exploratory spatial data analysis using Stata 75/91

Page 76: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Getis and Ord’s Gi and G?i

• Like G, Gi and G?i can identify only regions characterized

by positive spatial autocorrelation

• If Gi > E(Gi) or G?i > E(G?

i ), then region ri represents ahot spot

• If Gi < E(Gi) or G?i < E(G?

i ), then region ri represents acold spot

Maurizio Pisati Exploratory spatial data analysis using Stata 76/91

Page 77: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation in Stata

• Stata users can compute local indices of spatial auto-correlation using spatlsa, a user-written commandpublished in the Stata Technical Bulletin (Pisati 2001)

• spatlsa (latest version: 1.0) computes four indices ofspatial autocorrelation: Moran’s Ii, Getis and Ord’s Gi andG?

i , and Geary’s ci. For each index and each region in theanalysis, spatlsa computes and displays in tabular formthe value of the index itself, the expected value of the indexunder the null hypothesis of no local spatial auto-correlation, the standard deviation of the index, thez -value, and the corresponding one- or two-tailed p-value

Maurizio Pisati Exploratory spatial data analysis using Stata 77/91

Page 78: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation: example

• Study area: Ohio

• Regions: 88 counties

• Variable of interest: Pct. population aged 18+ withpoor-to-fair health status (pct poorhealth)

Maurizio Pisati Exploratory spatial data analysis using Stata 78/91

Page 79: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation: example

Step 1

We use spatlsa with the standardized spatial weights matrix Ws –previously generated by spatwmat – to compute Moran’s Ii on thevariable of interest. In the output, counties are sorted by z -value

!"#$%&#$'(!)*+',-.(*$)/!0-.*$)+()$123$#,!'*#&/"4!#'!$#*3#53)6/'(!/',-.(*$)/!07#$#23$#,!'89/#5'!"#$9!#'"8$:"..5;/#9$;!'%"4!#'&.5#*')3"*#&/#'!.5$!

Maurizio Pisati Exploratory spatial data analysis using Stata 79/91

Page 80: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation: example

lunedì 28 maggio 2012 17.22 Page 1

User: Maurizio

Measures of local spatial autocorrelation

(Output omitted)

Moran's Ii (Poor-to-fair health status (pct. pop 18+))

name Ii E(Ii) sd(Ii) z p-value*

Knox -0.816 -0.011 0.358 -2.246 0.012

Hardin -0.760 -0.011 0.358 -2.090 0.018

Paulding -1.089 -0.011 0.560 -1.924 0.027

Licking -0.457 -0.011 0.358 -1.244 0.107

(Output omitted)

Hancock 0.545 -0.011 0.358 1.555 0.060

Williams 0.927 -0.011 0.560 1.675 0.047

Delaware 0.677 -0.011 0.389 1.769 0.038

Mercer 0.908 -0.011 0.482 1.906 0.028

Putnam 0.949 -0.011 0.358 2.682 0.004

Henry 1.087 -0.011 0.358 3.067 0.001

Vinton 1.246 -0.011 0.389 3.230 0.001

Gallia 2.433 -0.011 0.482 5.069 0.000

Pike 2.197 -0.011 0.429 5.150 0.000

Adams 3.578 -0.011 0.482 7.442 0.000

Lawrence 5.503 -0.011 0.560 9.844 0.000

Jackson 3.911 -0.011 0.389 10.077 0.000

Scioto 5.400 -0.011 0.482 11.220 0.000

*1-tail test

Maurizio Pisati Exploratory spatial data analysis using Stata 80/91

Page 81: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation: example

Step 2

We use Stata command genmsp – a simple wrapper to spatlsa

(source code in Appendix) – to generate the variables required fordrawing the Moran scatterplot and the corresponding cluster map(Anselin 1995)

!"#$%&'&()*&++,-"./)-!'0"1%#!

Maurizio Pisati Exploratory spatial data analysis using Stata 81/91

Page 82: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation: example

Step 3

We use graph twoway to draw the Moran scatterplot

!"#$%&'()(#*&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&!!!&&&&"+,#''-"&.+'/0$,'0$))"%-#1'%&+'/0$,'0$))"%-#1'%&&&&&&&!!!&&&&&23&$4#10$,'0$))"%-#1'%&#$&%&%'(&&&&&&&&&&&&&&&&&&&&&&!!!&&&&&5+*56)1"2)&51#6-1"7#5-)&51#6+28-"*%&+)&51#6$)+",))&&&!!!&&&&"+,#''-"&.+'/0$,'0$))"%-#1'%&+'/0$,'0$))"%-#1'%&&&&&&&!!!&&&&&23&$4#10$,'0$))"%-#1'%&,&%&%'(&&&&&&&&&&&&&&&&&&&&&&&!!!&&&&&5+*56)1"2)&51#6-1"7#5-)&51#6+28-"*%&+)&51#6$)+",)&&&&!!!&&&&&51#6,)1""-/))&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&!!!&&&&"132'&.+'/0$,'0$))"%-#1'%&+'/0$,'0$))"%-#1'%)(&&&&&&&&!!!&&&&*127-"%(&1$#''-"7"--))&9127-"%(&1$#''-"7"--))&&&&&&&&&!!!&&&&91#6-1"-."/)0(&1#6+28-"*%&1))&9'2'1-":;2'<8=:)&&&&&&&&!!!&&&&*1#6-1"-."/)2(&#7!1-"%)&1#6+28-"*%&1))&&&&&&&&&&&&&&&&!!!&&&&*'2'1-":;2'<.8=:)&1-!-7/")33)&+,%-5-"+>,)1)")!

Maurizio Pisati Exploratory spatial data analysis using Stata 82/91

Page 83: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation: example

LucasFulton

Geauga

Cuyahoga

Ottawa

Wood

Lorain SanduskyTrumbullErie

Defiance

Summit

PortageHuron

Medina

Seneca

Hancock

Mahoning

Ashland Crawford

Richland

WayneVan Wert

Wyandot

Stark

Columbiana

Allen

CarrollMorrow

MarionAuglaize

HolmesTuscarawas

Jefferson

Logan

Union

Shelby

Coshocton

Harrison

Darke

Licking

Champaign

Guernsey

Miami

Belmont

Muskingum

Franklin

Madison

Clark

NobleFairfield

Perry

Montgomery Preble

MonroeGreene

PickawayMorgan

Fayette

HockingWashington

WarrenButler

Clinton

Athens RossHighland

Hamilton

Clermont

Brown

Meigs

AshtabulaLake

WilliamsHenry

Paulding

Putnam

Hardin

MercerKnox

Delaware

Vinton

Jackson

Pike

Adams

Gallia

Scioto

Lawrence

-2

-1

0

1

2

3

Wz

-2 -1 0 1 2 3 4z

Maurizio Pisati Exploratory spatial data analysis using Stata 83/91

Page 84: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation: example

Step 4

We use spmap to draw the cluster map corresponding to the Moranscatterplot

!"#$"%#!"&"'(&"))*+,$-(+%.!/01%23).0(/,!43))*5/0$(,!65($2!%%"""%%%%/5#/5$%'-#,(+)5#.0/7.,$%8')-)*#9-.,%*,5%&'(%*,5$%%%%%%%%%"""%%%%-$9,-#:#:&'))*5$%;#;&'))*5$%-$9,-#0$#,$%')-)*#<+/(,$%%%%%"""%%%%%%!/=,#%&')$%!,-,'(#>,,"%/8%#!"&"'(&"))*+,$-(+*+'$$%%%%%%"""%%%%-,1,05#)88$!

Maurizio Pisati Exploratory spatial data analysis using Stata 84/91

Page 85: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

OverviewSpatial weights matricesMeasuring spatial autocorrelationGlobal indices of spatial autocorrelationLocal indices of spatial autocorrelation

Local indices of spatial autocorrelation: example

Williams

Henry

PauldingPutnam

HardinMercer

KnoxDelaware

Vinton

JacksonPike

Adams GalliaScioto

Lawrence

Maurizio Pisati Exploratory spatial data analysis using Stata 85/91

Page 86: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

References

Maurizio Pisati Exploratory spatial data analysis using Stata 86/91

Page 87: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

References I

• Anselin, L. 1995. Local indicators of spatial association – LISA.Geographical Analysis 27: 93–115.

• Bailey, T.C. and A.C. Gatrell. 1995. Interactive Spatial Data Analysis.Harlow: Longman.

• Getis, A. and J.K. Ord. 1992. The analysis of spatial association by use ofdistance statistics. Geographical Analysis 24: 189–206.

• Haining, R., Wise, S. and J. Ma. 1998. Exploratory spatial data analysis ina geographic information system environment. The Statistician 47: 457–469.

• Moran, P. 1948. The interpretation of statistical maps. Journal of the RoyalStatistical Society, Series B 10: 243–251.

• Pfeiffer, D., Robinson, T., Stevenson, M., Stevens, K., Rogers, D. and A.Clements. 2008. Spatial Analysis in Epidemiology. Oxford: OxfordUniversity Press.

• Pisati, M. 2001. sg162: Tools for spatial data analysis. Stata TechnicalBulletin 60: 21–37. In Stata Technical Bulletin Reprints, vol. 10, 277–298.College Station, TX: Stata Press.

Maurizio Pisati Exploratory spatial data analysis using Stata 87/91

Page 88: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

References II

• Slocum, T.A., McMaster, R.B., Kessler, F.C. and H.H. Howard. 2005.Thematic Cartography and Geographic Visualization. 2nd ed. Upper SaddleRiver, NJ: Pearson Prentice Hall.

• Tobler, W.R. 1970. A computer movie simulating urban growth in theDetroit region. Economic Geography 46: 234–240.

• Waller, L.A. and C.A. Gotway. 2004. Applied Spatial Statistics for PublicHealth Data. Hoboken, NJ: Wiley.

Maurizio Pisati Exploratory spatial data analysis using Stata 88/91

Page 89: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Appendix

Maurizio Pisati Exploratory spatial data analysis using Stata 89/91

Page 90: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Source code for genmsp I

program genmsp, sortpreserve

version 12.1

syntax varname, Weights(name) [Pvalue(real 0.05)]

unab Y : ‘varlist’

tempname W

matrix ‘W’ = ‘weights’

tempvar Z

qui summarize ‘Y’

qui generate ‘Z’ = (‘Y’ - r(mean)) / sqrt( r(Var) * ( (r(N)-1) / r(N) ) )

qui cap drop std_‘Y’

qui generate std_‘Y’ = ‘Z’

tempname z Wz

qui mkmat ‘Z’, matrix(‘z’)

matrix ‘Wz’ = ‘W’*‘z’

matrix colnames ‘Wz’ = Wstd_‘Y’

qui cap drop Wstd_‘Y’

qui svmat ‘Wz’, names(col)

qui spatlsa ‘Y’, w(‘W’) moran

tempname M

matrix ‘M’ = r(Moran)

matrix colnames ‘M’ = __c1 __c2 __c3 zval_‘Y’ pval_‘Y’

Maurizio Pisati Exploratory spatial data analysis using Stata 90/91

Page 91: Exploratory spatial data analysis using Stata - DPCtypo3.dpc.de/fileadmin/templates/dpc-statistik/doc_pdf/stata_user... · Introduction Spatial data Visualizing spatial data Exploring

IntroductionSpatial data

Visualizing spatial dataExploring spatial point patternsDetecting spatial autocorrelation

References

Source code for genmsp II

qui cap drop __c1 __c2 __c3

qui cap drop zval_‘Y’

qui cap drop pval_‘Y’

qui svmat ‘M’, names(col)

qui cap drop __c1 __c2 __c3

qui cap drop msp_‘Y’

qui generate msp_‘Y’ = .

qui replace msp_‘Y’ = 1 if std_‘Y’<0 & Wstd_‘Y’<0 & pval_‘Y’<‘pvalue’

qui replace msp_‘Y’ = 2 if std_‘Y’<0 & Wstd_‘Y’>0 & pval_‘Y’<‘pvalue’

qui replace msp_‘Y’ = 3 if std_‘Y’>0 & Wstd_‘Y’<0 & pval_‘Y’<‘pvalue’

qui replace msp_‘Y’ = 4 if std_‘Y’>0 & Wstd_‘Y’>0 & pval_‘Y’<‘pvalue’

lab def __msp 1 "Low-Low" 2 "Low-High" 3 "High-Low" 4 "High-High", modify

lab val msp_‘Y’ __msp

end

exit

Maurizio Pisati Exploratory spatial data analysis using Stata 91/91