Writeup_SleuthThreshold

Thresholding SLEUTH Development Model Outputs for 2030

Audrey Archer August 24, 2015

Objective

The purpose of this analysis is to create a binary map product of development in North Carolina for the year 2030 using SLEUTH development model outputs. Because the SLEUTH data contain probabilities of conversion, a threshold value must be determined. Thresholds were based on the amount of development that occurred between 2006 and 2011 according to National Land Cover Database (NLCD) change products. Intersecting this product (summarized at the NHD+ catchment level) with a layer of high conservation value catchments (as determined in a previous analysis) will identify those catchments that warrant the state’s protection. The presumption is that a catchment with high conservation value and a high risk of development should have greater priority than a catchment with high conservation value that is unlikely to be developed in the near future.

Workspace

The workspace for this analysis is contained in the Project_02_SLEUTH_Dev folder. This workspace includes a folder (SLEUTH_Dev_Pre-Run) that contains the final outputs produced when the analysis is run.


Data Sources

There are two primary data sources for this analysis. The first is SLEUTH-modeled urban growth in the Southeast by 2020 and 2030 (Belyea and Terando 2014). These datasets contain probabilities of rural-to-urban land conversion at a 60 meter resolution. They are saved in \GISData\Inputs\sleuth2020 and \GISData\Inputs\sleuth2030. The second data source is the National Land Cover Database (NLCD) 2006 to 2011 land cover change product (Homer et al. 2015). This dataset contains only those pixels identified as changed between 2006 and 2011 at a 30 meter resolution. It can be found in EEP_GIS_Workspace\Data\NLCD data.sde\USdata.DBO.NLCD2006_LANDCOVER_FROMTO_CHANGE_INDEX_5_2_11.

There are two secondary datasets for this analysis: the boundaries of 8-digit HUC watersheds (USDA-NRCS 2013) and NHD+ catchments (USEPA and USGS 2012). These were used to summarize raster values from the two primary datasets. They are saved as \GISData\Inputs\HUC8_NC.shp and \GISData\Inputs\NHDCatch_NC.shp.

Methods

In order to derive a binary map of development risk using SLEUTH data, a threshold must be applied. The steps taken to determine threshold values are as follows:

Step 1: Calculate the amount of observed development for each 8-digit HUC in a 10-yr period using NLCD change products.

Step 2: Determine the likelihood of development threshold for the 2020 SLEUTH projection that yields approximately the same amount of observed urbanization for each 8-digit HUC.

Step 3: Apply the threshold from Step 2 to the 2030 SLEUTH projection.

Step 4: Calculate summary statistics at the NHD+ catchment level.

Step 1: Calculate Observed Development

Thresholds were based on the amount of development that occurred between 2006 and 2011 for each 8-digit HUC. The original data (NLCD change product) includes pixels that converted from one land cover type to any other. Thus, the data set was modified to contain only those pixels that were non-urban in 2006 and urban in 2011. These pixels were summed for each 8-digit HUC watershed. Because the SLEUTH datasets are in 10 year increments, the converted pixel counts were doubled to approximate the amount of observed converted area in 10 years. The steps taken to calculate observed development are contained in the model \GISData\Sleuth.tbx\Step01 Calculate Observed Development (Appendix A.1: Calculate Observed Development). They are as follows:

• Extract the NLCD change data to the state of North Carolina.
• Extract pixels that were non-urban in 2006 and urban in 2011. (Note: urban pixels are values 21 - 24.)
  o Set null pixels that were not urban in 2011; false values = 1
  o Set null pixels that were urban in 2006; false values = 1
• Use Raster Calculator to add both Set Null outputs. Those with values of 2 comprise the observed development data set.
• Use Zonal Statistics as Table to calculate the sum of “observed development” pixels per HUC8. By calculating the sum, the amount of observed development is effectively doubled, since each pixel has a value of 2.


• Join observed development pixels to the HUC8 shapefile.
• Select only those HUC8 that contain observed development pixels. This step is necessary because the Python script that follows requires that only HUCs with values be processed. The output of this model is saved as \GISData\Outputs\HUC8_ObsDev.shp.
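The Set Null and sum logic in the steps above can be illustrated with a minimal pure-Python sketch. It assumes the 2006 and 2011 land cover classes are available as two equally sized grids for one HUC8; the actual NLCD product encodes from-to change as an index, and the model itself uses ArcGIS tools. The function name, variable names, and sample grids are hypothetical.

```python
# Illustrative sketch only: plain Python lists stand in for rasters.
URBAN_CLASSES = {21, 22, 23, 24}  # NLCD urban values noted in the steps above


def observed_development(lc_2006, lc_2011):
    """Sum of the two Set Null outputs over pixels non-urban in 2006 and urban in 2011."""
    total = 0
    for row_06, row_11 in zip(lc_2006, lc_2011):
        for c06, c11 in zip(row_06, row_11):
            # Set Null: pixels failing a condition become None, others get 1.
            not_urban_2006 = None if c06 in URBAN_CLASSES else 1
            urban_2011 = 1 if c11 in URBAN_CLASSES else None
            if not_urban_2006 is None or urban_2011 is None:
                continue
            # Each converted pixel carries a value of 2, so the sum is
            # twice the pixel count (the 5-year change doubled to 10 years).
            total += not_urban_2006 + urban_2011
    return total


# Hypothetical land cover grids for one HUC8.
lc_2006 = [[41, 82, 21], [11, 71, 22]]
lc_2011 = [[21, 82, 21], [11, 23, 24]]
print(observed_development(lc_2006, lc_2011))
```

In this sample, two pixels convert from non-urban to urban, so the doubled sum is 4.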

Step 2: Determining Likelihood of Development Threshold for SLEUTH 2020

Thresholds were determined as the likelihood value of the 2020 SLEUTH projection data that most closely produces the amount of urbanization that was observed to occur in a 10 year period. This was accomplished by calculating a running total of pixels classified as developed when thresholded at each likelihood value from < 2.5% to > 97.5%. The “existing urban” (value = 1) pixels were not included in this calculation, since new development is the primary concern for this analysis. The optimal threshold value is identified as the one with the least difference between the running total for each likelihood value and the number of observed conversion pixels. Figure 1 is a diagram of this method. According to the figure, the most appropriate likelihood value to use as a threshold for HUC 06010105 is 70% (highlighted in red), because it yields a number of developed pixels in 2020 that is most similar to the number in the observed dataset (~ 20,000 pixels).

Figure 1. Graphic of process for determining the threshold for HUC 06010105 based on observed development. [Chart: x-axis, SLEUTH Development Likelihood (%), from < 2.5 to > 97.5; y-axis, Number of Pixels, from 0 to 40,000; series, Cumulative Development (2020) and Observed Development.]

The steps taken to determine the threshold are contained in the script \Scripts\Steps02_03_SleuthThreshold.py (Appendix B.1: Step 2: Determining Likelihood of Development Threshold). If you wish to run this script, it is critical that you change the workspace to the location of the Project_02_SLEUTH_Dev folder (line 13 of the script). Below is an outline of the steps in the script:

• Split Sleuth 2020 rasters for each HUC8 in \GISData\Outputs\HUC8_ObsDev.shp (created in the previous step).
• Use Copy Rows to create an attribute table for each raster.
• Delete rows with a value of 1 (existing urban).
• Add field “Count_30m” of the count divided by 4, because SLEUTH pixels are four times bigger than NLCD pixels.
• Sort the table in descending order of “Count_30m.”
• Calculate a running total of “Count_30m” and set values to “RunSum” field.
• Add field “ObsDev” with the observed development sum for corresponding HUC8.
• Add field “Diff”, which is calculated as the difference between “RunSum” and “ObsDev.”
• Select the SLEUTH likelihood with the minimum value in the “Diff” field, which will be the threshold for the corresponding HUC8.

  o Sort “Diff” in ascending order and Get Value of the first record.
• Create a dictionary of HUC8 codes as the key and thresholds as the value.
• Create a .csv file of the dictionary keys and values. Because the HUC codes have a leading zero that must be retained in order to join it to the HUC8 shapefile, a few additional steps are needed:
  o Open a blank Excel document.
  o Navigate to Data > Get External Data > From Text.
  o Set Delimiters to Tab and Comma.
  o Set the column data format for the first column (HUC8) to Text. Leave the second column (Thresh) as General.
  o Click Finish.
  o Save this as an Excel document: \Spreadsheets\HUC8_Thresh_leading0.xlsx.
• Join (from Joins and Relates) the HUC8_Thresh_leading0.xlsx file to \GISData\Inputs\HUC8_NC.shp.
  o Note: The Join Field (Data Management) tool is unreliable. It is best to join from Joins and Relates.
• Export (and replace) the joined shapefile as \GISData\Outputs\HUC8_Thresholds.shp.
  o Note: Those HUCs with a Threshold value of 0 did not contain any “observed development” pixels.
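The running-total search outlined in the steps above can be sketched in plain Python. This is a minimal illustration, not the script in Appendix B.1. It assumes the per-likelihood pixel counts are already expressed in the same pixel units as the observed development count, accumulates from the highest likelihood class downward (so each running total is the number of pixels at or above that likelihood), and uses the absolute difference to find the closest match. The function name and the sample counts are hypothetical.

```python
def pick_threshold(counts_by_likelihood, observed):
    """Return the likelihood whose cumulative pixel count is closest to observed.

    counts_by_likelihood maps a likelihood value (%) to the number of
    new-development pixels at that value, existing-urban pixels excluded.
    """
    running_total = 0
    best_likelihood, best_diff = None, None
    for likelihood in sorted(counts_by_likelihood, reverse=True):
        running_total += counts_by_likelihood[likelihood]  # the "RunSum" value
        diff = abs(running_total - observed)               # the "Diff" value
        if best_diff is None or diff < best_diff:
            best_likelihood, best_diff = likelihood, diff
    return best_likelihood


# Hypothetical counts for one HUC8 (not taken from the analysis).
counts = {95: 4000, 90: 3000, 80: 6000, 70: 6500, 60: 8000, 50: 9000}

# HUC8 codes are kept as strings so the leading zero is retained.
thresholds = {"06010105": pick_threshold(counts, observed=20000)}
print(thresholds)
```

With these sample counts, the running total reaches 19,500 at a likelihood of 70%, which is the closest to the observed 20,000, so 70 is selected.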

Step 3: Applying the Threshold to SLEUTH 2030

The decision to apply a likelihood of development threshold that is appropriate in 2020 to a 2030 dataset is a conservative one; it applies the same rate of observed urbanization (2006 – 2011, multiplied by 2) to the following decade, rather than accounting for exponential growth rates.

The steps taken to apply the threshold are contained in the script \Scripts\Steps02_03_SleuthThreshold.py (Appendix B.2: Step 3: Applying the Threshold). Note: this portion of the script is commented out with triple quotes; these must be deleted once ready to complete Step 3. Below is an outline of the steps in the script:

• Split Sleuth 2030 rasters for each HUC8. These are put in a new folder called \Scratch\HUC8_Sleuth30.
• Using the Threshold Dictionary, use Con to set all pixels with a value >= the threshold to 1 (at risk) and false values to 0 (not at risk).
• Mosaic to New Raster in order to merge all of the HUC8 Con rasters together.
• Build an attribute table for the mosaicked dataset.

The result of this process is the binary map product: \GISData\Outputs\ThreshDev_2030_NC.img.
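The Con step can be illustrated with a small pure-Python sketch, assuming each HUC8's 2030 likelihood raster is represented as a nested list with None standing in for NoData. The function name, HUC8 keys, thresholds, and grid values are hypothetical, and the mosaic step is omitted.

```python
def apply_threshold(likelihood_grid, threshold):
    """Con-style reclassification: 1 where value >= threshold, else 0.

    NoData cells (None) are passed through unchanged.
    """
    return [
        [None if v is None else (1 if v >= threshold else 0) for v in row]
        for row in likelihood_grid
    ]


# Hypothetical thresholds and 2030 likelihood grids keyed by HUC8 code.
thresholds = {"06010105": 70, "03020201": 5}
grids_2030 = {
    "06010105": [[None, 50, 70], [95, 2.5, 80]],
    "03020201": [[5, None], [2.5, 60]],
}

binary = {
    huc: apply_threshold(grid, thresholds[huc])
    for huc, grid in grids_2030.items()
}
print(binary["06010105"])
```

Because each HUC8 is reclassified with its own threshold, the same likelihood value (for example 50) can be at risk in one watershed and not at risk in another.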


Step 4: Calculate Summary Statistics at the Catchment Level

Summarizing the binary map of development risk at the catchment level is not as simple as calculating the percent area of a catchment that is at risk, because catchments are generally not 100% developable. For instance, a catchment that is entirely composed of water or existing urban land should not be predicted to be developed in 2030. Instead, the best way to summarize this data is to divide the number of pixels at risk of development (i.e., above the threshold) by the total number of developable pixels in the catchment. Because it is unknown what mask SLEUTH used to consider a pixel developable, the number of developable pixels is approximated as the total number of SLEUTH pixels minus the pixels classified as existing urban (Value = 1) (Equation 1). It is possible (and likely) that this is not the same as what SLEUTH considered to be physically developable; a majority of developable pixels may have had such a low probability of being developed that they were classified as No Data.

$$\text{Proportion of Catchment's Developable Area Developed} = \frac{\text{\# Pixels Above Threshold}}{\text{\# SLEUTH Pixels} - \text{\# Existing Urban Pixels}}$$

Equation 1

The steps taken to calculate the catchment statistics are contained in the model \GISData\Sleuth.tbx\Step04 Threshold Catchment Stats (Appendix A.2: Calculate Threshold Statistics). There are three strings of analysis to obtain values for the numerator, the denominator, and the ratio of the two. They are as follows:

Numerator

• Use Zonal Statistics to calculate the sum (or count, because all values are 1) of the mosaicked, thresholded dataset for each catchment.
• Copy the zonal statistics output (“SUM”) to another column with a meaningful field name (“ThreshSum”).

Denominator

• Set Null those pixels classified as existing urban in the Sleuth 2030 dataset.
• Use Zonal Statistics to calculate the sum of this dataset.
• Copy the output to another column with a meaningful name (“TotSum”).

Numerator : Denominator

• Join these two fields (numerator and denominator) to a copy of NHDCatch_NC.shp.
• Use Select to isolate those catchments with a denominator value of 0. This shapefile is useful for mapping purposes only.
• Use Select to isolate those catchments with a denominator value greater than 0.
• Calculate Field as the ratio of numerator to denominator.
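The ratio in Equation 1, including the separation of catchments with a denominator of 0, can be sketched as follows. This is an illustration under assumptions rather than the Step04 model: the numerator and denominator are taken as already computed zonal sums per catchment, and the function name, catchment IDs, and values are hypothetical.

```python
def developable_proportion(thresh_sum, tot_sum):
    """Ratio from Equation 1, or None when the catchment has no developable pixels."""
    if tot_sum <= 0:
        return None  # such catchments are kept for mapping purposes only
    return thresh_sum / tot_sum


# Hypothetical catchment IDs with (ThreshSum, TotSum) zonal sums.
catchments = {"A": (30, 120), "B": (0, 0), "C": (45, 45)}

ratios = {
    cid: developable_proportion(thresh_sum, tot_sum)
    for cid, (thresh_sum, tot_sum) in catchments.items()
}
print(ratios)
```

Returning None for a zero denominator mirrors the Select steps above, which keep catchments without developable pixels out of the ratio calculation.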


Results

In the first step of this analysis, approximately 490 km² of observed development (i.e., area that was non-urban in 2006 and urban in 2011) was found. This number was doubled to account for a 10 year time span.

Out of the 58 HUC 8 watersheds, only 45 contained observed development pixels. Unsurprisingly, there is a great deal of variation in the amount of observed development among these watersheds, ranging from only 4 pixels to nearly 200,000 (cell resolution = 30 m). There is an average of approximately 24,000 pixels (21.6 km²) per HUC8.
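As a check on the units, assuming the observed development pixels are 30 m NLCD cells (the resolution given under Data Sources), the average pixel count converts to area as shown below. The second line compares this with the doubled statewide total divided across the 45 watersheds; both figures are approximate.

```latex
24{,}000 \times (30\ \text{m})^2 = 21{,}600{,}000\ \text{m}^2 = 21.6\ \text{km}^2

\frac{2 \times 490\ \text{km}^2}{45} \approx 21.8\ \text{km}^2
```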

A majority of the HUC 8 watersheds (23) had a very low threshold, with a SLEUTH predicted development value of < 2.5%. In contrast, the second most frequent HUC 8 watershed threshold was > 97.5% (Figure 2). HUC 8 watersheds were more evenly spread across the intermediate threshold values. A low threshold means that less confidence that a given pixel will be developed is needed for it to be classified as at risk of development in the binary map product, whereas a higher threshold means that more confidence is needed. Thus, those watersheds with greater observed development (typically near existing cities) are also the ones that have a lower threshold (Figure 3).

After applying the appropriate threshold to the 2030 SLEUTH dataset, a total of 975 km² had a SLEUTH likelihood of development value that exceeded the threshold and is thus considered at risk of development in this analysis. The remaining 1361 km² had likelihood values below the threshold (Figure 4). In a third of the catchments, 100% of the area with potential for development is at risk of development. These catchments correspond with those HUC 8 watersheds with lower thresholds (Figure 5). In twenty percent of the catchments, 50% of the area with potential for development is at risk of development.

Figure 2. Frequency distribution of thresholds for HUC8 watersheds.

[Chart: x-axis, Likelihood of Development Threshold (%); y-axis, Number of HUC8. Bar values in axis order: < 2.5: 23; 5: 0; 10: 1; 20: 2; 30: 1; 40: 2; 50: 1; 60: 1; 70: 4; 80: 2; 90: 0; 97.5: 1; > 97.5: 7.]

Figure 3. Development Likelihood Thresholds for HUC8 watersheds.


Figure 4. SLEUTH 2030 data when thresholded by corresponding HUC8 threshold.


Figure 5. Binary map summarized at the NHD+ catchment level. Percent developable area that is classified as developed when threshold is applied to SLEUTH 2030 data.


Sources Cited

Belyea, Curtis M. and Terando, Adam J. (2014). Southeast Regional Assessment Project; Biodiversity and Spatial Information Center, North Carolina State University. Atlantic Coast Joint Venture; USGS Cooperative Fish & Wildlife Research Units of North Carolina and Alabama; Association for Fish & Wildlife Agencies; USGS Gap Analysis Program; USGS Patuxent Research Lab. Retrieved from http://salcc.databasin.org/datasets/e5860ced8b4844e88431cdbefe425e1a

U.S. Department of Agriculture-Natural Resources Conservation Service (USDA-NRCS), United States Geological Survey (USGS), and the Environmental Protection Agency (EPA) (2013). Watershed Boundary Dataset for North Carolina. Retrieved from http://datagateway.nrcs.usda.gov.

U.S. Geological Survey (USGS) and the U.S. Environmental Protection Agency (USEPA) (2012). National Hydrography Dataset (NHD) Medium Resolution. Retrieved from http://www.horizon-systems.com/NHDPlus/NHDPlusV2_data.php.

Homer, C.G., Dewitz, J.A., Yang, L., Jin, S., Danielson, P., Xian, G., Coulston, J., Herold, N.D., Wickham, J.D., and Megown, K. (2015). Completion of the 2011 National Land Cover Database for the conterminous United States-Representing a decade of land cover change information. Photogrammetric Engineering and Remote Sensing, v. 81, no. 5, p. 345-354.


Appendices

Appendix A: GIS Models

Appendix A.1: Calculate Observed Development


Appendix A.2: Calculate Threshold Statistics


Appendix B: Python Script

Appendix B.1: Step 2: Determining Likelihood of Development Threshold


Appendix B.2: Step 3: Applying the Threshold


Appendix C.1: Slideshow Presentation