Presented by Keith Wurtz Senior Research Analyst Chaffey Community College Keith.wurtz@chaffey

Using Census Data, District Data, with GIS, SPSS, and Answer Tree to Identify possible populations to market to, and increase enrollments Presented by Keith Wurtz Senior Research Analyst Chaffey Community College [email protected]


Page 1:

Using Census Data, District Data, with GIS, SPSS, and Answer Tree to Identify possible populations to market to, and increase enrollments

Presented by Keith Wurtz

Senior Research Analyst

Chaffey Community College

[email protected]

Page 2:

Introduction

• How to Create a District Map?

• How to Merge Census Data into a GIS Map?

• Using Census Data and District Data to Identify possible populations to market to, and increase enrollments

Page 3:

[Map: Chaffey College District, with labels for Ontario, Fontana, Chino Hills, Chino, Upland, Rancho Cucamonga, Montclair, and San Antonio Heights. Scale bar in miles. Prepared by Keith Wurtz. Date: 20060406.]

Page 4:

How do I create a map of my District in GIS?

• Open ArcMap
• Click on the “+” sign to add data
  – The data that you want to add first is in the form of shape files
  – Shape files are the type of files that GIS uses to create maps
  – Since Chaffey’s District is in San Bernardino County (#71), I am going to start with shape files from that county
  – Shape File Types
    • BLK – Data by Census Blocks
    • GRP – Data by Block Groups
    • TRT – Census Tract
    • ZCTA – Zip Code
    • Place – Cities
    • CTY – County
    • LKA – Streets
  – To create the map on the previous slide, I am going to choose the Place file, or data by city (i.e. tgr06071place00.shp), and the ZCTA file, or zip code data (i.e. tgr06071zcta5cu.shp)
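The shape file names on this slide appear to follow one pattern: the prefix tgr, a two-digit state code (06), the three-digit county code (071), and a layer suffix. The sketch below splits a file name into those parts so that a folder of downloads can be sorted by layer. The pattern is inferred only from the file names shown in this presentation, and the function name parse_tiger_name is hypothetical.

```python
import re
from typing import NamedTuple


class TigerName(NamedTuple):
    state: str   # two-digit state code, e.g. "06"
    county: str  # three-digit county code, e.g. "071"
    layer: str   # layer suffix, e.g. "place00" or "zcta5cu"


# Assumed pattern: "tgr" + 2-digit state + 3-digit county + layer suffix + ".shp".
_PATTERN = re.compile(r"^tgr(\d{2})(\d{3})([a-z0-9]+)\.shp$", re.IGNORECASE)


def parse_tiger_name(filename: str) -> TigerName:
    """Split a shape file name into state code, county code, and layer suffix."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a recognized shape file name: {filename!r}")
    state, county, layer = match.groups()
    return TigerName(state, county, layer.lower())


for name in ("tgr06071place00.shp", "tgr06071zcta5cu.shp"):
    print(name, parse_tiger_name(name))
```

A name that does not fit the assumed pattern raises ValueError rather than being guessed at.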

Page 5:

Creating District Map (Continued)

• Once the zip code and city shape files have been inserted, you can see where the zip codes and the cities are in your county
  – Next, double click on the shaded rectangles under Layers and choose “Hollow” and “OK” for both the zip code and city layers
  – I am interested in the southwest portion of the county, where our District is located
  – Note. You can show or hide a layer by checking or un-checking its box
  – To zoom in, click on the magnifying glass and highlight this portion of the county
• Double click on the place shape file and choose Labels
  – Check “Label features in this layer” and choose “OK”
  – Now you can see each city in the county, including the cities in Chaffey’s District

Page 6:

Creating District Map (Continued)

• Select only the cities in Chaffey’s District
  – Click on the black arrow (i.e. the “Select Features” icon)
  – Hold the control key down and click on each city
  – Right mouse click on the place shape file and choose “Selection” and “Create Layer from Selected Features”
  – Un-check the place shape file
  – Turn on the label features on the District Layer that you just created and make it Hollow
• Double click on the Chaffey District Layer that you created and change the font color of the city names
• Double click on the Chaffey District rectangle under Layers and change the line color

Page 7:

Creating District Map (Continued)

• Click on View and then Layout View
  – Go to View and then Data Frame Properties
    • Choose Frame
    • Click on Color and choose No Color
  – Insert a Title by clicking Insert and Title
  – Insert Text by clicking Insert and Text
  – Insert Scale Bar by clicking Insert and Scale Bar
    • Notice that the scale is in decimal degrees
    • To change this, double click on it and under the Scale and Units Tab choose Division Units and then choose Miles

Page 8:

[Map: 2000 US Census Population Data in the Chaffey College District, with labels for Ontario, Fontana, Chino Hills, Chino, Upland, Rancho Cucamonga, Montclair, and San Antonio Heights. Scale bar in miles. Prepared by Keith Wurtz. Date: 20060406.]

Legend: 2000 US Census Population Data (Census2-SF1.TOTPOP), 10 classes
25 - 496; 497 - 817; 818 - 1088; 1089 - 1371; 1372 - 1713; 1714 - 2259; 2260 - 3025; 3026 - 4552; 4553 - 7658; 7659 - 11889

Page 9:

Inserting and Using Census Data into District Map

• Obtaining US Census Data
  – Go to the following: http://www.census.gov/
  – Click on “American Fact Finder”
  – Go to “Data Sets” and click on “Decennial Census” (Note: American Community Survey)
  – Click on “Detailed Tables” under SF 1
  – Click on “geo within geo”
    • Under “Show me all” click on “Block Groups”
    • Under “Within” click on “County”
    • Under “Select a State” click on “CA”
    • Under “Select a County” click on “San Bernardino”
    • Under “Select one or more…” click on “All Block Groups” and click on “Add”
  – Click “Next”
  – Under “Select one or more…” click on “P1. Total Population” and click on “Add” (Note. You can choose more than one and it will still work)
  – Click on “Show Result”

Page 10:

File Downloaded from the Census Bureau

• Click on “Print/Download” and click on “Download”
  – Choose Excel, hold down the control key, and click “OK”
  – Click on “Open” and then open the Excel file with “…data…” in the name. It is usually the largest file, but not always.
• Change the GEOGRAPHY_ field name to JOINID. This is the field that is going to be matched with the STFID field in the shape file
• Save this file as a dbf file.

Page 11:

Merging or Joining in GIS

• Need to insert the block groups data for San Bernardino County
  – Click on the “+” sign and insert the tgr06071grp00.shp file
• Right click on “Block Groups” in GIS, because this is the level at which I downloaded the Census data
  – Go to “Joins and Relates” and click on “Join”
  – Make sure that the join is set to join attributes from a table
  – In Number 1 choose STFID
  – In Number 2 click on the folder and find the dbf file that you created: RPPop.dbf
  – In Number 3 choose the JOINID field, click OK, and then click “Yes”
• Check to see that the “Join” worked by right clicking on block groups, clicking on “Open Attribute Table,” and scrolling to the right to see if the total population field is there
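The join described on this slide matches each block group's STFID to the JOINID field of the downloaded table. Outside of ArcMap, the same check can be sketched in plain Python. The function name join_on_key, the sample IDs, and the TOTPOP values below are all hypothetical. A row with no match gets None, which stands in for a “Null” in the attribute table.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]


def join_on_key(
    layer_rows: List[Row],
    table_rows: List[Row],
    layer_key: str = "STFID",
    table_key: str = "JOINID",
) -> Tuple[List[Row], List[str]]:
    """Left-join table_rows onto layer_rows and report unmatched layer keys."""
    # Index the census table by its join field, ignoring stray whitespace.
    lookup = {str(row[table_key]).strip(): row for row in table_rows}
    # Every non-key column, so unmatched rows still receive the same fields.
    extra_fields = sorted(
        {name for row in table_rows for name in row if name != table_key}
    )

    joined: List[Row] = []
    unmatched: List[str] = []
    for layer_row in layer_rows:
        key = str(layer_row[layer_key]).strip()
        match = lookup.get(key)
        merged: Row = dict(layer_row)
        for name in extra_fields:
            # None plays the role of a "Null" value in the attribute table.
            merged[name] = match.get(name) if match is not None else None
        if match is None:
            unmatched.append(key)
        joined.append(merged)
    return joined, unmatched


# Hypothetical rows: the IDs and population values are made up for illustration.
block_groups = [
    {"STFID": "060710001001"},
    {"STFID": "060710001002"},
    {"STFID": "060710002001"},
]
census_table = [
    {"JOINID": "060710001001", "TOTPOP": "1200"},
    {"JOINID": "060710001002 ", "TOTPOP": "850"},
]

joined_rows, missing = join_on_key(block_groups, census_table)
print("Block groups with no census match:", missing)
```

Listing the unmatched keys is quicker than scrolling through the attribute table, and it shows exactly which block groups need a closer look.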

Page 12:

Snapshot of Block Group Data that includes Total Population

• In this case there is a “Null” value in the top row.

• This indicates that the join did not work for this block group

• San Bernardino County is one of the few counties in the country that has an error in one of its block groups

Page 13:

Incorporating US Census Data (Continued)

• Selecting Block Groups in Chaffey’s District
  – Right mouse click on the block group layer and open the attribute table
  – Click on the Select Features icon
  – Make sure that the block groups rectangle is checked so that you can see the block groups
  – Click to the left of the map and select the district
    • Once you have selected most of the district, it is best to open the attribute table to make sure that all of the block groups have been selected
    • Right mouse click on the block group layer and open the attribute table
    • Hold the control key down and select the block groups by clicking on the row
  – Right mouse click on the block group layer and choose “Selection” and choose “Create Layer from Selected Features”
  – Uncheck the initial block group layer

Page 14:

Incorporating US Census Data (Continued)

• Displaying the US Population Data
  – Double click on the Layer that we just created
  – Click on Symbology
  – Click on Quantities
  – In the Value Field choose the total population field and change classes to 10 instead of 5
  – Click OK

Page 15:

Using Census Data and District Data to Identify possible populations to market to and increase enrollments

Page 16:

Participation Rates of 2000 – 2001 Chaffey Students by Age

Age              #         N          %
18 – 19 Years    4,018     21,968     18.3
20 – 24 Years    6,066     50,091     12.1
25 – 29 Years    2,395     50,686     4.7
30 – 34 Years    1,752     57,004     3.1
35 – 39 Years    1,605     64,342     2.5
40 – 49 Years    2,304     107,749    2.1
50 – 65 Years    1,071     78,606     1.4
Total            19,211    430,446    4.5

Note. # refers to the number of students attending Chaffey in the 2000 – 2001 academic year. N refers to the population living in the Chaffey College District taken from the 2000 US Census.
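Each participation rate in this table is the number of students (#) divided by the district population (N), times 100. As a minimal sketch, the arithmetic can be reproduced from the counts on this slide. The function and variable names are illustrative only.

```python
from typing import List, Tuple

# (age group, students "#", district population "N"), copied from the slide.
AGE_ROWS: List[Tuple[str, int, int]] = [
    ("18-19", 4018, 21968),
    ("20-24", 6066, 50091),
    ("25-29", 2395, 50686),
    ("30-34", 1752, 57004),
    ("35-39", 1605, 64342),
    ("40-49", 2304, 107749),
    ("50-65", 1071, 78606),
]


def participation_rate(students: int, population: int) -> float:
    """Students per 100 residents in the same age group."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * students / population


# Rates per age group, rounded to one decimal as on the slide.
rates = {age: round(participation_rate(s, n), 1) for age, s, n in AGE_ROWS}

# The Total row is recomputed from the column sums, not averaged across rows.
total_students = sum(s for _, s, _ in AGE_ROWS)
total_population = sum(n for _, _, n in AGE_ROWS)
total_rate = round(participation_rate(total_students, total_population), 1)

for age, rate in rates.items():
    print(f"{age:>6} years: {rate:.1f}%")
print(f" Total: {total_rate:.1f}%")
```

The total rate is calculated from the summed counts because a simple average of the seven rates would ignore how much larger some age groups are than others.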

Page 17:

Marketing to 40 – 49 Year Olds

• US Census Data allows us to identify the number of 40 – 49 year olds living in each block group

• We can use the mapping software to identify where 40 – 49 year olds live

• Once we know where they live, we can use segmentation modeling (i.e. answer tree or classification tree) to identify enrollment characteristics of these students and then market to them

Page 18:

Segmentation Modeling

• According to Borges and Cherpitel (2001), segmentation modeling (i.e. classification tree modeling) is based on the principle of binary recursive partitioning. In binary recursive partitioning, the values of the dependent variable (i.e. success and non-success) are examined for all possible splits of the data at each step of the tree-building process, to find the split that most effectively separates the dependent variable into homogeneous groups. This continues until no further split is possible (Borges and Cherpitel, 2001). The model attempts to maximize the number of students who are correctly classified as successes and the number who are correctly classified as non-successes.
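To make "the split that most effectively separates the dependent variable" concrete, the sketch below scores one candidate binary split with a Pearson chi-square statistic on a 2 x 2 table. Using chi-square as the criterion is an assumption, based on the Chi-square values that the tree output in this presentation reports for each split. The sketch does not implement the full tree-building procedure, and the function name is illustrative. The counts are those reported for Nodes 1 and 2 in the tree output.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (df=1, no continuity correction) for the table

                    all other ages   40-49 year olds
        branch 1          a                b
        branch 2          c                d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    denominator = row1 * row2 * col1 * col2
    if denominator == 0:
        raise ValueError("every row and column must contain at least one case")
    return n * (a * d - b * c) ** 2 / denominator


# Counts reported for the two branches of the first split (Nodes 1 and 2).
node1_other, node1_target = 12245, 2322
node2_other, node2_target = 11232, 722

statistic = chi_square_2x2(node1_other, node1_target, node2_other, node2_target)
print(f"Chi-square for the first split: {statistic:.4f}")
```

A tree-building procedure of this kind would compute such a statistic for every candidate variable at a node and split on the strongest one, then repeat within each branch.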

Page 19:

Enrollment Variables used in Segmentation Model

• Used MIS to identify enrollment characteristics
  – Transfer course enrollment
  – Basic skills course enrollment
  – Occupational course enrollment
  – Credit course enrollment
  – School
  – Location of course
  – Term
• Created a field for each one that holds the number of enrollments aggregated by student
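The aggregation step above turns one row per enrollment into one row per student, with a count for each characteristic. A minimal sketch of that idea follows. The record layout, the field names (student_id, transfer, basic_skills, credit), and the sample rows are hypothetical, not the actual MIS layout.

```python
from typing import Dict, Iterable, List, Mapping


def count_enrollments_by_student(
    records: Iterable[Mapping[str, object]],
    flag_fields: List[str],
) -> Dict[str, Dict[str, int]]:
    """Count, per student, how many enrollment records have each flag set."""
    totals: Dict[str, Dict[str, int]] = {}
    for record in records:
        student_id = str(record["student_id"])
        # Start every student at zero so "did not enroll" is explicit.
        counts = totals.setdefault(student_id, {field: 0 for field in flag_fields})
        for field in flag_fields:
            if record.get(field):
                counts[field] += 1
    return totals


# Hypothetical enrollment records: one row per course enrollment.
enrollments = [
    {"student_id": "S1", "transfer": True, "basic_skills": False, "credit": True},
    {"student_id": "S1", "transfer": False, "basic_skills": True, "credit": True},
    {"student_id": "S2", "transfer": False, "basic_skills": False, "credit": False},
]

per_student = count_enrollments_by_student(
    enrollments, ["transfer", "basic_skills", "credit"]
)
print(per_student)
```

Keeping zero counts matters here because the tree splits shown later compare students who did not enroll against those who did.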

Page 20:

[Classification tree diagram. Dependent variable: Age Dichotomous - 40 - 49 year olds and other. Each split has two branches, labeled “<=Did Not Enroll” and “>Did Not Enroll”.]

Splits
• Node 0 splits into Nodes 1 and 2 on Number of Enrollments in PE Courses (Adj. P-value=0.0000, Chi-square=633.4093, df=1)
  – Node 1 splits into Nodes 3 and 4 on Number of Enrollments in LIB Courses (Adj. P-value=0.0000, Chi-square=285.0683, df=1)
    • Node 3 splits into Nodes 7 and 8 on Number of Enrollments in Credit Courses (Adj. P-value=0.0000, Chi-square=94.2465, df=1)
    • Node 4 splits into Nodes 9 and 10 on Number of Enrollments in SU00 (Adj. P-value=0.0000, Chi-square=52.3692, df=1)
  – Node 2 splits into Nodes 5 and 6 on Number of Enrollments in HS Courses (Adj. P-value=0.0000, Chi-square=90.0277, df=1)
    • Node 5 splits into Nodes 11 and 12 on Number of Enrollments in SSS Courses (Adj. P-value=0.0000, Chi-square=32.2187, df=1)
    • Node 6 splits into Nodes 13 and 14 on Number of Enrollments at CCFC (Adj. P-value=0.0000, Chi-square=17.5269, df=1)

Node    All other ages: % (n)    40 - 49 year olds: % (n)    Total: (%) n
0       88.52 (23,477)           11.48 (3,044)               (100.00) 26,521
1       84.06 (12,245)           15.94 (2,322)               (54.93) 14,567
2       93.96 (11,232)           6.04 (722)                  (45.07) 11,954
3       80.37 (7,711)            19.63 (1,883)               (36.18) 9,594
4       91.17 (4,534)            8.83 (439)                  (18.75) 4,973
5       92.18 (6,331)            7.82 (537)                  (25.90) 6,868
6       96.36 (4,901)            3.64 (185)                  (19.18) 5,086
7       84.41 (3,943)            15.59 (728)                 (17.61) 4,671
8       76.54 (3,768)            23.46 (1,155)               (18.56) 4,923
9       94.49 (2,042)            5.51 (119)                  (8.15) 2,161
10      88.62 (2,492)            11.38 (320)                 (10.60) 2,812
11      91.03 (4,497)            8.97 (443)                  (18.63) 4,940
12      95.12 (1,834)            4.88 (94)                   (7.27) 1,928
13      91.50 (226)              8.50 (21)                   (0.93) 247
14      96.61 (4,675)            3.39 (164)                  (18.25) 4,839

Page 21:

Segmentation Modeling Results

Node    n        %       Gain: n    Gain (%)    Resp: %    Index (%)
8       4,923    18.6    1,155      37.9        23.5       204.4
7       4,671    17.6    728        23.9        15.6       135.8
10      2,812    10.6    320        10.5        11.4       99.1
11      4,940    18.6    443        14.6        9.0        78.1
13      247      0.9     21         0.7         8.5        74.1
9       2,161    8.1     119        3.9         5.5        48.0
12      1,928    7.3     94         3.1         4.9        42.5
14      4,839    18.2    164        5.4         3.4        29.5

Node descriptions
8 – MORE likely to not enroll in a PE Course, MORE likely to not enroll in a library course, MORE likely to enroll in credit course
7 – MORE likely to not enroll in a PE Course, MORE likely to not enroll in a library course, LESS likely to enroll in credit course
10 – MORE likely to not enroll in a PE Course, LESS likely to enroll in a library course, MORE likely to enroll in Summer
11 – LESS likely to enroll in a PE Course, MORE likely to enroll in a HS course, LESS likely to not enroll in SSS course
13 – LESS likely to enroll in PE Course, LESS likely to enroll in HS course, MORE likely to not enroll at Fontana
9 – MORE likely to not enroll in a PE Course, LESS likely to enroll in a library course, LESS likely to not enroll in Summer
12 – LESS likely to enroll in a PE Course, MORE likely to enroll in a HS course, LESS likely to enroll in SSS course
14 – LESS likely to enroll in PE Course, LESS likely to enroll in HS course, MORE likely to enroll at Fontana

Note. n is the number of all cases in the node. % is the percent of all cases that are in the node. Gain: n is the number of cases in the node with the target response (i.e. 40-49 year olds). Gain (%) is the percent of all cases with the target response that are in the node (e.g.: 1,155/3,044=37.9%). Resp: % represents the proportion of cases in the node that have the target response (e.g.: 1,155/4,923=23.5%). Index (%) gives a measure of how the proportion of target responses in the node compares to that for the entire sample (e.g.: 37.9%/18.6%=204.4%, computed from the unrounded percentages).
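The four derived columns in the results table all come from the same four counts. The sketch below recomputes them for Node 8 from the counts on this slide. The function name gains_row is illustrative, and the formulas follow the definitions in the note above. The index is calculated from unrounded percentages, which is why it comes out at 204.4 and not the 203.8 that the rounded 37.9 and 18.6 would give.

```python
from typing import Dict


def gains_row(
    node_n: int, node_target: int, total_n: int, total_target: int
) -> Dict[str, float]:
    """Recompute the gains-table columns for one terminal node."""
    pct_of_cases = 100.0 * node_n / total_n         # "%" column
    gain_pct = 100.0 * node_target / total_target   # "Gain (%)" column
    resp_pct = 100.0 * node_target / node_n         # "Resp: %" column
    index_pct = 100.0 * gain_pct / pct_of_cases     # "Index (%)" column
    return {
        "pct_of_cases": pct_of_cases,
        "gain_pct": gain_pct,
        "resp_pct": resp_pct,
        "index_pct": index_pct,
    }


# Node 8: 4,923 cases, 1,155 of them 40-49 year olds.
# Whole sample: 26,521 cases, 3,044 of them 40-49 year olds.
node8 = gains_row(node_n=4923, node_target=1155, total_n=26521, total_target=3044)
for column, value in node8.items():
    print(f"{column}: {value:.1f}")
```

An index above 100 means the node holds a larger share of 40 - 49 year olds than the sample as a whole, which is what makes Nodes 8 and 7 the natural marketing targets.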

Page 22:

Using Information to Develop Marketing Plan

• We now know that 40 – 49 year olds prefer the following types of courses
  – MORE likely to not enroll in a PE Course, MORE likely to not enroll in a library course, MORE likely to enroll in credit course
  – MORE likely to not enroll in a PE Course, MORE likely to not enroll in a library course, LESS likely to enroll in credit course
• We can go back to SPSS Base and identify which courses meet these criteria

Page 23:

Courses Preferred by 40 – 49 Year Olds

• Of the 8,849 enrollments that met the previously stated criteria
  – 13% or 1,127 of these enrollments were in Computer Information Systems courses
    • 310 of these enrollments were in CIS-1 (Introduction to Computer Information)
    • 116 were in CIS-68I (Using the Internet)
    • 91 were in CIS-404 (Fundamentals of Microsoft Windows)
  – 11% or 937 of these enrollments were in Disabilities Programs and Services courses
    • Most of these enrollments were in the independent living courses
  – 8% or 708 of these enrollments were in Business and Office Technologies courses
    • 120 of these were in BUSOT-40A (Beginning Computer Keyboarding)
    • 99 were in BUSOT-46A (Beginning Microsoft Word)
  – 7% or 620 of these enrollments were in math courses
    • 190 of these were in MATH-410 (Elementary Algebra)
    • 99 were in MATH-420 (Intermediate Algebra)
    • 83 were in MATH-25 (College Algebra)
    • 72 were in MATH-520 (Arithmetic and Preparation for Algebra)
    • 64 were in MATH-510 (Arithmetic)
  – 4% or 347 of these enrollments were in Child Development Education courses
    • 39 of these were in CDE-4 (Child, Family, and Community)
    • These enrollments were very spread out in mostly transferable courses
  – 3% or 288 of these enrollments were in ESL courses

Page 24:

[Map: 2000 US Census Population Data in the Chaffey College District. City labels: Ontario, Fontana, Chino Hills, Chino, Upland, Rancho Cucamonga, Montclair, and San Antonio Heights. Road labels: I 10, I 15, State Hwy 30, State Hwy 60, State Hwy 66, State Hwy 71, State Hwy 83, Haven Ave, and Base Line St / W Base Line Rd. Scale bar in miles. Prepared by Keith Wurtz. Date: 20060406.]

Legend: Number of 40-49 Year Olds (tgr06071grp00.All40t49), 10 classes
3 - 69; 70 - 121; 122 - 169; 170 - 215; 216 - 275; 276 - 344; 345 - 439; 440 - 598; 599 - 1007; 1008 - 1653

Page 25:

[Map: 2000 US Census Population Data and Chaffey Students who are 40 - 49 Years Old. City labels: Ontario, Fontana, Chino Hills, Chino, Upland, Rancho Cucamonga, Montclair, and San Antonio Heights. Road labels: I 10, I 15, State Hwy 30, State Hwy 60, State Hwy 66, State Hwy 71, State Hwy 83, Haven Ave, and Base Line St / W Base Line Rd. Location labels: Main Campus, Chino Center, Fontana Center, Ontario Center, and Chino Campus. Scale bar in miles. Prepared by Keith Wurtz. Date: 20060408.]

Legend
• Number of 40-49 Year Olds, 10 classes: 3 - 69; 70 - 121; 122 - 169; 170 - 215; 216 - 275; 276 - 344; 345 - 439; 440 - 598; 599 - 1007; 1008 - 1653
• 2000 - 2001 40-49 Year Olds
• Chaffey Locations
• Chino Campus

Page 26:

[Map: Spring 2006 Chaffey students by area of residence. City labels: Ontario, Fontana, Chino Hills, Chino, Upland, Rancho Cucamonga, Montclair, and San Antonio Heights. Legend layers: Spring 2006 Alta Loma, Etiwanda, Rancho, Upland, Fontana, Ontario, Montclair, Chino, and Chino Hills Students.]

Spring 2006 Area           #         N          %
Alta Loma                  2,300
Etiwanda                   780
Rancho                     1,715
Rancho Cucamonga Total     4,795     161,830    3.0%
Upland                     1,490     73,697     2.0%
Fontana                    3,814     160,015    2.4%
Ontario                    2,259     170,373    1.3%
Montclair                  380       35,530     1.1%
Chino                      869       76,070     1.1%
Chino Hills                337       77,819     0.4%
Total                      13,944    755,334    1.8%

Note. Participation rates are misleading because “N” is the 2005 California Department of Finance estimate of every person in the city. For example, all those under 18 and over 65 are included.

Page 27:

[Map: Chino area. Area labels: 2,200 Projected Dwellings; Industry; Golf; Undeveloped Area; “The Preserve” 9,800 Projected Dwellings; 30,000 Dwellings in Ontario Under Construction. Location labels: Chino Campus, Chino Center, and Canyon Ridge Hospital. Road labels: Chino Ave, Edison Ave, Schaefer Ave, Riverside Dr, Central Ave, Ramona Ave, Mountain Ave, Pomona Frwy, Euclid Ave, and S Euclid Ave.]

Legend
• Chino
• Spring 2006 Chaffey Students Living in Chino
• Portion of Spring 2006 Chaffey Students Living in Ontario

Page 28:

References

Borges, Guilherme and Cherpitel, Cheryl. (2001). Selection of screening items for alcohol abuse dependence among Mexican and Mexican Americans in the emergency department. Journal of Studies on Alcohol, 62, 277-.