Presented by Keith Wurtz Senior Research Analyst Chaffey Community College Keith.wurtz@chaffey
Using Census Data and District Data with GIS, SPSS, and Answer Tree to Identify Possible Populations to Market to and Increase Enrollments
Presented by Keith Wurtz
Senior Research Analyst
Chaffey Community College
Introduction
• How to Create a District Map?
• How to Merge Census Data into a GIS Map?
• Using Census Data and District Data to Identify possible populations to market to, and increase enrollments
[Map: Chaffey College District, showing Ontario, Fontana, Chino Hills, Chino, Upland, Rancho Cucamonga, Montclair, and San Antonio Heights. Scale bar in miles. Prepared by Keith Wurtz. Date: 20060406]
How do I create a map of my District in GIS?
• Open ArcMap
• Click on the “+” sign to add data
  – The data that you want to add first is in the form of shape files
  – Shape files are the type of files that GIS uses to create maps
  – Since Chaffey’s District is in San Bernardino County (#71), I am going to start with shape files from that county
  – Shape File Types
    • BLK – Data by Census Blocks
    • GRP – Data by Block Groups
    • TRT – Census Tract
    • ZCTA – Zip Code
    • Place – Cities
    • CTY – County
    • LKA – Streets
  – To create the map on the previous slide, I am going to choose the Place file, or data by city (i.e. tgr06071place00.shp), and the ZCTA file, or zip code data (i.e. tgr06071zcta5cu.shp)
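The file names used in this deck appear to follow a pattern: "tgr", a two-digit state code, a three-digit county code, then a layer suffix. The Python sketch below splits a name into those parts, assuming that pattern holds. The function name parse_tiger_name and the lookup table are illustrative only; the table is built from the shape file types listed on this slide.

```python
import re

# Layer prefixes and meanings, copied from the shape file types on this slide.
LAYER_TYPES = {
    "blk": "Census Blocks",
    "grp": "Block Groups",
    "trt": "Census Tract",
    "zcta": "Zip Code",
    "place": "Cities",
    "cty": "County",
    "lka": "Streets",
}

# Assumed pattern: "tgr" + 2-digit state + 3-digit county + layer letters
# + optional trailing characters (e.g. "00", "5cu") + ".shp".
NAME_PATTERN = re.compile(r"^tgr(\d{2})(\d{3})([a-z]+)(\w*)\.shp$", re.IGNORECASE)


def parse_tiger_name(filename):
    """Split a shape file name into state code, county code, and layer type."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError("name does not follow the assumed pattern: " + filename)
    state, county, layer, _suffix = match.groups()
    return {
        "state": state,
        "county": county,
        "layer": LAYER_TYPES.get(layer.lower(), "unknown"),
    }


for name in ("tgr06071place00.shp", "tgr06071zcta5cu.shp"):
    print(name, parse_tiger_name(name))
```

A check like this can help confirm that every file added to the map belongs to the same county before the layers are combined.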
Creating District Map (Continued)
• Once the zip code and city shape files have been inserted, you can see where the zip codes and the cities are in your county
  – Next, double click on the shaded rectangles under Layers and choose “Hollow” and “OK” for zip codes and city
  – I am interested in the southwest portion of the county, where our District is located
  – Note. You can show or hide a layer by checking or un-checking the boxes
  – To zoom in on this area, click on the magnifying glass and select this portion of the county
• Double click on the place shape file and choose labels
  – Check “Label features in this layer” and choose “OK”
  – Now you can see each city in the county as well as the cities in Chaffey’s District
Creating District Map (Continued)
• Select only the cities in Chaffey’s District
  – Click on the black arrow (i.e. the “Select Features” icon)
  – Hold the control key down and click on each city
  – Right mouse click on the place shape file and choose “Selection” and “Create Layer from Selected Features”
  – Un-check the place shape file
  – Turn on the label features on the District Layer that you just created and make it Hollow
• Double click on the Chaffey District Layer that you created and change the font color of the city names
• Double click on the Chaffey District rectangle under Layers and change the line color
Creating District Map (Continued)
• Click on View and then Layout View
  – Go to View and then Data Frame Properties
    • Choose Frame
    • Click on Color and choose No Color
  – Insert a Title by clicking Insert and Title
  – Insert Text by clicking Insert and Text
  – Insert Scale Bar by clicking Insert and Scale Bar
    • Notice that the scale is in decimal degrees
    • To change this, double click on it and under the Scale and Units Tab choose Division Units and then choose Miles
[Map: 2000 US Census Population Data in the Chaffey College District. Block groups are shaded by total population (Census2-SF1.TOTPOP) in 10 classes: 25 - 496, 497 - 817, 818 - 1088, 1089 - 1371, 1372 - 1713, 1714 - 2259, 2260 - 3025, 3026 - 4552, 4553 - 7658, 7659 - 11889. Prepared by Keith Wurtz. Date: 20060406]
Inserting and Using Census Data into District Map
• Obtaining US Census Data
  – Go to the following: http://www.census.gov/
  – Click on “American Fact Finder”
  – Go to “Data Sets” and click on “Decennial Census” (Note: American Community Survey)
  – Click on “Detailed Tables” under SF 1
  – Click on “geo within geo”
    • Under “Show me all” click on “Block Groups”
    • Under “Within” click on “County”
    • Under “Select a State” click on “CA”
    • Under “Select a County” click on “San Bernardino”
    • Under “Select one or more…” click on “All Block Groups” and click on “Add”
  – Click “Next”
  – Under “Select one or more…” click on “P1. Total Population” and click on “Add” (Note. You can choose more than one and it will still work)
  – Click on “Show Result”
File Downloaded from the Census Bureau
• Click on “Print/Download” and click on “Download”
  – Choose Excel, hold down the control key, and click “OK”
  – Click on “Open” and then open the Excel file with “…data…” in the name. It is usually the largest file, but not always.
• Change the GEOGRAPHY_ field name to JOINID. This is the field that is going to match with the STFID field.
• Save this file as a dbf file.
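The rename step is done by hand in Excel in this deck. As a rough scripted equivalent, the Python sketch below renames the GEOGRAPHY_ column to JOINID in a CSV export of the download. The CSV format, the function name rename_field, and the sample row are assumptions for illustration; the sample ID and population are made up, and saving to dbf would still need a separate tool.

```python
import csv
import io


def rename_field(csv_text, old_name="GEOGRAPHY_", new_name="JOINID"):
    """Return the CSV text with one header field renamed; data rows are unchanged."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = [new_name if name == old_name else name for name in rows[0]]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows[1:])
    return out.getvalue()


# Hypothetical two-column export: an ID field and a total population field.
sample = "GEOGRAPHY_,TOTPOP\n060710001001,1200\n"
renamed = rename_field(sample)
print(renamed)
```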
Merging or Joining in GIS
• Need to insert the block groups data for San Bernardino County
  – Click on the “+” sign and insert the tgr06071grp00.shp file
• Right click on “Block Groups” in GIS because this is the level at which I downloaded the Census data
  – Go to “Joins and Relates” and click on “Join”
  – Make sure that you are joining attributes from a table
  – In Number 1 choose STFID
  – In Number 2 click on the folder and find the dbf file that you created: RPPop.dbf
  – In Number 3 choose the JOINID field, click OK, and then click “Yes”
• Check to see that the “Join” worked by right clicking on block groups, clicking on “Open Attribute Table,” and scrolling to the right to see if the total population field is there
Snapshot of Block Group Data that includes Total Population
• In this case there is a “Null” field in the top row.
• This indicates that the join did not work for this block group
• San Bernardino County is one of the few counties in the country that has an error in one of its block groups
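The join and the “Null” check can be pictured as a keyed lookup. The Python sketch below is not what ArcMap runs internally; it is a minimal illustration of matching STFID in the block group layer to JOINID in the Census table and listing rows that found no match. The function names and the sample IDs ("A1", "A2", "A3") are hypothetical.

```python
def join_attributes(layer_rows, table_rows, layer_key="STFID", table_key="JOINID"):
    """Attach table fields to each layer row; unmatched rows get None (a "Null")."""
    lookup = {row[table_key]: row for row in table_rows}
    first = table_rows[0] if table_rows else {}
    extra_fields = [field for field in first if field != table_key]
    joined = []
    for row in layer_rows:
        match = lookup.get(row[layer_key])
        merged = dict(row)
        for field in extra_fields:
            merged[field] = match[field] if match else None
        joined.append(merged)
    return joined


def unmatched(joined_rows, field):
    """Return the rows where the joined field is still None."""
    return [row for row in joined_rows if row[field] is None]


# Hypothetical data: three block groups in the layer, two rows in the Census table.
layer = [{"STFID": "A1"}, {"STFID": "A2"}, {"STFID": "A3"}]
census = [{"JOINID": "A1", "TOTPOP": 1200}, {"JOINID": "A2", "TOTPOP": 950}]

joined = join_attributes(layer, census)
print(unmatched(joined, "TOTPOP"))
```

Listing the unmatched rows is quicker than scrolling through the attribute table when the layer has many block groups.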
Incorporating US Census Data (Continued)
• Selecting Block Groups in Chaffey’s District
  – Right mouse click on the block group and open the attribute table
  – Click on the Select Features icon
  – Make sure that the block groups rectangle is checked so that you can see the block groups
  – Click to the left of the map and select the district
    • Once you have selected most of the district it is best to open the attributes table to make sure that all of the block groups have been selected
    • Right mouse click on the block group and open the attributes table
    • Hold the control key down and select the block groups by clicking on the row
  – Right mouse click on the block group and choose “Selection” and choose “Create layer from Selected Features”
  – Uncheck the initial block group
Incorporating US Census Data (Continued)
• Displaying the US Population Data
  – Double click on the Layer that we just created
  – Click on Symbology
  – Click on Quantities
  – In the Value Field choose the total population field and change classes to 10 instead of 5
  – Click OK
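The deck does not say which classification method ArcMap used to form the 10 classes. Purely to illustrate what "10 classes" means, the Python sketch below uses quantile (equal-count) breaks, which may differ from the method behind the maps. The function names and the sample values 1 through 20 are hypothetical.

```python
from bisect import bisect_left


def quantile_breaks(values, n_classes):
    """Upper bound of each class so that classes hold roughly equal counts."""
    ordered = sorted(values)
    breaks = []
    for k in range(1, n_classes + 1):
        index = (k * len(ordered)) // n_classes - 1
        breaks.append(ordered[max(index, 0)])
    return breaks


def classify(value, breaks):
    """Zero-based class number: the first class whose upper bound is >= value."""
    return bisect_left(breaks, value)


# Hypothetical block group populations, not taken from the Census file.
populations = list(range(1, 21))
breaks = quantile_breaks(populations, 10)
print(breaks)
print(classify(3, breaks))
```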
Using Census Data and District Data to Identify possible populations to market
to and increase enrollments
Participation Rates of 2000 – 2001 Chaffey Students by Age

Age                    #         N      %
18 – 19 Years      4,018    21,968   18.3
20 – 24 Years      6,066    50,091   12.1
25 – 29 Years      2,395    50,686    4.7
30 – 34 Years      1,752    57,004    3.1
35 – 39 Years      1,605    64,342    2.5
40 – 49 Years      2,304   107,749    2.1
50 – 65 Years      1,071    78,606    1.4
Total             19,211   430,446    4.5

Note. # refers to the number of students attending Chaffey in the 2000 – 2001 academic year. N refers to the population living in the Chaffey College District taken from the 2000 US Census.
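The participation rate in each row is the number of students (#) divided by the Census population (N), expressed as a percent. The Python sketch below recomputes the rates from the counts in the table; the variable and function names are illustrative.

```python
# (age group, students attending in 2000 - 2001, 2000 Census population)
AGE_GROUPS = [
    ("18 - 19 Years", 4018, 21968),
    ("20 - 24 Years", 6066, 50091),
    ("25 - 29 Years", 2395, 50686),
    ("30 - 34 Years", 1752, 57004),
    ("35 - 39 Years", 1605, 64342),
    ("40 - 49 Years", 2304, 107749),
    ("50 - 65 Years", 1071, 78606),
]


def participation_rate(students, population):
    """Students as a percent of the district population in the same age group."""
    return 100.0 * students / population


for label, students, population in AGE_GROUPS:
    print(label, round(participation_rate(students, population), 1))

total_students = sum(row[1] for row in AGE_GROUPS)
total_population = sum(row[2] for row in AGE_GROUPS)
print("Total", round(participation_rate(total_students, total_population), 1))
```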
Marketing to 40 – 49 Year Olds
• US Census Data allows us to identify the number of 40 – 49 year olds living in each block group
• We can use the mapping software to identify where 40 – 49 year olds live
• Once we know where they live, we can use segmentation modeling (i.e. answer tree or classification tree) to identify enrollment characteristics of these students and then market to them
Segmentation Modeling
• According to Borges and Cherpitel (2001), segmentation modeling (i.e. classification tree modeling) is based on the principle of binary recursive partitioning. In binary recursive partitioning, the values of the dependent variable (i.e. success and non-success) are examined for all possible splits of the data at each step of the tree-building process to find the split that most effectively separates the dependent variable into homogeneous groups. This continues until it is not possible to split further (Borges and Cherpitel, 2001). The model attempts to maximize the number of students who are correctly classified as successes and those who are correctly classified as non-successes.
Enrollment Variables used in Segmentation Model
• Used MIS to identify enrollment characteristics
  – Transfer course enrollment
  – Basic skills course enrollment
  – Occupational course enrollment
  – Credit course enrollment
  – School
  – Location of course
  – Term
• Created a field for each one that generated the number of enrollments aggregated by student
[Classification tree. Dependent variable: Age Dichotomous - 40 - 49 year olds and other. Each split has two branches, “<=Did Not Enroll” and “>Did Not Enroll”. For each node, n is the number of cases in the node and the percent in parentheses is the node's share of the total sample.]

Node 0: n = 26,521 (100.00); 40 - 49 year olds 11.48%, n = 3,044; all other ages 88.52%, n = 23,477
Split on Number of Enrollments in PE Courses (Adj. P-value=0.0000, Chi-square=633.4093, df=1)
  Node 1: n = 14,567 (54.93); 40 - 49 year olds 15.94%, n = 2,322; all other ages 84.06%, n = 12,245
  Split on Number of Enrollments in LIB Courses (Adj. P-value=0.0000, Chi-square=285.0683, df=1)
    Node 3: n = 9,594 (36.18); 40 - 49 year olds 19.63%, n = 1,883; all other ages 80.37%, n = 7,711
    Split on Number of Enrollments in Credit Courses (Adj. P-value=0.0000, Chi-square=94.2465, df=1)
      Node 7: n = 4,671 (17.61); 40 - 49 year olds 15.59%, n = 728; all other ages 84.41%, n = 3,943
      Node 8: n = 4,923 (18.56); 40 - 49 year olds 23.46%, n = 1,155; all other ages 76.54%, n = 3,768
    Node 4: n = 4,973 (18.75); 40 - 49 year olds 8.83%, n = 439; all other ages 91.17%, n = 4,534
    Split on Number of Enrollments in SU00 (Adj. P-value=0.0000, Chi-square=52.3692, df=1)
      Node 9: n = 2,161 (8.15); 40 - 49 year olds 5.51%, n = 119; all other ages 94.49%, n = 2,042
      Node 10: n = 2,812 (10.60); 40 - 49 year olds 11.38%, n = 320; all other ages 88.62%, n = 2,492
  Node 2: n = 11,954 (45.07); 40 - 49 year olds 6.04%, n = 722; all other ages 93.96%, n = 11,232
  Split on Number of Enrollments in HS Courses (Adj. P-value=0.0000, Chi-square=90.0277, df=1)
    Node 5: n = 6,868 (25.90); 40 - 49 year olds 7.82%, n = 537; all other ages 92.18%, n = 6,331
    Split on Number of Enrollments in SSS Courses (Adj. P-value=0.0000, Chi-square=32.2187, df=1)
      Node 11: n = 4,940 (18.63); 40 - 49 year olds 8.97%, n = 443; all other ages 91.03%, n = 4,497
      Node 12: n = 1,928 (7.27); 40 - 49 year olds 4.88%, n = 94; all other ages 95.12%, n = 1,834
    Node 6: n = 5,086 (19.18); 40 - 49 year olds 3.64%, n = 185; all other ages 96.36%, n = 4,901
    Split on Number of Enrollments at CCFC (Adj. P-value=0.0000, Chi-square=17.5269, df=1)
      Node 13: n = 247 (0.93); 40 - 49 year olds 8.50%, n = 21; all other ages 91.50%, n = 226
      Node 14: n = 4,839 (18.25); 40 - 49 year olds 3.39%, n = 164; all other ages 96.61%, n = 4,675
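The chi-square values reported in the tree can be checked by hand. The Python sketch below computes an uncorrected Pearson chi-square for a 2 x 2 table, using the counts shown in the tree for the PE split (Nodes 1 and 2) and the LIB split (Nodes 3 and 4). That the software computes the statistic this way is an assumption; the function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows are the two child nodes; columns are 40 - 49 year olds and all other ages.
pe_split = chi_square_2x2(2322, 12245, 722, 11232)   # Node 1 vs. Node 2
lib_split = chi_square_2x2(1883, 7711, 439, 4534)    # Node 3 vs. Node 4

print(round(pe_split, 4))
print(round(lib_split, 4))
```

Both results agree closely with the values printed on the tree (633.4093 and 285.0683), which suggests the split statistics are plain 2 x 2 chi-squares on the child node counts.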
Segmentation Modeling Results

Node      n      %   Gain: n   Gain (%)   Resp: %   Index (%)
8     4,923   18.6     1,155       37.9      23.5       204.4
7     4,671   17.6       728       23.9      15.6       135.8
10    2,812   10.6       320       10.5      11.4        99.1
11    4,940   18.6       443       14.6       9.0        78.1
13      247    0.9        21        0.7       8.5        74.1
9     2,161    8.1       119        3.9       5.5        48.0
12    1,928    7.3        94        3.1       4.9        42.5
14    4,839   18.2       164        5.4       3.4        29.5

Node descriptions
• 8 – MORE likely to not enroll in a PE course, MORE likely to not enroll in a library course, MORE likely to enroll in credit course
• 7 – MORE likely to not enroll in a PE course, MORE likely to not enroll in a library course, LESS likely to enroll in credit course
• 10 – MORE likely to not enroll in a PE course, LESS likely to enroll in a library course, MORE likely to enroll in Summer
• 11 – LESS likely to enroll in a PE course, MORE likely to enroll in a HS course, LESS likely to not enroll in SSS course
• 13 – LESS likely to enroll in PE course, LESS likely to enroll in HS course, MORE likely to not enroll at Fontana
• 9 – MORE likely to not enroll in a PE course, LESS likely to enroll in a library course, LESS likely to not enroll in Summer
• 12 – LESS likely to enroll in a PE course, MORE likely to enroll in a HS course, LESS likely to enroll in SSS course
• 14 – LESS likely to enroll in PE course, LESS likely to enroll in HS course, MORE likely to enroll at Fontana

Note. n is the number of all cases in the node. % is the percent of all cases in the node. Gain: n is the number of all cases with the target response (i.e. 40-49 year olds). Gain (%) is the percent of all cases with the target response (e.g.: 1,155/3,044=37.9). Resp: % represents the proportion of cases in the node that have the target response (e.g.: 1,155/4,923=23.5%). Index (%) gives a measure of how the number of target responses in the node compares to that for the entire sample (e.g.: 37.9%/18.6%=204.4%).
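The definitions in the note can be turned into a short calculation. The Python sketch below recomputes the four percentages for Node 8 from its counts, taking the sample totals (26,521 cases, 3,044 of them 40-49 year olds) from Node 0 of the tree. The function and constant names are illustrative.

```python
TOTAL_CASES = 26521    # all cases in the sample (Node 0)
TOTAL_TARGETS = 3044   # all 40-49 year olds in the sample (Node 0)


def gain_row(node_n, node_targets):
    """Return (node %, gain %, response %, index %) for one terminal node."""
    node_pct = 100.0 * node_n / TOTAL_CASES
    gain_pct = 100.0 * node_targets / TOTAL_TARGETS
    resp_pct = 100.0 * node_targets / node_n
    index_pct = 100.0 * gain_pct / node_pct
    return node_pct, gain_pct, resp_pct, index_pct


# Node 8: 4,923 cases, of which 1,155 are 40-49 year olds.
node_8 = tuple(round(value, 1) for value in gain_row(4923, 1155))
print(node_8)
```

An index above 100 means the node holds a larger share of 40-49 year olds than the sample as a whole, which is why Nodes 8 and 7 are the ones carried forward into the marketing plan.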
Using Information to Develop Marketing Plan
• We now know that 40 – 49 year olds prefer the following types of courses
  – MORE likely to not enroll in a PE course, MORE likely to not enroll in a library course, MORE likely to enroll in credit course
  – MORE likely to not enroll in a PE course, MORE likely to not enroll in a library course, LESS likely to enroll in credit course
• We can go back to SPSS Base and identify which courses meet these criteria
Courses Preferred by 40 – 49 Year Olds
• Of the 8,849 enrollments that met the previously stated criteria
  – 13% or 1,127 of these enrollments were in Computer Information Systems courses
    • 310 of these enrollments were in CIS-1 (Introduction to Computer Information)
    • 116 were in CIS-68I (Using the Internet)
    • 91 were in CIS-404 (Fundamentals of Microsoft Windows)
  – 11% or 937 of these enrollments were in Disabilities Programs and Services courses
    • Most of these enrollments were in the independent living courses
  – 8% or 708 of these enrollments were in Business and Office Technologies courses
    • 120 of these were in BUSOT-40A (Beginning Computer Keyboarding)
    • 99 were in BUSOT-46A (Beginning Microsoft Word)
  – 7% or 620 of these enrollments were in math courses
    • 190 of these were in MATH-410 (Elementary Algebra)
    • 99 were in MATH-420 (Intermediate Algebra)
    • 83 were in MATH-25 (College Algebra)
    • 72 were in MATH-520 (Arithmetic and Preparation for Algebra)
    • 64 were in MATH-510 (Arithmetic)
  – 4% or 347 of these enrollments were in Child Development Education courses
    • 39 of these were in CDE-4 (Child, Family, and Community)
    • These enrollments were very spread out in mostly transferable courses
  – 3% or 288 of these enrollments were in ESL courses
[Map: 2000 US Census Population Data in the Chaffey College District. Block groups are shaded by number of 40-49 year olds (tgr06071grp00.All40t49) in 10 classes: 3 - 69, 70 - 121, 122 - 169, 170 - 215, 216 - 275, 276 - 344, 345 - 439, 440 - 598, 599 - 1007, 1008 - 1653. Labeled cities: Ontario, Fontana, Chino Hills, Chino, Upland, Rancho Cucamonga, Montclair, San Antonio Heights. Labeled roads: I 10, I 15, State Hwy 30, State Hwy 60, State Hwy 66, State Hwy 71, State Hwy 83, Haven Ave, Base Line St / W Base Line Rd. Prepared by Keith Wurtz. Date: 20060406]
[Map: 2000 US Census Population Data and Chaffey Students who are 40 - 49 Years Old. Block groups are shaded by number of 40-49 year olds in the same 10 classes (3 - 69 through 1008 - 1653), overlaid with 2000 - 2001 students aged 40-49 and markers for Chaffey locations: Main Campus, Chino Center, Fontana Center, Ontario Center, and Chino Campus. Same labeled cities and roads as the previous map. Prepared by Keith Wurtz. Date: 20060408]
[Map: Spring 2006 Chaffey students by area of residence. Legend: Alta Loma, Etiwanda, Rancho, Upland, Fontana, Ontario, Montclair, Chino, and Chino Hills students. Labeled cities: Ontario, Fontana, Chino Hills, Chino, Upland, Rancho Cucamonga, Montclair, San Antonio Heights.]
Spring 2006

Area                          #         N      %
Alta Loma                 2,300
Etiwanda                    780
Rancho                    1,715
Rancho Cucamonga Total    4,795   161,830   3.0%
Upland                    1,490    73,697   2.0%
Fontana                   3,814   160,015   2.4%
Ontario                   2,259   170,373   1.3%
Montclair                   380    35,530   1.1%
Chino                       869    76,070   1.1%
Chino Hills                 337    77,819   0.4%
Total                    13,944   755,334   1.8%

Note. Participation rates are misleading because the “N” includes 2005 estimates from the California Department of Finance of every person in the city. For example, all those under 18 and over 65 are included.

[Map: Chino area. Legend: Chino; Spring 2006 Chaffey Students Living in Chino; Portion of Spring 2006 Chaffey Students Living in Ontario. Marked locations: Chino Campus, Chino Center, Canyon Ridge Hospital. Annotated areas: 2,200 Projected Dwellings; "The Preserve", 9,800 Projected Dwellings; 30,000 Dwellings in Ontario Under Construction; Industry; Golf; Undeveloped Area. Labeled roads: Chino Ave, Edison Ave, Schaefer Ave, Riverside Dr, Central Ave, Ramona Ave, Mountain Ave, Euclid Ave / S Euclid Ave, Pomona Frwy.]
References
Borges, Guilherme and Cherpitel, Cheryl. (2001). Selection of screening items for alcohol abuse dependence among Mexican and Mexican Americans in the emergency department. Journal of Studies on Alcohol, 62, 277-.