DATA STANDARDIZATION and CLASSIFICATION
Cartographic Design for GIS (Geog. 340)
Prof. Hugh Howard
American River College
STANDARDIZATION
STANDARDIZATION
• Also known as normalization
• Transformation of raw data values to different, more meaningful values
– To map densities instead of “raw” values
– To map proportions between variables
– To map other relationships between variables
– To map statistical summaries
MAPPING DENSITY
• How much of a particular thing exists within a given area
• Larger enumeration units often have "more" of a particular thing
– Mapping density is not necessary if all you want to do is show where “more” is
– Accounting for the varying sizes of enumeration units can be more revealing
MAPPING DENSITY
Population/Area
“persons per square mile”
MAPPING DENSITY
Bushels/Area
“bushels per acre”
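The density calculation on these slides (a count divided by the area of its enumeration unit) can be sketched as follows. The county names and figures are hypothetical and serve only to illustrate the arithmetic.

```python
# Hypothetical enumeration units: (name, population, area in square miles).
counties = [
    ("Alpha", 120_000, 400.0),
    ("Beta", 120_000, 1_600.0),
    ("Gamma", 30_000, 50.0),
]


def density(count, area):
    """Return count per unit area (e.g. persons per square mile)."""
    if area <= 0:
        raise ValueError("area must be positive")
    return count / area


densities = {name: density(pop, area) for name, pop, area in counties}
for name, value in densities.items():
    print(f"{name}: {value:.1f} persons per square mile")
```

In this made-up data, Alpha and Beta have the same raw population but very different densities, and Gamma has the smallest population yet the highest density.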
MAPPING PROPORTIONS
• Proportions represent the relationship of a part to a whole
• Several ways to express proportions
– Quotient: 0.0-1.0
– Percentage: 0-100%
– Rate: 7 per 1,000
MAPPING PROPORTIONS
Persons 60 and Over/Total Persons*100
“percentage of seniors”
Persons 60 and Over
MAPPING PROPORTIONS
Non Grads/Total Population*100
“percentage of non grads”
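The three ways of expressing a proportion (quotient, percentage, rate) differ only in the multiplier applied to part/whole. A minimal sketch, using hypothetical counts for a single county:

```python
def proportion(part, whole):
    """Quotient of a part to its whole, between 0.0 and 1.0."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    if not 0 <= part <= whole:
        raise ValueError("part must lie between 0 and whole")
    return part / whole


# Hypothetical county: 2,100 persons aged 60 and over out of 15,000 total.
seniors, total = 2_100, 15_000

quotient = proportion(seniors, total)   # part / whole
percentage = quotient * 100             # per 100
rate_per_1000 = quotient * 1_000        # per 1,000

print(f"quotient {quotient:.2f}, {percentage:.0f}%, {rate_per_1000:.0f} per 1,000")
```

Because the part can never exceed the whole, the quotient stays within 0.0 to 1.0 and the percentage within 0 to 100.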
MAPPING RELATIONSHIPS
• It is often revealing to show how two variables are related (in a manner that is not strictly proportional)
• Several ways to express relationships
– Quotient: 0.0-infinity
– Percentage: 0-infinity%
– Rate: 1,500 per 100
MAPPING RELATIONSHIPS
Females/Males
“ratio of females to males”
MAPPING RELATIONSHIPS
Acres of Cropland/Population*1,000
“acres per 1,000 people”
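Unlike a proportion, a relationship between two separate variables is not bounded by 1 (or 100%). The sketch below computes the two examples from these slides; the helper name `ratio` and all county totals are hypothetical.

```python
def ratio(a, b, per=1):
    """Return a per `per` units of b; unlike a proportion, it can exceed 1."""
    if b <= 0:
        raise ValueError("denominator must be positive")
    return a / b * per


# Hypothetical county totals.
females, males = 5_200, 4_800
cropland_acres, population = 36_000, 12_000

females_per_male = ratio(females, males)
females_per_100_males = ratio(females, males, per=100)
acres_per_1000_people = ratio(cropland_acres, population, per=1_000)

print(f"{females_per_male:.2f} females per male")
print(f"{females_per_100_males:.0f} females per 100 males")
print(f"{acres_per_1000_people:.0f} acres per 1,000 people")
```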
MAPPING STAT. SUMMARIES
• Enumeration units can be represented according to calculated statistics
– Median
– Mean (average)
– Standard Deviation, etc.
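As one possible illustration, each enumeration unit can be reduced to a summary statistic of the individual observations recorded within it. The incomes below are hypothetical, and the choice of population (rather than sample) standard deviation is an assumption.

```python
import statistics

# Hypothetical household incomes (in dollars) recorded within each county.
incomes = {
    "Alpha": [30_000, 40_000, 50_000],
    "Beta": [20_000, 20_000, 110_000],
}

summaries = {
    name: {
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "stdev": statistics.pstdev(values),  # population standard deviation
    }
    for name, values in incomes.items()
}

for name, s in summaries.items():
    print(f"{name}: median {s['median']}, mean {s['mean']}, stdev {s['stdev']:.0f}")
```

The two counties would look quite different on a map of medians versus a map of means, because Beta's single large value raises its mean but not its median.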
MAPPING STAT. SUMMARIES
Animation showing raw and standardized values (slow version)
Animation showing raw and standardized values (fast version)
STANDARDIZATION
• Transformation of raw data values to different, more meaningful values
– Densities, Proportions, Relationships, and Statistical Summaries
• In conjunction with data classification, normalization allows us to craft our message…
DATA CLASSIFICATION
DATA CLASSIFICATION
• The act of organizing attribute values into categories, or groups
• Can be qualitative or quantitative, and based on any of the four measurement scales
– Nominal
– Ordinal
– Interval
– Ratio
RATIO (Population): 0 - 500, 501 - 1,000, 1,001 - 1,500
INTERVAL (Quality of Life): 2.4 - 4.7, 4.8 - 6.3, 6.4 - 8.6
ORDINAL (Visibility): Poor, Fair, Good
NOMINAL (Zoning): Commercial, Residential, Industrial
DATA CLASSIFICATION
DATA CLASSIFICATION
• One of the most interesting aspects of thematic mapping
– One set of attribute values can yield many different maps, depending on the classification scheme
– The scheme you choose can strongly influence how your map is perceived
DATA CLASSIFICATION
DATA CLASSIFICATION
• Animation showing population using equal interval, quantile, and natural breaks classification methods
DATA CLASSIFICATION
There is no “best” method
Certain methods are not well suited to particular situations
DATA CLASSIFICATION
• How many classes should you use?
– Anywhere from 3 to 7
– 5 is probably optimal
– An odd # has a “middle” class
– Difficult to differentiate large numbers of tints
DATA CLASSIFICATION
• Animation showing agricultural sales using 2, 4, and 6 classes
DATA CLASSIFICATION
DATA CLASSIFICATION
• Equal Interval
– Each class occupies an equal interval along the number line, or histogram
TOWN POPULATION
No gaps between classes
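A minimal sketch of how equal interval class boundaries can be derived: divide the range from the minimum to the maximum into classes of the same width. The function name and town populations are hypothetical.

```python
def equal_interval_breaks(values, n_classes):
    """Return n_classes + 1 class boundaries of equal width from min to max."""
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    # Build the lower boundaries, then close the top class at the maximum.
    return [lo + i * width for i in range(n_classes)] + [float(hi)]


# Hypothetical town populations.
towns = [100, 250, 400, 900, 1_100, 1_600, 2_100]
breaks = equal_interval_breaks(towns, 4)
print(breaks)  # [100.0, 600.0, 1100.0, 1600.0, 2100.0]
```

Each boundary is shared by two neighboring classes, which is why this method leaves no gaps between classes.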
DATA CLASSIFICATION
• Advantages of Equal Interval
– Can be easy to understand and interpret
– Good for attributes that are normally represented using uniform classes: elevation, precipitation, temperature
0 – 20, 21 – 40, 41 – 60, 61 – 80, 81 – 100
DATA CLASSIFICATION
• Disadvantage of Equal Interval
– Consideration of the data distribution along the number line: poor
– Doesn't work well with skewed distributions (can result in empty classes)
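The empty-class problem can be reproduced with a small, hypothetical skewed data set. The sketch below counts how many values fall in each equal-width class; the rule that a value sitting exactly on a boundary goes to the upper class is an assumption of this example.

```python
from bisect import bisect_right


def equal_interval_counts(values, n_classes):
    """Count how many values fall in each equal-width class."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    # Interior boundaries only; the top class is closed at the maximum.
    inner = [lo + i * width for i in range(1, n_classes)]
    counts = [0] * n_classes
    for v in values:
        counts[bisect_right(inner, v)] += 1
    return counts


# Hypothetical skewed data: many small towns and one large city.
skewed = [10, 12, 15, 18, 20, 22, 25, 30, 1_000]
counts = equal_interval_counts(skewed, 5)
print(counts)  # [8, 0, 0, 0, 1]
```

Three of the five classes are empty, so the resulting map would show nearly every unit in the lowest class.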
DATA CLASSIFICATION
• Quantile
– Each class contains the same (or similar) number of attribute values
– 4 classes: quartiles
– 5 classes: quintiles
– 6 classes: sextiles
TOWN POPULATION
Gaps between classes
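A minimal sketch of quantile classification: sort the values and deal them into classes of equal (or nearly equal) size. The function name and data are hypothetical, and tied values are not given any special treatment here.

```python
def quantile_classes(values, n_classes):
    """Split sorted values into n_classes groups of equal (or similar) size."""
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    ordered = sorted(values)
    base, extra = divmod(len(ordered), n_classes)
    groups, start = [], 0
    for i in range(n_classes):
        size = base + (1 if i < extra else 0)  # spread the remainder
        groups.append(ordered[start:start + size])
        start += size
    return groups


# Hypothetical town populations.
towns = [100, 250, 400, 900, 1_100, 1_600, 2_100, 5_000, 9_000, 40_000]
groups = quantile_classes(towns, 4)
for g in groups:
    print(f"{g[0]} - {g[-1]}: {len(g)} towns")
```

If each class is labeled with the lowest and highest value it actually contains, the legend shows gaps between classes (here, for example, between 400 and 900).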
DATA CLASSIFICATION
• Advantage of Quantile
– Ensures that a choropleth map will have the same number of darkest polygons as lightest, etc.
67 Counties, 5 Classes
≈13 Counties per Class
DATA CLASSIFICATION
• Disadvantage of Quantile
– Consideration of the data distribution along the number line: poor
– Doesn’t work well with skewed distributions (one or two classes can occupy the majority of the range)
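To illustrate how one quantile class can occupy most of the data's range, the sketch below splits a hypothetical skewed data set into five equal-count classes and measures the share of the overall range that each class spans.

```python
# Hypothetical skewed data, already sorted: many small towns, one large city.
skewed = [10, 12, 15, 18, 20, 22, 25, 30, 40, 1_000]
n_classes = 5
size = len(skewed) // n_classes  # 2 values per class; divides evenly here

classes = [skewed[i:i + size] for i in range(0, len(skewed), size)]
total_range = skewed[-1] - skewed[0]
# Share of the full data range covered by each class's own min-to-max span.
shares = [(c[-1] - c[0]) / total_range for c in classes]

for c, share in zip(classes, shares):
    print(f"{c[0]} - {c[-1]}: {share:.1%} of the range")
```

With these made-up values the top class (40 to 1,000) covers well over 90% of the range, while each of the other four classes covers less than 1%.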
DATA CLASSIFICATION
• Natural Breaks
– Each class contains a cluster of attribute values, with “natural” breaks between the classes
– More subjective
TOWN POPULATION
Gaps between classes
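One simple way to approximate the idea of natural breaks is to sort the values and cut at the widest gaps between neighbors. This is a simplified stand-in for illustration only, not an optimization method such as Jenks, and the function name and data are hypothetical.

```python
def largest_gap_breaks(values, n_classes):
    """Group sorted values by cutting at the n_classes - 1 widest gaps."""
    ordered = sorted(values)
    if not 1 <= n_classes <= len(ordered):
        raise ValueError("n_classes must be between 1 and the number of values")
    # Gap i is the distance between ordered[i] and ordered[i + 1].
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    widest = sorted(gaps, reverse=True)[:n_classes - 1]
    cut_after = sorted(i for _, i in widest)
    groups, start = [], 0
    for i in cut_after:
        groups.append(ordered[start:i + 1])
        start = i + 1
    groups.append(ordered[start:])
    return groups


# Hypothetical town populations with three visible clusters.
towns = [100, 120, 150, 900, 950, 1_000, 5_000, 5_200]
groups = largest_gap_breaks(towns, 3)
print(groups)  # [[100, 120, 150], [900, 950, 1000], [5000, 5200]]
```

Because the breaks depend entirely on where this particular data set happens to cluster, a different data set will produce different class limits.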
DATA CLASSIFICATION
• Advantage of Natural Breaks
– Consideration of the data distribution along the number line: very good
– Each classification is “custom tailored” to how the data are distributed along the number line
– Works well with skewed data distributions
DATA CLASSIFICATION
• Disadvantages of Natural Breaks
– Subjective, and results will differ
– More difficult to compare with other maps
– One or two classes can end up occupying the majority of the data's range
DATA CLASSIFICATION
• Classification for map comparison
– Use the same method for all maps (if possible)
– Equal interval with identical break values often works best (shown here)
– Quantile can also work well
– By definition, natural breaks will result in different classifications on different maps, making comparison difficult
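Using identical break values across maps can be sketched as a single shared list of class limits applied to every data set. The limits, county names, and percentages below are hypothetical.

```python
from bisect import bisect_left

# Identical upper class limits shared by both maps (percent scale, 0 to 100).
UPPER_LIMITS = [20, 40, 60, 80, 100]


def classify(value, upper_limits=UPPER_LIMITS):
    """Return the 0-based class index; class i holds values up to upper_limits[i]."""
    if not 0 <= value <= upper_limits[-1]:
        raise ValueError(f"{value} is outside the classed range")
    return bisect_left(upper_limits, value)


# Hypothetical percentages for the same three counties on two maps.
map_a = {"Alpha": 12.0, "Beta": 35.5, "Gamma": 61.0}
map_b = {"Alpha": 24.0, "Beta": 35.0, "Gamma": 88.0}

classes_a = {name: classify(v) for name, v in map_a.items()}
classes_b = {name: classify(v) for name, v in map_b.items()}
print(classes_a)  # {'Alpha': 0, 'Beta': 1, 'Gamma': 3}
print(classes_b)  # {'Alpha': 1, 'Beta': 1, 'Gamma': 4}
```

Since both maps share the same limits, a county that changes class between them reflects a change in the data rather than a change in the classification.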