Measurement Variables
Describing Distributions
© 2014 Project Lead The Way, Inc.
Computer Science and Software Engineering
Continuous vs. Discrete
• A nearly perfect analogy:
  continuous : discrete
  analog : digital
  float : int
• Measurements of continuous variables are made discrete by "binning" them.
• How old are you? Time is continuous, but you answer in discrete, binned values.
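The age example can be sketched in Python, where the float : int pairing from the analogy shows up directly. This is a minimal illustration, not part of the original slides; the helper name bin_index and the value 16.73 are hypothetical.

```python
import math

def bin_index(value, low, width):
    """Return which bin a continuous value falls in; bin 0 starts at `low`."""
    return math.floor((value - low) / width)

age_in_years = 16.73    # continuous measurement (a float), hypothetical value

# Bins one year wide: the usual answer to "How old are you?"
age_answer = bin_index(age_in_years, low=0, width=1)
print(age_answer)       # a discrete value (an int)

# Bins ten years wide group the same measurement by decade instead.
decade = bin_index(age_in_years, low=0, width=10)
print(decade)
```

The bin width is a choice made by whoever records the measurement; the underlying quantity stays continuous.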
Levels of a Measurement Variable
• Categorical (e.g., zip codes): categories with no meaningful order
• Ordinal (e.g., rank in a race): ordered, but increasing by 1 has no consistent meaning
• Interval (e.g., grade level): ordered, with consistent steps up, but no meaning for "doubling" or "tripling"
• Ratio (e.g., height): ordered, with "2 times" being "double"
Sample vs. Population
• Population = infinite pool of measurements, or all measurements possible
• Sample = subset of population
• Population parameters: μ = population mean, σ = population standard deviation
• These are inferred from data
Sample vs. Population
• Sample statistics: sample mean, sample standard deviation
• These describe data
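As a rough sketch of how sample statistics describe data, the snippet below computes both with Python's standard statistics module. The measurements and the variable names x_bar and s are assumptions chosen for illustration; they do not come from the slides.

```python
import statistics

sample = [4, 8, 6, 5, 3, 7]        # hypothetical measurements

x_bar = statistics.mean(sample)    # sample mean
s = statistics.stdev(sample)       # sample standard deviation (divides by n - 1)

print(x_bar, s)
```

Note that statistics.stdev treats its input as a sample. The separate function statistics.pstdev divides by n instead and is meant for data that make up an entire population.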
Sample vs. Population
• Infer the population distribution from the sample histogram
• The sample histogram matches the parent distribution better when the sample is large and is visualized with small intervals (narrow bins)
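One way to see the effect of sample size and interval width is to count values per bin for a small and a large sample drawn from the same parent distribution. This is a sketch under stated assumptions: the helper bin_counts is hypothetical, the parent is taken to be normal, and μ = 6, σ = 2.2 are borrowed from the example later in the deck.

```python
import random

def bin_counts(data, low, width, n_bins):
    """Count how many values fall in each of n_bins bins of equal width."""
    counts = [0] * n_bins
    for x in data:
        i = int((x - low) // width)
        if 0 <= i < n_bins:        # values outside the plotted range are skipped
            counts[i] += 1
    return counts

random.seed(1)                     # fixed seed so the run is repeatable
small = [random.gauss(6, 2.2) for _ in range(20)]
large = [random.gauss(6, 2.2) for _ in range(20000)]

# Few values, wide bins: only a rough outline of the parent distribution.
print(bin_counts(small, low=0, width=4, n_bins=3))

# Many values, narrow bins: the counts trace a much smoother bell shape.
print(bin_counts(large, low=0, width=0.5, n_bins=24))
```

Narrow bins only help when there are enough values to fill them; with 20 values and 24 bins most counts would be 0 or 1.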
Mean, Median, Mode
• Half of the area under the distribution is to the left of the median
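For a finite data set, the three measures of center can be computed with the standard statistics module, and the "half to the left" property of the median can be checked by counting. The data values here are hypothetical.

```python
import statistics

data = [2, 3, 3, 4, 5, 7, 11]      # hypothetical values

mean = statistics.mean(data)       # arithmetic average
median = statistics.median(data)   # middle value of the sorted data
mode = statistics.mode(data)       # most frequent value

# As many values lie below the median as above it.
below = sum(1 for x in data if x < median)
above = sum(1 for x in data if x > median)
print(mean, median, mode, below, above)
```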
Box Plot
• y-axis shows values of the data (height in the figure)
• Splits data into quartiles
• Each box contains 25% of the data
• The IQR (interquartile range) contains 50% of the data
• Whiskers extend to max and min… usually
• Whiskers and outliers show max/min
• The range contains 100% of the data
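The numbers behind a box plot can be sketched as follows. The heights are hypothetical, and the 1.5 × IQR cutoff for outliers is an assumption: it is a common convention, but the slides do not say which rule they use. Quartile conventions also vary; this sketch uses the "inclusive" method of statistics.quantiles.

```python
import statistics

heights = [150, 155, 160, 162, 165, 168, 170, 175, 210]   # hypothetical, in cm

# Three cut points split the data into four quartiles.
q1, q2, q3 = statistics.quantiles(heights, n=4, method="inclusive")
iqr = q3 - q1                      # spread of the middle 50% of the data

# Assumed convention: points beyond 1.5 * IQR from the box count as outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [h for h in heights if h < low_fence or h > high_fence]
inside = [h for h in heights if low_fence <= h <= high_fence]

# Whiskers stop at the most extreme values that are not outliers.
whisker_low, whisker_high = min(inside), max(inside)
data_range = max(heights) - min(heights)   # covers 100% of the data

print(q1, q2, q3, iqr, outliers, whisker_low, whisker_high, data_range)
```

With these values the upper whisker stops short of the maximum, which is the "usually" case from the slide: the maximum is drawn as a separate outlier point.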
Normal Distributions
• A family of distributions with very similar shape
• One normal distribution for each μ and σ
A Normal Distribution
• μ ("mu") = population mean
• σ ("sigma") = population standard deviation
• One normal distribution for any pair μ, σ
• Example: μ = 6 and σ = 2.2
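To make "one normal distribution for any pair μ, σ" concrete, the sketch below evaluates the standard bell-curve formula for the example values μ = 6 and σ = 2.2 and compares it with NormalDist from Python's statistics module. The function name normal_pdf is a hypothetical helper.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and std. deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 6, 2.2                          # the example values from the slide

peak = normal_pdf(mu, mu, sigma)            # the curve is tallest at x = mu
left = normal_pdf(mu - sigma, mu, sigma)
right = normal_pdf(mu + sigma, mu, sigma)   # the curve is symmetric about mu

# The standard library describes the same curve.
library_peak = NormalDist(mu, sigma).pdf(mu)
print(peak, left, right, library_peak)
```

Changing μ slides the curve left or right, and changing σ makes it wider or narrower; the overall shape stays the same.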
The Standard Normal Distribution
• μ ("mu") = population mean
• σ ("sigma") = population standard deviation
• μ = 0 and σ = 1
The Empirical Rule: 68% - 95% - 99.7%
• About 68% of the area: values within μ ± σ
• About 95% of the area: values within μ ± 2σ
• About 99.7% of the area: values within μ ± 3σ
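The empirical rule can be checked numerically by measuring the area under a normal curve between μ − kσ and μ + kσ. This is a small sketch using NormalDist from Python's statistics module; the helper name area_within is hypothetical, and its defaults correspond to the standard normal distribution (μ = 0, σ = 1).

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Fraction of a normal distribution lying within mu ± k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))

# The fractions do not depend on which normal distribution is used.
print(round(area_within(1, mu=6, sigma=2.2), 4))
```

The three areas come out close to 0.68, 0.95, and 0.997 for every choice of μ and σ, which is why the rule is stated in units of σ.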
Shape, Center, Spread
• These distributions are both positively skewed because they are right-tailed
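Skew can also be measured rather than judged by eye. The sketch below uses one common moment-based skewness formula (an assumption; the slides do not define a formula) on hypothetical right-tailed data. It also compares mean and median: in right-tailed data the mean is often pulled above the median, though that is a rule of thumb and not a guarantee.

```python
import statistics

def skewness(data):
    """Moment-based skewness: positive when the right tail is longer."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]    # hypothetical data
mirrored = [-x for x in right_tailed]             # same shape, flipped left

print(skewness(right_tailed) > 0)                 # right tail: positive skew
print(skewness(mirrored) < 0)                     # left tail: negative skew
print(statistics.mean(right_tailed) > statistics.median(right_tailed))
```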