TYPES OF DATA
IS STATISTICS 100% CORRECT?
Primary (Data Collection)
• Observation
• Experimentation
• Survey
Secondary (Data Compilation)
• Print or Electronic
DATA SOURCES
Data
• Categorical (defined categories)
  Examples: Marital Status, Political Party, Eye Color
• Numerical
  • Discrete (counted items)
    Examples: Number of Children, Defects per hour
  • Continuous (measured characteristics)
    Examples: Weight, Voltage
TYPES OF DATA
Quantitative Data (Numerical) consists of
numbers representing counts or
measurements.
Qualitative Data (Categorical) can be
separated into different categories that are
distinguished by some nonnumeric
characteristic.
DEFINITIONS
Discrete Data result when the number of
possible values is either a finite number or
a “countable” number.
Continuous Data result from infinitely
many possible values that correspond to
some continuous scale that covers a range
of values without gaps.
DEFINITIONS
A variable - a characteristic of a population or a sample, e.g.
Examination marks
Stock price
The waiting time for medical services
Data - Observed values of variables
WHAT IS A VARIABLE?
EXAMPLE
Data - Observed values of variables
46 49 46 48 45 49 46 45 47 43
45 46 44 47 44 45 49 46 42 47
46 44 42 45 46 46 42 45 41 47
48 43 43 49 40 44 46 43 45 44
41 47 43 47 48 42 44 48 48 45
Scores on a Test
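As a small illustration of "data are observed values of a variable", the sketch below tallies how often each score occurs. The scores are transcribed from the table above; the names `scores` and `tally` are chosen here for illustration only.

```python
from collections import Counter

# Test scores transcribed from the table above (50 observed values).
scores = [
    46, 49, 46, 48, 45, 49, 46, 45, 47, 43,
    45, 46, 44, 47, 44, 45, 49, 46, 42, 47,
    46, 44, 42, 45, 46, 46, 42, 45, 41, 47,
    48, 43, 43, 49, 40, 44, 46, 43, 45, 44,
    41, 47, 43, 47, 48, 42, 44, 48, 48, 45,
]

# The variable is "score on a test"; the data are its observed values.
# Counter maps each distinct score to the number of times it was observed.
tally = Counter(scores)

for value in sorted(tally):
    print(f"{value}: {tally[value]}")
```

Because a test score is a counted quantity, only a handful of distinct values appear, which is what makes a tally like this practical.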
TYPES OF VARIABLES
A. Qualitative or Attribute variable - the characteristic being studied is nonnumeric.
EXAMPLES: gender, religious affiliation, type of automobile owned, state of birth, eye color.
B. Quantitative variable - information is reported numerically.
EXAMPLES: balance in your checking account, minutes remaining in class, or number of children in a family.
QUANTITATIVE VARIABLES – Classifications
Quantitative variables can be classified as either discrete or continuous.
A. Discrete variable: can assume only certain values, and there are usually “gaps” between values.
EXAMPLE: the number of bedrooms in a house, or the number of hammers sold at the local Home Depot (1, 2, 3, …).
B. Continuous variable: can assume any value within a specified range.
EXAMPLE: the pressure in a tire, the weight of a pork chop, or the height of students in a class.
SUMMARY: TYPES OF VARIABLES
Scales of Measurement
1. Nominal Scale
Categorical/qualitative observations
Numbers are used only as labels for the categories.
Example: Single=1, Married=2
2. Ordinal Scale
Ordered categorical observations
Values are in order.
Example: Poor=1, Fair=2, Good=3
3. Interval Scale
Numerical/quantitative observations
Differences between values are meaningful.
Example: marks, temperature, IQ
4. Ratio Scale
Numerical/quantitative observations
Has an absolute zero value.
Example: weight, height, income
SCALES OF MEASUREMENT
Nominal level – data that is classified into categories and cannot be arranged in any particular order.
EXAMPLES: eye color, gender, religious affiliation.
Ordinal level – involves data arranged in some order, but the differences between data values cannot be determined or are meaningless.
EXAMPLE: During a taste test of 4 soft drinks, Mellow Yellow was ranked number 1, Sprite number 2, Seven-up number 3, and Orange Crush number 4.
Interval level – similar to the ordinal level, with the additional property that meaningful amounts of differences between data values can be determined. There is no natural zero point.
EXAMPLE: Temperature on the Fahrenheit scale.
Ratio level – the interval level with an inherent zero starting point. Differences and ratios are meaningful for this level of measurement.
EXAMPLES: Monthly income of surgeons, or distance traveled by manufacturer’s representatives per month.
Nominal Scale is characterized by data
that consists of names, labels, or
categories only.
Ordinal Scale data can be arranged in
some order, but differences between data
values either cannot be determined or are
meaningless.
DEFINITIONS
Interval Scale is like the ordinal scale, with the additional property that the difference between any two data values is meaningful. However, data at this level do not have a natural zero starting point.
Ratio Scale is similar to the interval scale, with the additional property that there is an absolute zero (where zero indicates that none of the quantity is present). In this scale ratios are meaningful.
DEFINITIONS
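The difference between the interval and ratio scales can be checked numerically. The sketch below is an illustration under stated assumptions: the temperatures (80°F and 40°F) and weights (80 lb and 40 lb) are made-up values, and the helper function names are hypothetical. It shows that a ratio of Fahrenheit temperatures changes when the same temperatures are expressed in Celsius, while a ratio of weights survives a change of unit.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9


def pounds_to_kilograms(lb):
    """Convert a weight in pounds to kilograms."""
    return lb * 0.45359237


# Interval scale: temperature has no natural zero (illustrative values).
t1_f, t2_f = 80.0, 40.0
t1_c, t2_c = fahrenheit_to_celsius(t1_f), fahrenheit_to_celsius(t2_f)

ratio_f = t1_f / t2_f  # ratio computed in Fahrenheit
ratio_c = t1_c / t2_c  # ratio of the same two temperatures in Celsius

# Differences stay meaningful: a 40-degree Fahrenheit gap converts to the
# same Celsius gap wherever it sits on the scale.
gap_high = fahrenheit_to_celsius(80.0) - fahrenheit_to_celsius(40.0)
gap_low = fahrenheit_to_celsius(60.0) - fahrenheit_to_celsius(20.0)

# Ratio scale: weight has an absolute zero (illustrative values).
w1_lb, w2_lb = 80.0, 40.0
ratio_lb = w1_lb / w2_lb
ratio_kg = pounds_to_kilograms(w1_lb) / pounds_to_kilograms(w2_lb)

print(f"Temperature ratio: {ratio_f:.1f} in Fahrenheit, {ratio_c:.1f} in Celsius")
print(f"Temperature gaps in Celsius: {gap_high:.2f} and {gap_low:.2f}")
print(f"Weight ratio: {ratio_lb:.1f} in pounds, {ratio_kg:.1f} in kilograms")
```

The temperature ratio depends on where the scale places its zero, so "twice as hot" is not a meaningful statement, whereas "twice as heavy" holds in any unit.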
SUMMARY: SCALES OF MEASUREMENT
Ratio/Interval data
Age    Income
55     75000
42     68000
.      .
.      .
Weight gain
+10
+5
.
.
Nominal
Person      Marital status
Ahmad       married
Siva        single
Ah Keong    single
.           .
.           .
Computer    Brand
1           IBM
2           Dell
3           IBM
.           .
.           .
EXAMPLES
Nominal
With nominal data, all we can do is calculate the proportion of data that falls into each category.
Brand      IBM    Dell   Compaq   Other   Total
Count      25     11     8        6       50
Percent    50%    22%    16%      12%
EXAMPLES
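The proportion calculation for nominal data can be sketched in a few lines. The counts are the brand frequencies from the example above; the variable names are chosen here for illustration.

```python
# Brand frequencies from the example above.
counts = {"IBM": 25, "Dell": 11, "Compaq": 8, "Other": 6}

# Total number of observations across all categories.
total = sum(counts.values())

# Proportion of observations that fall into each category.
proportions = {brand: n / total for brand, n in counts.items()}

for brand, share in proportions.items():
    print(f"{brand}: {counts[brand]} of {total} = {share:.0%}")
```

Note that nothing in the calculation depends on the order of the brands or on any number assigned to them, which is why it is valid for nominal data.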
Knowing the type of data is necessary for selecting a suitable technique to analyze it.
Type of analysis allowed for each type of data:
Ratio/Interval data – arithmetic calculations, such as the average
67, 74, 71, 83, 93, 55, 48, 82, 68, 62
Average = 70.3
Nominal data – counting the number of observations (the frequency) in each category
Single: 1, Married: 2, Divorced: 3, Widowed: 4
Data record: 1, 2, 2, 2, 4, 1, 2, 2, 1, 3
Average = 2.0. Does this mean the average person is married? No: the codes are arbitrary labels, so their average has no meaning.
TYPES OF DATA AND TYPES OF ANALYSIS
Solution for nominal data:
Category    Code   Frequency
Single      1      3
Married     2      5
Divorced    3      1
Widowed     4      1
Ordinal data – computations based on an ordering process
TYPES OF DATA AND TYPES OF ANALYSIS
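The contrast between the two kinds of analysis can be reproduced directly from the numbers given for the examination marks and the marital status record. This is a minimal sketch; the variable names are illustrative, and the data are the values quoted above.

```python
from collections import Counter
from statistics import mean

# Ratio/interval data: arithmetic such as the average is valid.
marks = [67, 74, 71, 83, 93, 55, 48, 82, 68, 62]
mean_mark = mean(marks)

# Nominal data: the numbers are only codes for categories.
codes = {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"}
record = [1, 2, 2, 2, 4, 1, 2, 2, 1, 3]

# Python will happily average the codes, but the result has no meaning:
# relabelling the categories would change it.
mean_code = mean(record)

# The valid summary is the frequency of each category.
frequency = Counter(codes[c] for c in record)

print(f"Average mark: {mean_mark}")
print(f"Average code (meaningless): {mean_code}")
for category in codes.values():
    print(f"{category}: {frequency[category]}")
```

The software cannot tell which calculation is appropriate; that judgment depends on knowing the type of data.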
Ratio/Interval*
Values are real numbers.
All calculations are valid.
Data may be treated as ordinal or nominal.
Example: Examination marks
Ordinal
Values must represent the ranked order of the data.
Calculations based on an ordering process are valid.
Data may be treated as nominal but not as interval.
Nominal
Values are arbitrary numbers that represent categories.
Only calculations based on the frequencies of occurrence are valid.
Data may not be treated as ordinal or interval.
*Higher-level data types may be treated as lower-level ones.
HIERARCHY OF DATA
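The footnote on the hierarchy can be illustrated by stepping examination marks down the levels. In this sketch the marks are the ten values used earlier in the deck; the pass mark of 50 and the "pass"/"fail" labels are hypothetical, introduced only to form categories.

```python
# Ratio/interval data: examination marks.
marks = [67, 74, 71, 83, 93, 55, 48, 82, 68, 62]

# Treated as ordinal: keep only the rank order (1 = highest mark).
# The marks here are all distinct, so index() gives each mark a unique rank.
ordered = sorted(marks, reverse=True)
ranks = [ordered.index(m) + 1 for m in marks]

# Treated as nominal: keep only a category label.
PASS_MARK = 50  # hypothetical cut-off, not from the source
labels = ["pass" if m >= PASS_MARK else "fail" for m in marks]

print("Marks: ", marks)
print("Ranks: ", ranks)
print("Labels:", labels)
```

Each step discards information: the ranks cannot be turned back into marks, and the labels cannot be turned back into ranks, which is why lower-level data may not be treated as higher-level data.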
Published data is often a preferred source of data due to its low cost and convenience.
Published data is found as printed material, tapes, disks, and on the Internet.
• Data published by the organization that collected it is called PRIMARY DATA.
  For example: data published by the US Bureau of the Census.
• Data published by an organization different from the one that collected it is called SECONDARY DATA.
  For example:
  • The Statistical Abstract of the United States compiles data from primary sources.
  • Compustat sells a variety of financial data tapes compiled from primary sources.
PUBLISHED DATA
When published data is unavailable, one needs to conduct a study to generate the data.
An observational study is one in which measurements representing a variable of interest are observed and recorded, without controlling any factor that might influence their values.
An experimental study is one in which measurements representing a variable of interest are observed and recorded, while controlling factors that might influence their values.
OBSERVATIONAL or EXPERIMENTAL
Statistical Studies
Do you make observations only, or do you modify the subjects?
• Observations only: Observational study
  When are the observations made?
  • Past: Retrospective study
  • At one point: Cross-sectional study
  • Future: Prospective study
• Modify the subjects: Experiment
  Design:
  1. Control effects of variables
  2. Use replication
  3. Use randomization
STATISTICAL STUDIES
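The randomization step of an experimental design might look like the sketch below. It is one possible way to do it, not a prescribed method: the function name `randomize`, the subject labels, the group size of 20, and the fixed seed are all assumptions made for illustration.

```python
import random


def randomize(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group.

    Hypothetical helper: shuffles a copy of the subjects and cuts it in half.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Twenty hypothetical subjects; several subjects per group gives replication.
subjects = [f"subject_{i}" for i in range(1, 21)]

# A fixed seed makes this illustration repeatable.
treatment, control = randomize(subjects, seed=0)

print("Treatment:", treatment)
print("Control:  ", control)
```

Letting chance decide the assignment keeps the experimenter's own choices from influencing which subjects receive the treatment.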
A Voluntary Response Sample (or self-selected sample) is one in which the respondents themselves decide whether to be included in the sample.
A voluntary response sample might not be representative of the intended population.
DEFINITIONS
Surveys solicit information from people.
Surveys can be conducted by means of:
• personal interview
• telephone interview
• self-administered questionnaire
SURVEYS
A good questionnaire must be well designed:
• Keep the questionnaire as short as possible.
• Ask short, simple, and clearly worded questions.
• Start with demographic questions to help respondents get started comfortably.
• Use dichotomous and multiple-choice questions.
• Use open-ended questions cautiously.
• Avoid using leading questions.
• Pretest a questionnaire on a small number of people.
• Think about the way you intend to use the collected data when preparing the questionnaire.
QUESTIONNAIRE