Descriptions of data statistics for research
Uploaded by harve-abella
Category: Technology
Descriptions of Data
Measures of Central Tendency
Definition: A measure of central tendency is a statistic calculated from a set of observations or scores and designed to typify or represent that series. It is also defined as the tendency of observations or cases to cluster about a point, with respect either to an absolute value or to a frequency of occurrence; usually, but not necessarily, about midway between the extreme high and the extreme low values in the distribution.
The Mean
Definition: The arithmetic mean or simply the mean is the average of a group of measures.
Characteristics of the mean
1. The arithmetic mean, or simply the mean, is the center of gravity or balance point of a group of measures.
2. The mean is easily affected by a change in the magnitude of any of the measures.
3. The mean is the most reliable measure of central tendency because it is always the center of gravity of any group of measures.
Uses of the Mean
Compute the mean when
1. the mean of a group of measures is needed.
2. the center of gravity or balance point of a group of measures is wanted.
3. every measure should have an effect upon the measure of central tendency.
4. the most reliable measure of central tendency is desired.
5. the group from which the mean has been derived is more or less homogeneous and a more realistic mean is desired. For instance, the mean of the measures 11, 12, 13, 50, and 64 is 30, which is far from every one of the measures and therefore not realistic.
6. other statistical measures involving the mean are to be computed. Examples of such measures are the standard deviation, coefficient of correlation, critical ratio, etc.
Definition: The arithmetic mean, or simply the mean, of a data set is the sum of the values divided by the number of values. That is, if $X_1, X_2, \ldots, X_N$ are the individual scores in a population of size $N$, then the population mean is defined as:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
Definition: If $X_1, X_2, \ldots, X_n$ are the individual scores in a sample of size $n$, then the sample mean is defined as:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Example 1: Find the mean of the following scores: 4, 10, 7, 5, 9, 7.
Example 2: A sample of n = 6 scores has a mean of M = 40. One new score is added to the sample and the new mean is found to be M = 42. What can you conclude about the value of the new score?
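Both examples follow directly from the definition of the sample mean. The short Python sketch below is only an illustration (the variable names are ours, not part of the slides): Example 1 divides the sum by the count, and Example 2 recovers the new score as the difference between the new sum and the old sum.

```python
# Example 1: the mean is the sum of the scores divided by their count.
scores = [4, 10, 7, 5, 9, 7]
mean_1 = sum(scores) / len(scores)

# Example 2: a mean of 40 over 6 scores implies a sum of 240;
# a mean of 42 over 7 scores implies a sum of 294.
n_old, mean_old = 6, 40
n_new, mean_new = 7, 42
new_score = n_new * mean_new - n_old * mean_old

print(mean_1)     # 7.0
print(new_score)  # 54
```

Because the new score (54) is above the old mean, it pulls the mean upward.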
Definition: For grouped data, that is, data placed in a frequency distribution table, the mean can be approximated by the following formula:
$$\mu = \frac{\sum fX}{N} \quad \text{or} \quad \bar{X} = \frac{\sum fX}{n}$$
Example: Consider the following frequency distribution table of the 15 graduate behavioral statistics students.
Classes Frequency
10 – 19 5
20 – 29 4
30 – 39 3
40 – 49 2
50 – 59 1
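A possible way to work this example is sketched below in Python. It assumes that X in the grouped-data formula is the class mark (the midpoint of each class), which is the usual convention but is not stated on the slide; the tuple layout and variable names are illustrative only.

```python
# Each tuple is (lower class limit, upper class limit, frequency).
classes = [(10, 19, 5), (20, 29, 4), (30, 39, 3), (40, 49, 2), (50, 59, 1)]

# n is the total frequency.
n = sum(f for _, _, f in classes)

# Assumption: X is the class mark, i.e. the midpoint (lower + upper) / 2.
sum_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)

grouped_mean = sum_fx / n
print(n, sum_fx, round(grouped_mean, 2))  # 15 417.5 27.83
```

The result is an approximation because every score in a class is treated as if it were equal to the class mark.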
The Weighted Mean
Definition: The Weighted Mean is a variation of the arithmetic mean which assigns weight to the individual scores in a data set.
$$\bar{X}_W = \frac{\sum_{i=1}^{n} W_i X_i}{\sum_{i=1}^{n} W_i}$$
where $\bar{X}_W$ - the weighted mean
$W_i$ - the weight
$X_i$ - the individual scores
$n$ - number of cases
Example: Suppose we have determined the digit span for a brief time period in thirty-seven 4-year-olds. What is the mean digit span for our sample?
X f
6 2
5 7
4 17
3 5
2 3
1 2
0 1
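Treating each frequency f as the weight of its score X, the weighted mean formula can be applied to this table as in the following sketch (Python; the dictionary layout is just one convenient representation).

```python
# Digit span X mapped to its frequency f, copied from the table.
digit_span = {6: 2, 5: 7, 4: 17, 3: 5, 2: 3, 1: 2, 0: 1}

total_f = sum(digit_span.values())                 # sum of the weights
sum_fx = sum(x * f for x, f in digit_span.items()) # sum of weight * score

mean_digit_span = sum_fx / total_f
print(total_f, sum_fx, round(mean_digit_span, 2))  # 37 138 3.73
```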
Example: Consider the following item in a questionnaire.
Do you agree that the RH bill should be implemented?
Please check your attitude.
_____ Strongly agree
_____ Agree
_____ Fairly agree
_____ Disagree
_____ Strongly disagree
Suppose 10 individuals were asked to answer the preceding question and the following responses are obtained:
3 – Strongly agree, 4 – Agree, 2 – Disagree, and 1 – Strongly disagree, where each number is the count of respondents who chose that option. What is the average numerical response and its categorical equivalent?
Note: Consider the following hypothetical mean ranges for categorical responses on a 5-point scale:
4.20 - 5.00 - Strongly Agree
3.40 - 4.19 - Agree
2.60 - 3.39 - Fairly Agree
1.80 - 2.59 - Disagree
1.00 - 1.79 - Strongly Disagree
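The questionnaire example can be worked as a weighted mean and then mapped onto the ranges above. The sketch below (Python) assumes the coding 5 = Strongly agree down to 1 = Strongly disagree, which is implied by the range table but not spelled out on the slides; the helper name `categorize` is hypothetical.

```python
# Assumed numeric codes for the five options (5 = most favorable).
codes = {"Strongly agree": 5, "Agree": 4, "Fairly agree": 3,
         "Disagree": 2, "Strongly disagree": 1}

# Number of respondents per option, from the example (10 in total).
counts = {"Strongly agree": 3, "Agree": 4, "Fairly agree": 0,
          "Disagree": 2, "Strongly disagree": 1}

total = sum(counts.values())
weighted_mean = sum(codes[k] * counts[k] for k in counts) / total

# Lower bound of each hypothetical mean range, highest first.
ranges = [(4.20, "Strongly Agree"), (3.40, "Agree"), (2.60, "Fairly Agree"),
          (1.80, "Disagree"), (1.00, "Strongly Disagree")]

def categorize(mean):
    """Return the label of the first range whose lower bound the mean reaches."""
    for lower, label in ranges:
        if mean >= lower:
            return label
    raise ValueError("mean is below the scale minimum of 1.00")

print(weighted_mean, categorize(weighted_mean))  # 3.6 Agree
```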
The Median
Definition: The median is the middlemost value in an ordered sequence of data.
Remark: The median is unaffected by any extreme observations in a set of data and hence, whenever an extreme observation is present, it is appropriate to use the median rather than the mean to describe a set of data.
Statistical Treatment: For an even number of observations:
$$Md = \frac{X_{\frac{n}{2}} + X_{\frac{n+2}{2}}}{2}$$
For an odd number of observations:
$$Md = X_{\frac{n+1}{2}}$$
Example: A manufacturer of flashlight batteries took a sample of 13 from a day’s production and burned them continuously until they failed. The number of hours they burned were
342 426 317 545 264 451 1049
631 512 266 492 562 298.
Determine the median.
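The two median rules (even and odd number of observations) can be sketched in Python as follows. The function name `median` is ours; positions in the formulas are 1-based, so the code subtracts 1 when indexing the sorted list.

```python
def median(values):
    """Median of a list using the even/odd position rules."""
    ordered = sorted(values)          # the data must be placed in an array first
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th ordered observation.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the (n / 2)-th and ((n + 2) / 2)-th observations.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

hours = [342, 426, 317, 545, 264, 451, 1049,
         631, 512, 266, 492, 562, 298]
print(median(hours))  # 451
```

With 13 batteries the median is the 7th ordered value, so the extreme reading of 1049 hours has no influence on it.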
Example: The following data are the amount of calories in a 30-gram serving for a random sample of 10 types of fresh-baked chocolate chip cookies.
_______________________________________________
Product Calories
_______________________________________________
Hillary Rodham Clinton’s 153
Original Nestle Toll House 152
Mrs. Fields 146
Stop and Shop 138
Duncan Hines 130
David’s 146
David’s Chocolate Chunk 149
Great American Cookie Company 138
What is the median amount of calories?
The Mode
Definition: The mode is the value in a set of data that appears most frequently. It may be obtained from an ordered array.
Remark: Unlike the arithmetic mean, the mode is not affected by the occurrence of any extreme values. However, the mode is used only for descriptive purposes because it is more variable from sample to sample than other measures of central tendency.
Example: Consider the out-of-state tuition rates for the six-school sample from Pennsylvania.
4.9 6.3 7.7 8.9 7.7 10.3 11.7
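A minimal Python sketch for finding the mode of the values exactly as listed is given below. The helper `modes` is hypothetical; it returns a list because a data set may have more than one mode.

```python
from collections import Counter

def modes(values):
    """Return every value that attains the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

tuition = [4.9, 6.3, 7.7, 8.9, 7.7, 10.3, 11.7]
print(modes(tuition))  # [7.7]
```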
The Midrange
Definition: The midrange is the average of the smallest and largest observations in a set of data.
Statistical Treatment:
$$\text{Midrange} = \frac{X_{smallest} + X_{largest}}{2}$$
Remark: The midrange is often used as a summary measure both by financial analysts and by weather reporters, since it can provide an adequate, quick, and simple measure to characterize the entire data set, be it a series of daily closing stock prices over a whole year or a series of recorded hourly temperature readings over a whole day.
Note: In dealing with data such as daily closing stock prices or hourly temperature readings, an extreme value is not likely to occur. Nevertheless, in most applications, despite its simplicity, the midrange must be used cautiously.
Remark: The midrange becomes distorted as a summary measure of central tendency if an outlier is present.
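The distortion can be illustrated with the battery data from the median example. In the sketch below (Python), treating the 1049-hour reading as the outlier is our assumption for the sake of illustration.

```python
def midrange(values):
    """Average of the smallest and largest observations."""
    return (min(values) + max(values)) / 2

hours = [342, 426, 317, 545, 264, 451, 1049,
         631, 512, 266, 492, 562, 298]

with_extreme = midrange(hours)
# Drop the single extreme reading (assumed outlier) and recompute.
without_extreme = midrange([h for h in hours if h != 1049])

print(with_extreme)     # 656.5
print(without_extreme)  # 447.5
```

One extreme value moves the midrange by more than 200 hours, while the median of the same data stays at 451.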
Measures of Non-central Location
Definition: The measures of non-central location, or fractiles, are values below which a specified fraction or percentage of the observations in a data set must fall.
Remark: The measures of non-central location are employed particularly when summarizing or describing the properties of large sets of numerical data.
Types of Fractiles
Definition: The percentiles are the 99 score points which divide a distribution of scores into 100 equal parts.
Notation: $P_i$, where $i = 1, 2, 3, \ldots, 99$
Ungrouped Data:
Formula:
$$P_i = \text{value of the } \left[\frac{i(n+1)}{100}\right]^{\text{th}} \text{ observation of the data set placed in array}$$
where i = 1, 2, 3, . . . , 99.
Grouped Data:
$$P_i = LCB_{P_i} + c\left(\frac{\frac{in}{100} - CF_{pre}}{f}\right)$$
where $LCB_{P_i}$ is the lower class boundary of the percentile class, $c$ is the class size, $CF_{pre}$ is the cumulative frequency preceding the percentile class, and $f$ is the frequency of the percentile class.
Definition: The deciles are the 9 score points which divide the array of observations into 10 equal parts.
Ungrouped Data:
$$D_i = \text{value of the } \left[\frac{i(n+1)}{10}\right]^{\text{th}} \text{ score}$$
where i = 1, 2, 3, . . . , 9
Grouped Data:
$$D_i = LCB_{D_i} + c\left(\frac{\frac{in}{10} - CF_{pre}}{f}\right)$$
Definition: The quartiles are the 3 score points which divide the array of observations into 4 equal parts.
Ungrouped Data:
$$Q_i = \text{value of the } \left[\frac{i(n+1)}{4}\right]^{\text{th}} \text{ observation of the data set placed in array}$$
where i = 1, 2, 3
Grouped Data:
$$Q_i = LCB_{Q_i} + c\left(\frac{\frac{in}{4} - CF_{pre}}{f}\right)$$
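For ungrouped data, the percentile, decile, and quartile rules differ only in the number of parts (100, 10, or 4). The Python sketch below uses one hypothetical helper, `fractile`, for all three. The slides do not say what to do when the position i(n + 1)/parts is not a whole number; linear interpolation between the two neighboring observations is assumed here.

```python
def fractile(values, i, parts):
    """i-th fractile when the data are split into `parts` equal parts.

    Uses the 1-based position i * (n + 1) / parts in the sorted data.
    Assumption: fractional positions are resolved by linear interpolation.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = i * (n + 1) / parts
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part of the position
    if lower < 1:                    # position falls before the first value
        return ordered[0]
    if lower >= n:                   # position falls at or after the last value
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

hours = [342, 426, 317, 545, 264, 451, 1049,
         631, 512, 266, 492, 562, 298]

q1 = fractile(hours, 1, 4)      # position 3.5
q3 = fractile(hours, 3, 4)      # position 10.5
d5 = fractile(hours, 5, 10)     # position 7
p50 = fractile(hours, 50, 100)  # position 7
print(q1, q3, d5, p50)          # 307.5 553.5 451.0 451.0
```

The fifth decile and the fiftieth percentile land on the same position as the median, as expected.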
Measures of Variation
Definition: Variation is the amount of dispersion or “spread” in the data.
Types of Measures of Variation
I. The Range – the difference between the largest and smallest observations in a set of data.
Range = $X_{largest} - X_{smallest}$
Remark: The range measures the total spread in the set of data. Although the range is a simple measure of total variation in the data, its distinct weakness is that it does not take into account how the data are actually distributed between the smallest and largest values.
The Interquartile Range
Definition: The interquartile range (also called midspread) is the difference between the third and first quartiles in a set of data.
Interquartile Range = $Q_3 - Q_1$
The Variance and the Standard Deviation
- the measures of variation that take into account how all the values in the data set are distributed.
- the measures evaluate how the values fluctuate about the mean.
Statistical Treatment:
Population Standard Deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Population Variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
Sample Standard Deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Sample Variance:
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Computational Formula:
$$s^2 = \frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}$$
$$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}}$$
Example: Consider again the out-of-state tuition rates for the six-school sample from Pennsylvania.
4.9 6.3 7.7 8.9 7.7 10.3 11.7
Determine the following:
1. Range
2. Interquartile Range
3. Standard Deviation
4. Variance
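The four quantities can be computed from the values as listed with the sketch below (Python). It treats the data as a sample, so the divisor is n - 1, and it locates the quartiles with the i(n + 1)/4 position rule; interpolating at fractional positions is our assumption, although for these seven values both positions are whole numbers. All names are illustrative.

```python
import math

tuition = [4.9, 6.3, 7.7, 8.9, 7.7, 10.3, 11.7]
ordered = sorted(tuition)
n = len(ordered)

def quartile(i):
    """i-th quartile via the 1-based position i * (n + 1) / 4."""
    position = i * (n + 1) / 4
    lower = int(position)
    fraction = position - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    # Assumption: interpolate linearly when the position is fractional.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

data_range = ordered[-1] - ordered[0]
iqr = quartile(3) - quartile(1)

mean = sum(tuition) / n
# Definitional formula for the sample variance.
variance = sum((x - mean) ** 2 for x in tuition) / (n - 1)
# Computational formula; it should agree up to rounding error.
variance_alt = (n * sum(x * x for x in tuition) - sum(tuition) ** 2) / (n * (n - 1))
sd = math.sqrt(variance)

print(round(data_range, 2), round(iqr, 2))  # 6.8 4.0
print(round(variance, 3), round(sd, 3))     # 5.358 2.315
```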
The Coefficient of Variation
Definition: The coefficient of variation is a relative measure of variation. It is expressed as a percentage rather than in terms of the units of the particular data.
Statistical Treatment:
$$CV = \left(\frac{s}{\bar{X}}\right) 100\%$$
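As an illustration, the coefficient of variation of the tuition values used in the earlier examples can be computed with Python's standard `statistics` module, where `stdev` is the sample standard deviation (divisor n - 1). Reusing that data set here is our choice, not part of the slide.

```python
import statistics

tuition = [4.9, 6.3, 7.7, 8.9, 7.7, 10.3, 11.7]

s = statistics.stdev(tuition)     # sample standard deviation
x_bar = statistics.mean(tuition)  # sample mean

cv = s / x_bar * 100              # expressed as a percentage
print(round(cv, 2))               # 28.18
```

Because the units cancel, the CV allows the spread of data sets measured in different units to be compared.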
Measures of Skewness
Definition: The measures of skewness show the degree of symmetry or asymmetry of a distribution and also indicate the direction of skewness.
Types of Skewness
I. Positively Skewed – has a longer tail to the right.
- more concentration of values below than above the mean.
- $Mo < Md < \bar{X}$
II. Negatively Skewed – has a longer tail to the left.
- more concentration of values above than below the mean.
- $\bar{X} < Md < Mo$
Pearson’s Coefficient of Skewness - used to determine the direction of skewness.
Remark: a) If SK > 0, then the distribution is skewed to the right.
b) If SK < 0, then the distribution is skewed to the left.
c) If SK = 0, then the distribution is symmetric.
Example: Consider again the out-of-state tuition rates for the six-school sample from Pennsylvania.
4.9 6.3 7.7 8.9 7.7 10.3 11.7
Determine the direction of skewness of the preceding data.
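The formula for Pearson's coefficient did not survive in this transcript, so the sketch below assumes the commonly used median-based version, SK = 3(mean - median) / s. If the original slide used the mode-based version instead, the numeric value would differ. The code is Python and the names are illustrative.

```python
import statistics

tuition = [4.9, 6.3, 7.7, 8.9, 7.7, 10.3, 11.7]

x_bar = statistics.mean(tuition)
md = statistics.median(tuition)
s = statistics.stdev(tuition)  # sample standard deviation

# Assumed formula: Pearson's median-based coefficient of skewness.
sk = 3 * (x_bar - md) / s

if sk > 0:
    direction = "skewed to the right"
elif sk < 0:
    direction = "skewed to the left"
else:
    direction = "symmetric"

print(round(sk, 3), direction)  # 0.667 skewed to the right
```

The mean (about 8.21) lies above the median (7.7), which agrees with the ordering given for a positively skewed distribution.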
Measures of Kurtosis
Definition: The measures of kurtosis show the relative flatness or peakedness of a distribution.
Types of Kurtosis
I. Platykurtic – a distribution which is relatively flat.
II. Mesokurtic – a distribution which is between platykurtic and leptokurtic.
III. Leptokurtic – an unusually peaked distribution.
Coefficient of Kurtosis – used to determine the relative flatness or peakedness of a distribution.
Statistical Treatment:
$$Ku = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^4}{n s^4}$$
Remark: a) If Ku = 3, then the distribution is mesokurtic.
b) If Ku > 3, then the distribution is leptokurtic.
c) If Ku < 3, then the distribution is platykurtic.
Example: Consider again the out-of-state tuition rates for the six-school sample from Pennsylvania.
4.9 6.3 7.7 8.9 7.7 10.3 11.7
Determine the type of kurtosis of the preceding data.
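A sketch of the kurtosis computation for the listed values follows (Python). It assumes the fourth-moment coefficient, the sum of the fourth powers of the deviations divided by n times s to the fourth power, because that is the form for which 3 is the mesokurtic reference value in the remark. It also assumes s is the sample standard deviation.

```python
import statistics

tuition = [4.9, 6.3, 7.7, 8.9, 7.7, 10.3, 11.7]

n = len(tuition)
x_bar = statistics.mean(tuition)
s = statistics.stdev(tuition)  # assumed: sample standard deviation

# Assumed fourth-moment coefficient of kurtosis.
ku = sum((x - x_bar) ** 4 for x in tuition) / (n * s ** 4)

if ku > 3:
    shape = "leptokurtic"
elif ku < 3:
    shape = "platykurtic"
else:
    shape = "mesokurtic"

print(round(ku, 2), shape)  # 1.5 platykurtic
```

With only seven observations the coefficient is a rough description at best, so the classification should be read with caution.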