Statistical Analysis of a sample of exam grades


Table of Contents

Introduction
Quantitative variable: central tendency measures
Quantitative variable: measures of variability
Qualitative variable: frequencies
Grades distribution
Grouped data-central tendency indicators
Grouped data-variation and asymmetry
Sampling
The analysis of variation between two intervals
Conclusion
References

Introduction

The purpose of this report is to analyze a sample of final-exam grades obtained by students of The Bucharest Academy of Economic Studies. All data were collected from the web page of the university. The report focuses mainly on presenting and explaining the measures of central tendency and variability. The following data were collected:

Table 1: Raw data
Source: www.ase.ro

Nr. crt.  Gender  Grade (xi)
67        F       9.48
68        F       8.18
69        F       9.52
70        F       9.52
71        F       9.25
72        M       8.69
73        F       9.03
74        M       8.66
75        F       9.25
76        F       8.78
77        F       8.44
78        F       9.42
79        F       8.40
80        M       8.79
81        M       8.88
82        F       9.16
83        F       9.29
84        F       8.74
85        F       7.69
86        M       8.25
87        F       8.50
88        F       9.20
89        F       9.53
90        F       9.42
91        F       9.16
92        F       9.58
93        F       8.35
94        F       9.46
95        F       8.57
96        M       9.05
97        F       8.39
98        F       9.48
99        F       9.28
100       F       8.97
101       F       9.58
102       F       9.25
103       F       9.35
104       F       9.10
105       F       8.91
106       F       8.73
107       F       9.20
108       F       9.52
109       F       8.91
110       M       9.19
111       F       9.55
112       M       9.38
113       F       9.08
114       F       9.75
115       F       9.11
116       F       9.45

Quantitative variable: central tendency measures

First, the grades of the students were considered. Since the grades differ from one student to another, it is useful to find the general level and where the majority of them lie. The arithmetic mean was computed first; it shows that the average grade of these fifty students is 9.05:

$$\bar{x} = \frac{\sum x_i}{n} = \frac{452.42}{50} = 9.05 \text{ points}$$

Afterwards, the harmonic, quadratic and geometric means were also computed, all of them having similar values:

$$\bar{x}_h = \frac{n}{\sum \frac{1}{x_i}} = \frac{50}{5.5398} = 9.03 \text{ points}$$

$$\bar{x}_q = \sqrt{\frac{\sum x_i^2}{n}} = \sqrt{\frac{4103.56}{50}} = 9.06 \text{ points}$$

$$\bar{x}_g = \sqrt[n]{\prod x_i} = 9.04 \text{ points}$$

Furthermore, the median was computed. It equals 9.16, which means that half of the students in the sample have grades below it, while the other half have grades above it. Because the data series contains an even number of observations (50), the median is the average of the two middle observations of the ordered series:

$$Me = \frac{x_{(25)} + x_{(26)}}{2} = \frac{9.16 + 9.16}{2} = 9.16 \text{ points}$$

The mode was computed next. The reported mode is 9.25 points, a value that occurs three times in the sample. Note, however, that 9.52 also occurs three times in Table 1, so strictly speaking the series has two modes, 9.25 and 9.52.

Table 2: Central tendency calculus
Source: www.ase.ro

Grade (xi)  1/xi    xi^2
7.69        0.1300  59.14
8.18        0.1222  66.91
8.25        0.1212  68.06
8.35        0.1198  69.72
8.39        0.1192  70.39
8.40        0.1190  70.56
8.44        0.1185  71.23
8.50        0.1176  72.25
8.57        0.1167  73.44
8.66        0.1155  75.00
8.69        0.1151  75.52
8.73        0.1145  76.21
8.74        0.1144  76.39
8.78        0.1139  77.09
8.79        0.1138  77.26
8.88        0.1126  78.85
8.91        0.1122  79.39
8.91        0.1122  79.39
8.97        0.1115  80.46
9.03        0.1107  81.54
9.05        0.1105  81.90
9.08        0.1101  82.45
9.10        0.1099  82.81
9.11        0.1098  82.99
9.16        0.1092  83.91
9.16        0.1092  83.91
9.19        0.1088  84.46
9.20        0.1087  84.64
9.20        0.1087  84.64
9.25        0.1081  85.56
9.25        0.1081  85.56
9.25        0.1081  85.56
9.28        0.1078  86.12
9.29        0.1076  86.30
9.35        0.1070  87.42
9.38        0.1066  87.98
9.42        0.1062  88.74
9.42        0.1062  88.74
9.45        0.1058  89.30
9.46        0.1057  89.49
9.48        0.1055  89.87
9.48        0.1055  89.87
9.52        0.1050  90.63
9.52        0.1050  90.63
9.52        0.1050  90.63
9.53        0.1049  90.82
9.55        0.1047  91.20
9.58        0.1044  91.78
9.58        0.1044  91.78
9.75        0.1026  95.06
Total
452.42      5.5398  4103.56
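The central tendency measures above can be cross-checked with a short Python sketch. It is only an illustration, not the procedure used in the report: it assumes the fifty grades of Table 1 typed in by hand (with decimal points instead of commas) and relies on the standard `statistics` module; the variable names are chosen here for readability. `multimode` returns every value tied for the highest count, which makes any tie for the mode visible.

```python
import math
import statistics

# Grades from Table 1, rows 67-116 in order (assumption: transcribed by hand).
grades = [
    9.48, 8.18, 9.52, 9.52, 9.25, 8.69, 9.03, 8.66, 9.25, 8.78,
    8.44, 9.42, 8.40, 8.79, 8.88, 9.16, 9.29, 8.74, 7.69, 8.25,
    8.50, 9.20, 9.53, 9.42, 9.16, 9.58, 8.35, 9.46, 8.57, 9.05,
    8.39, 9.48, 9.28, 8.97, 9.58, 9.25, 9.35, 9.10, 8.91, 8.73,
    9.20, 9.52, 8.91, 9.19, 9.55, 9.38, 9.08, 9.75, 9.11, 9.45,
]

n = len(grades)
arithmetic = statistics.fmean(grades)           # sum(x) / n
harmonic = statistics.harmonic_mean(grades)     # n / sum(1 / x)
geometric = statistics.geometric_mean(grades)   # n-th root of the product
quadratic = math.sqrt(sum(x * x for x in grades) / n)  # sqrt(sum(x^2) / n)
median = statistics.median(grades)              # mean of the two middle values
modes = sorted(statistics.multimode(grades))    # all most frequent values

print(f"arithmetic={arithmetic:.2f} harmonic={harmonic:.2f} "
      f"geometric={geometric:.2f} quadratic={quadratic:.2f}")
print(f"median={median:.2f} modes={modes}")
```

The four means come out in the usual order harmonic ≤ geometric ≤ arithmetic ≤ quadratic, which is a quick sanity check on the figures in Table 2.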

Quantitative variable: measures of variability

Second, the range was computed. The absolute range is 2.06 points, the difference between the highest and the lowest grade, while the relative range is 23% of the mean, which suggests that the data series is homogeneous, so the average is representative.

$$R = x_{max} - x_{min} = 9.75 - 7.69 = 2.06 \text{ points}$$

$$R\% = \frac{R}{\bar{x}} \cdot 100 = \frac{2.06}{9.05} \cdot 100 \approx 23\%$$

Moreover, the dispersion of all the observations is measured by the variance, which has the value of 0.20. The corresponding standard deviation equals 0.44. This means that, on average, the grades of the students differ (above or below) by 0.44 points from the average of 9.05. In addition, the coefficient of variation was computed; it is 5%, underlining that the data are very homogeneous and the average is representative.

$$\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n} = \frac{9.89}{50} = 0.20$$

$$\sigma = \sqrt{\sigma^2} = 0.44 \text{ points}$$

$$v = \frac{\sigma}{\bar{x}} \cdot 100 = \frac{0.44}{9.05} \cdot 100 \approx 5\%$$

Table 3: Dispersion calculus
Data source: www.ase.ro

Grade (xi)  xi-average  (xi-av)^2
7.69        -1.36       1.85
8.18        -0.87       0.75
8.25        -0.80       0.64
8.35        -0.70       0.49
8.39        -0.66       0.43
8.40        -0.65       0.42
8.44        -0.61       0.37
8.50        -0.55       0.30
8.57        -0.48       0.23
8.66        -0.39       0.15
8.69        -0.36       0.13
8.73        -0.32       0.10
8.74        -0.31       0.10
8.78        -0.27       0.07
8.79        -0.26       0.07
8.88        -0.17       0.03
8.91        -0.14       0.02
8.91        -0.14       0.02
8.97        -0.08       0.01
9.03        -0.02       0.00
9.05        0.00        0.00
9.08        0.03        0.00
9.10        0.05        0.00
9.11        0.06        0.00
9.16        0.11        0.01
9.16        0.11        0.01
9.19        0.14        0.02
9.20        0.15        0.02
9.20        0.15        0.02
9.25        0.20        0.04
9.25        0.20        0.04
9.25        0.20        0.04
9.28        0.23        0.05
9.29        0.24        0.06
9.35        0.30        0.09
9.38        0.33        0.11
9.42        0.37        0.14
9.42        0.37        0.14
9.45        0.40        0.16
9.46        0.41        0.17
9.48        0.43        0.19
9.48        0.43        0.19
9.52        0.47        0.22
9.52        0.47        0.22
9.52        0.47        0.22
9.53        0.48        0.23
9.55        0.50        0.25
9.58        0.53        0.28
9.58        0.53        0.28
9.75        0.70        0.49
Total
452.42                  9.89
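The variability measures can be reproduced in the same way. The sketch below is an illustration under stated assumptions: the grades are the hand-typed values of Table 1, and the population formulas (`pvariance`, `pstdev`, dividing by n) are used because the report divides the sum of squared deviations by 50.

```python
import statistics

# Grades from Table 1, rows 67-116 in order (assumption: transcribed by hand).
grades = [
    9.48, 8.18, 9.52, 9.52, 9.25, 8.69, 9.03, 8.66, 9.25, 8.78,
    8.44, 9.42, 8.40, 8.79, 8.88, 9.16, 9.29, 8.74, 7.69, 8.25,
    8.50, 9.20, 9.53, 9.42, 9.16, 9.58, 8.35, 9.46, 8.57, 9.05,
    8.39, 9.48, 9.28, 8.97, 9.58, 9.25, 9.35, 9.10, 8.91, 8.73,
    9.20, 9.52, 8.91, 9.19, 9.55, 9.38, 9.08, 9.75, 9.11, 9.45,
]

mean = statistics.fmean(grades)
absolute_range = max(grades) - min(grades)    # highest minus lowest grade
relative_range = absolute_range / mean        # range as a share of the mean
variance = statistics.pvariance(grades)       # population form: divides by n
std_dev = statistics.pstdev(grades)           # square root of the variance
coef_variation = std_dev / mean               # standard deviation over mean

print(f"range={absolute_range:.2f} ({relative_range:.0%} of the mean)")
print(f"variance={variance:.2f} std_dev={std_dev:.2f} cv={coef_variation:.0%}")
```

Using `statistics.variance` and `statistics.stdev` instead would divide by n - 1 (the sample form) and give slightly larger values.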

Qualitative variable: frequencies

Apart from the grades, the sample records one more characteristic: gender. Of the fifty students, only eight are male, representing 16% of the sample. At the same time, the randomly selected sample contains forty-two females, corresponding to 84% of the total number of observations.

Table 4: Gender distribution
Data source: www.ase.ro

         Absolute frequency  Relative frequency (%)
Male     8                   16%
Female   42                  84%
Total    50                  100%
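As a small illustration, the absolute and relative frequencies of Table 4 can be obtained by counting the gender column of Table 1. The string below is an assumption of this sketch: it lists the gender codes of rows 67-116 in order, typed by hand in chunks of ten.

```python
from collections import Counter

# Gender column of Table 1, rows 67-116, ten rows per chunk
# (F = female, M = male; transcribed by hand).
genders = (
    "FFFFFMFMFF"   # rows 67-76
    "FFFMMFFFFM"   # rows 77-86
    "FFFFFFFFFM"   # rows 87-96
    "FFFFFFFFFF"   # rows 97-106
    "FFFMFMFFFF"   # rows 107-116
)

absolute = Counter(genders)                    # absolute frequencies
total = sum(absolute.values())
relative = {g: 100 * count / total for g, count in absolute.items()}

for g in ("M", "F"):
    print(f"{g}: {absolute[g]} students, {relative[g]:.0f}%")
```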

Figure 1
Data source: www.ase.ro

Grades distribution

To analyze more accurately how the students are positioned in terms of their grades, the results were divided into 5 intervals (classes). The class width was rounded to 0.5 in order to deliver a more expressive interpretation of the results. As a result, one student scored between 7.50 and 8.00, six between 8.00 and 8.50, twelve between 8.50 and 9.00, twenty-three between 9.00 and 9.50, and eight scored 9.50 points or higher. The results are presented in the following histogram.

Figure 2
Data source: www.ase.ro
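The class counts behind the histogram can be checked with a minimal sketch. It assumes the hand-typed grades of Table 1 and the five classes of width 0.5 described above, each including its lower limit; `bisect_right` is just one convenient way to find the class of a grade and is not necessarily how the report did the grouping.

```python
from bisect import bisect_right

# Grades from Table 1, rows 67-116 in order (assumption: transcribed by hand).
grades = [
    9.48, 8.18, 9.52, 9.52, 9.25, 8.69, 9.03, 8.66, 9.25, 8.78,
    8.44, 9.42, 8.40, 8.79, 8.88, 9.16, 9.29, 8.74, 7.69, 8.25,
    8.50, 9.20, 9.53, 9.42, 9.16, 9.58, 8.35, 9.46, 8.57, 9.05,
    8.39, 9.48, 9.28, 8.97, 9.58, 9.25, 9.35, 9.10, 8.91, 8.73,
    9.20, 9.52, 8.91, 9.19, 9.55, 9.38, 9.08, 9.75, 9.11, 9.45,
]

# Class limits: [7.5, 8.0), [8.0, 8.5), ..., [9.5, 10.0)
edges = [7.5, 8.0, 8.5, 9.0, 9.5, 10.0]

counts = [0] * (len(edges) - 1)
for g in grades:
    # bisect_right puts a grade equal to a limit into the class that starts
    # at that limit, i.e. the lower limit is included.
    counts[bisect_right(edges, g) - 1] += 1

for lo, hi, c in zip(edges, edges[1:], counts):
    print(f"{lo:.2f}-{hi:.2f}: {c}")
```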

Grouped data-central tendency indicators

With the data presented in this way, the indicators are computed as weighted values, using the class frequencies as weights. In order to work with these intervals, the class midpoint had to be computed first. The class midpoint is the value halfway between the lower limit and the upper limit of each class:

$$x_i = \frac{x_{inf} + x_{sup}}{2}$$

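The average grade of the students in this sample is 9.06:

$$\bar{x} = \frac{\sum x_i n_i}{\sum n_i} = \frac{453}{50} = 9.06 \text{ points}$$

The median equals 9.14, which means that half of the grades are below 9.14 and the other half are above this value. The median location is:

$$L_{Me} = \frac{n + 1}{2} = \frac{50 + 1}{2} = 25.5$$

The median interval is the first interval for which the cumulated frequency is equal to or larger than the median location. For this sample, the median interval is [9.00-9.50], since 1 + 6 + 12 + 23 = 42 ≥ 25.5. With x_0 the lower limit of the median interval, h its width, F_{Me-1} the cumulated frequency before it and n_{Me} its frequency:

$$Me = x_0 + h \cdot \frac{L_{Me} - F_{Me-1}}{n_{Me}} = 9.00 + 0.5 \cdot \frac{25.5 - 19}{23} = 9.14 \text{ points}$$

The mode equals 9.21, so this is the most common grade in this sample. The modal interval is the interval with the highest frequency: [9.00-9.50]. With Δ_1 and Δ_2 the differences between the frequency of the modal interval and the frequencies of the previous and the next interval:

$$Mo = x_0 + h \cdot \frac{\Delta_1}{\Delta_1 + \Delta_2} = 9.00 + 0.5 \cdot \frac{23 - 12}{(23 - 12) + (23 - 8)} = 9.21 \text{ points}$$

Table 5: Grouped data-central tendency calculus
Data source: www.ase.ro

Grade interval (lower limit included)  No. of students (ni)  Class midpoint (xi)  xi*ni
7.50-8.00                              1                     7.75                 7.75
8.00-8.50                              6                     8.25                 49.5
8.50-9.00                              12                    8.75                 105
9.00-9.50                              23                    9.25                 212.75
9.50-10.00                             8                     9.75                 78
Total                                  50                                         453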
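The grouped mean, median and mode can be recomputed from the class frequencies of Table 5. The sketch below uses the usual interpolation formulas for grouped data, with the median location taken as (n + 1) / 2 as in the text; the `classes` list and all variable names are introduced here for illustration only.

```python
# (lower limit, upper limit, frequency) for each class of Table 5
classes = [
    (7.5, 8.0, 1),
    (8.0, 8.5, 6),
    (8.5, 9.0, 12),
    (9.0, 9.5, 23),
    (9.5, 10.0, 8),
]

n = sum(f for _, _, f in classes)

# Weighted mean of the class midpoints
mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / n

# Median: interpolate inside the first class whose cumulated frequency
# reaches the median location (n + 1) / 2.
location = (n + 1) / 2
cumulated = 0
for lo, hi, f in classes:
    if cumulated + f >= location:
        median = lo + (hi - lo) * (location - cumulated) / f
        break
    cumulated += f

# Mode: interpolate inside the class with the highest frequency, using the
# differences to the neighbouring frequencies (0 if there is no neighbour).
freqs = [f for _, _, f in classes]
k = freqs.index(max(freqs))
lo, hi, f = classes[k]
before = freqs[k - 1] if k > 0 else 0
after = freqs[k + 1] if k + 1 < len(freqs) else 0
mode = lo + (hi - lo) * (f - before) / ((f - before) + (f - after))

print(f"mean={mean:.2f} median={median:.2f} mode={mode:.2f}")
```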

Grouped data-variation and asymmetry

The variance computed for these data is 0.23 and the standard deviation is 0.48. This means that, on average, the grades obtained by the students differ (above or below) by 0.48 points from the average of 9.06. The resulting coefficient of variation is 5%, which is below 35%, showing that the data series is homogeneous and the average is representative.

$$\sigma^2 = \frac{\sum (x_i - \bar{x})^2 n_i}{\sum n_i} = \frac{11.45}{50} = 0.23$$

$$\sigma = \sqrt{\sigma^2} = 0.48 \text{ points}$$

$$v = \frac{\sigma}{\bar{x}} \cdot 100 = \frac{0.48}{9.06} \cdot 100 \approx 5\%$$

The aforementioned results were computed based on the calculations presented in the following table:

Table 6: The distribution of students-dispersion and skewness calculus
Data source: www.ase.ro

Grade interval (lower limit included)  No. of students (ni)  Class midpoint (xi)  xi-av   (xi-av)^2  (xi-av)^2*ni
7.50-8.00                              1                     7.75                 -1.31   1.72       1.72
8.00-8.50                              6                     8.25                 -0.81   0.66       3.94
8.50-9.00                              12                    8.75                 -0.31   0.10       1.15
9.00-9.50                              23                    9.25                 0.19    0.04       0.83
9.50-10.00                             8                     9.75                 0.69    0.48       3.81
Total                                  50                                                            11.45
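The dispersion figures derived from Table 6, together with the skewness indicators and the 95% confidence interval that build on them, can be reproduced with the following sketch. It is an illustration under assumptions: the class midpoints and frequencies are typed in from Table 6, the skewness coefficient is taken as (mean - mode) / standard deviation, which matches the report's 9.06 - 9.21 = -0.15, and the critical value 1.96 is the one quoted in the report.

```python
import math

# Class midpoints and frequencies from Table 6
midpoints = [7.75, 8.25, 8.75, 9.25, 9.75]
freqs = [1, 6, 12, 23, 8]

n = sum(freqs)
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n

# Weighted (population) variance, standard deviation, coefficient of variation
variance = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs)) / n
std_dev = math.sqrt(variance)
coef_variation = std_dev / mean

# Grouped mode of the modal class [9.00-9.50], frequencies 12 / 23 / 8
mode = 9.0 + 0.5 * (23 - 12) / ((23 - 12) + (23 - 8))
absolute_skew = mean - mode                 # negative: mode right of the mean
skew_coefficient = absolute_skew / std_dev

# 95% confidence interval for the population mean (population treated as
# infinite, so no finite-population correction is applied)
z = 1.96
std_error = std_dev / math.sqrt(n)
lower = mean - z * std_error
upper = mean + z * std_error

print(f"variance={variance:.2f} std_dev={std_dev:.2f} cv={coef_variation:.0%}")
print(f"skewness: absolute={absolute_skew:.2f} coefficient={skew_coefficient:.2f}")
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

Keeping the intermediate values unrounded matters here: with the standard error rounded to 0.07 the lower limit would come out as 8.92 rather than 8.93.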

Furthermore, the skewness was analyzed in order to describe the shape of the data series. The indicator of absolute skewness is equal to -0.15, meaning that the distribution is negatively skewed (skewed to the left), so the mode lies to the right of the mean. At the same time, the coefficient of skewness equals -0.32, whose absolute value is slightly above 0.3, which indicates that the asymmetry of the data series is somewhat more than moderate.

$$As = \bar{x} - Mo = 9.06 - 9.21 = -0.15$$

$$C_{as} = \frac{\bar{x} - Mo}{\sigma} = -0.32$$

Sampling

The analyzed raw data form a representative sample that was randomly chosen from the first-year students of ASE. In order to estimate the population mean, the margin of error, the lower confidence limit and the upper confidence limit were computed. All calculations were based on a 95% confidence level, with a corresponding significance level of α = 5% and a critical value of z = 1.96. The population from which the sample is drawn can be treated as infinite because the population size N (all first-year students of ASE) is larger than 20 times the sample size (N > 20*50). Because of that, the standard error of the mean is:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.48}{\sqrt{50}} = 0.07$$

The lower confidence limit:

$$\bar{x} - z \cdot \sigma_{\bar{x}} = 9.06 - 1.96 \cdot 0.07 \approx 8.93$$

The upper confidence limit:

$$\bar{x} + z \cdot \sigma_{\bar{x}} = 9.06 + 1.96 \cdot 0.07 \approx 9.19$$

Both limits are computed with the unrounded standard error (about 0.068). Hence, the confidence interval is [8.93, 9.19].

The conclusion is that the average grade of all first-year students is estimated to lie between 8.93 points and 9.19 points, with a 95% confidence level.

The analysis of variation between two intervals

When the two categories (males and females) are examined separately, the results differ from the aggregate ones.

Table 7: Variance calculus-MALES
Data source: www.ase.ro

Grade interval (lower limit included)  Males (ni1)  Class midpoint (yi1)  yi1*ni1  (yi1-av1)^2*ni1
7.50-8.00                              0            7.75                  0        0.00
8.00-8.50                              1            8.25                  8.25     0.39
8.50-9.00                              4            8.75                  35       0.06
9.00-9.50                              3            9.25                  27.75    0.42
9.50-10.00                             0            9.75                  0        0.00
Total                                  8                                  71       0.88

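For the male students:

$$\bar{y}_1 = \frac{\sum y_{i1} n_{i1}}{\sum n_{i1}} = \frac{71}{8} = 8.88 \text{ points}$$

$$\sigma_1^2 = \frac{\sum (y_{i1} - \bar{y}_1)^2 n_{i1}}{\sum n_{i1}} = \frac{0.88}{8} = 0.11$$

$$\sigma_1 = \sqrt{\sigma_1^2} = 0.33 \text{ points}$$

$$v_1 = \frac{\sigma_1}{\bar{y}_1} \cdot 100 = \frac{0.33}{8.88} \cdot 100 \approx 4\%$$

Table 8: Variance calculus-FEMALES
Data source: www.ase.ro

Grade interval (lower limit included)  Females (ni2)  Class midpoint (yi2)  yi2*ni2  (yi2-av2)^2*ni2
7.50-8.00                              1              7.75                  7.75     1.81
8.00-8.50                              5              8.25                  41.25    3.57
8.50-9.00                              8              8.75                  70       0.95
9.00-9.50                              20             9.25                  185      0.48
9.50-10.00                             8              9.75                  78       3.43
Total                                  42                                   382      10.24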
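The per-gender figures of Tables 7 and 8 can be combined into a decomposition of the total variance into a between-group (explained) part and a within-group (residual) part. The sketch below is an illustration only: the frequencies are typed in from the two tables, and the helper names `grouped_mean` and `grouped_variance` are hypothetical, introduced for this example.

```python
# Class midpoints and per-gender frequencies from Tables 7 and 8
midpoints = [7.75, 8.25, 8.75, 9.25, 9.75]
groups = {
    "male": [0, 1, 4, 3, 0],
    "female": [1, 5, 8, 20, 8],
}


def grouped_mean(freqs):
    """Weighted mean of the class midpoints."""
    return sum(y * f for y, f in zip(midpoints, freqs)) / sum(freqs)


def grouped_variance(freqs):
    """Weighted population variance of the class midpoints."""
    m = grouped_mean(freqs)
    return sum((y - m) ** 2 * f for y, f in zip(midpoints, freqs)) / sum(freqs)


# Overall distribution: add the two groups class by class
overall = [sum(fs) for fs in zip(*groups.values())]
n_total = sum(overall)
overall_mean = grouped_mean(overall)
total_variance = grouped_variance(overall)

# Within-group (residual) variance: size-weighted mean of the group variances
within = sum(grouped_variance(f) * sum(f) for f in groups.values()) / n_total

# Between-group (explained) variance: spread of the group means around the
# overall mean, weighted by group size
between = sum(
    (grouped_mean(f) - overall_mean) ** 2 * sum(f) for f in groups.values()
) / n_total

determination = between / total_variance   # share explained by gender

for name, f in groups.items():
    print(f"{name}: mean={grouped_mean(f):.2f} variance={grouped_variance(f):.2f}")
print(f"total={total_variance:.2f} between={between:.2f} within={within:.2f}")
print(f"coefficient of determination={determination:.0%}")
```

The identity total = between + within holds exactly for the unrounded values, which is the rule of dispersion the report verifies with rounded figures.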

For the female students:

$$\bar{y}_2 = \frac{\sum y_{i2} n_{i2}}{\sum n_{i2}} = \frac{382}{42} = 9.10 \text{ points}$$

$$\sigma_2^2 = \frac{\sum (y_{i2} - \bar{y}_2)^2 n_{i2}}{\sum n_{i2}} = \frac{10.24}{42} = 0.24$$

$$\sigma_2 = \sqrt{\sigma_2^2} = 0.49 \text{ points}$$

$$v_2 = \frac{\sigma_2}{\bar{y}_2} \cdot 100 = \frac{0.49}{9.10} \cdot 100 \approx 5\%$$

When comparing the two sets of results, it can be noticed that, on average, the grades of the females are higher than the grades of the males by 0.22 points. In addition, both the standard deviation and the coefficient of variation are higher for the females, showing that the female subsample has greater variability. The explained (between-group) variance is 0.01; it measures the deviation of the group means (by gender) from the overall mean. The residual (within-group) variance equals 0.22, showing the importance of factors other than gender in determining the level of the grades. Finally, the coefficient of determination equals 3%, meaning that only 3% of the total variation in the grades is explained by gender. In conclusion, the exam score depends very little on the student's gender.

$$\delta^2 = \frac{\sum (\bar{y}_j - \bar{y})^2 n_j}{\sum n_j} = 0.01$$

$$\bar{\sigma}^2 = \frac{\sum \sigma_j^2 n_j}{\sum n_j} = 0.22$$

Verifying the rule of dispersion: σ^2 = δ^2 + \bar{σ}^2, that is 0.23 = 0.01 + 0.22, so the rule is verified.

$$R^2 = \frac{\delta^2}{\sigma^2} \cdot 100 \approx 3\%$$

Conclusion

After analyzing the grades of the fifty students in the randomly chosen sample, it can be noticed that the central tendency measures are a little above 9, with an average of 9.05. This means the overall level is quite high. The range of the grades is 2.06, indicating a homogeneous set of data. The majority of the students (84%) are females, who have higher grades on average but also show more variability. The largest share of the grades lies in the [9.00-9.50] interval, with 23 students out of 50 belonging to this class. Based on the confidence interval, the average grade of all first-year students of ASE is estimated to be higher than 8.93 and lower than 9.19. Finally, it was shown that the variation in the level of grades depends very little on gender.

References:

1. www.ase.ro

2