STATISTICS AND PROBABILITY
Lecture Series 1
Edilberto G. Tripoli, MIEM2010
Statistics
- It is a branch of applied mathematics concerned with the collection, analysis, and interpretation of quantitative as well as qualitative data, and with the use of probability theory to estimate population parameters.
- It is a branch of applied mathematics concerned with the methods of collecting, organizing, summarizing, presenting, and analyzing data, drawing conclusions, and making decisions on the basis of the collected data.
Importance of Statistics
- It is used in the quantitative approaches of engineering, the sciences, business, and other activities such as pollution control, inventory and logistics planning, traffic management, behavioral analysis, etc.
- It is used to optimize the collection and processing of enormous amounts of data, which are costly and sometimes become useless if not properly implemented.
E. Tripoli FEU-EAC
Types of Statistics
1. Descriptive – method of organizing, displaying, and describing data using tables, graphs, charts, text, and summary measures.
2. Inferential – method that uses sample results to help make predictions about a population.
Nature of Statistical Data
1. Nominal – data that are numerical in name only and are used for identification purposes only.
Example: 1 – Single, 2 – Married, 3 – Widowed, 4 – Separated
2. Ordinal – ranked data, or a collection of data in which the order of the values matters.
BASIC CONCEPTS IN STATISTICS
Example: Mohs scale of mineral hardness
1 – Talc        6 – Feldspar
2 – Gypsum      7 – Quartz
3 – Calcite     8 – Topaz
4 – Fluorite    9 – Sapphire
5 – Apatite    10 – Diamond
True: Diamond is the hardest mineral.
False: Diamond is twice as hard as apatite.
3. Interval – data where the difference between two values can be specified, calculated, and interpreted.
Example: Temperature
63°F, 68°F, 91°F, 107°F, 120°F, 131°F
True: 131°F is hotter than 107°F; 63°F is colder than 68°F.
False: 126°F is twice as hot as 63°F.
4. Ratio – data measured from an absolute zero, which includes all measurements that can form a meaningful quotient.
Example: length, weight, volume, cost, pressure, velocity, etc.
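The difference between interval and ratio data can be checked numerically. The sketch below is an illustration, not part of the lecture: it converts the two Fahrenheit readings from the temperature example to kelvins, a scale that starts at absolute zero, and compares the quotients. The function name `fahrenheit_to_kelvin` is chosen here for illustration.

```python
def fahrenheit_to_kelvin(temp_f: float) -> float:
    """Convert degrees Fahrenheit to kelvins (a scale with an absolute zero)."""
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15


# Quotient of the raw Fahrenheit readings (interval scale).
ratio_fahrenheit = 126.0 / 63.0

# Quotient of the same two temperatures on an absolute scale (ratio scale).
ratio_kelvin = fahrenheit_to_kelvin(126.0) / fahrenheit_to_kelvin(63.0)

print(f"Quotient of Fahrenheit readings:     {ratio_fahrenheit:.2f}")
print(f"Quotient of absolute temperatures:   {ratio_kelvin:.2f}")
```

The Fahrenheit quotient is exactly 2, but on the absolute scale the quotient is only about 1.12, which is why "126°F is twice as hot as 63°F" is marked False.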
Variable – a symbol that can assume any of a prescribed set of values from a domain.
Types of Variables
1. Continuous Variable – a variable which can assume any value between two given values. Represents measured data such as heights, weights, temperatures, or distances.
Example: The age of individuals
True: 3 ½ years old, 75.9 years old, …
False: 2 years, 10 years, …
Example: Weight of a body
True: 50 kgs., 170 lbs.
False: heavy weight, light weight, paper weight
2. Discrete Variable – a variable which can assume only separate, countable values; it cannot take fractional values.
Example: The number of children in a family
True: 1, 2, 3, …
False: 2.5, 1/3, 3.75, …
Example: Dimension of a table
True: 2 x 2 x 3
False: 12 ft³
Population – a set of data which consists of all possible observations of a certain phenomenon or case.
A population can be:
1. Finite – for example, bolts produced by a manufacturer for a period of time.
2. Infinite – for example, possible outcomes (heads, tails) in successive tosses of a coin.
Sample – a set of data that contains a small part of the population.
where
n = sample size
N = population size
e = margin of error
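The symbols n, N, and e are the ones used in Slovin's formula, n = N / (1 + N e²), which is commonly taught for choosing a sample size. Assuming that this is the formula intended here, a minimal sketch follows. The function name `slovin_sample_size` and the values N = 1000 and e = 0.05 are hypothetical, chosen only for illustration.

```python
import math


def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Sample size by Slovin's formula, n = N / (1 + N * e**2).

    Assumption: this is the formula the symbols n, N, and e refer to.
    The result is rounded up so the sample is never smaller than required.
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be between 0 and 1")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)


# Hypothetical example: population of 1000 with a 5% margin of error.
sample_size = slovin_sample_size(1000, 0.05)
print(sample_size)
```

With these hypothetical inputs, 1000 / (1 + 1000 × 0.0025) = 285.7, so the sample size is rounded up to 286.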
DATA PRESENTATION
Methods for Data Collection
1. Survey form
2. Interview
3. Focus Group
4. Experiments
5. Registrations
Steps in Data Collection
1. Determine what type of data is needed
2. Select the appropriate data sources
3. Select the technique or method to be used for gathering data
4. Identify the key performance parameters of the target population
5. Make a data collection plan
Data can be presented in the form of a:
1. Table - A set of data arranged in rows and columns
2. Graph - A drawing illustrating the relations between certain quantities plotted with reference to a set of axes
3. Chart - A visual display of information
4. Textual presentation – presentation of data in text.
Commonly used illustrations for presenting data:
1. Bar graph
2. Line graph
3. Pie chart
4. Pictogram
Frequency Distribution – a tabular arrangement of data by classes together with the corresponding frequencies.
Raw Data – collected data which have not been organized.
Arrays – arrangement of raw data in ascending or descending order.
Range of Data – the difference between the largest and the smallest number.
Steps in creating frequency distribution table:
1. Determine the largest and smallest number in raw data.
2. Find the range.
3. Divide the range into a convenient number of class intervals having the same sizes.
4. Count the number of observations falling into each class interval.
5. Stick counting (tally marks) is recommended for convenience in making the tally.
Example: Illustrate the data collected as shown in the table below.
Amount of Sulfur Oxides (in tons) Emitted by an Industrial Plant in 80 days
Range
R = highest score – lowest score = 31.8 – 6.2 = 25.6
Herbert Sturges' formula for the number of class intervals
k = 1 + 3.322 log n, where n (or N) is the number of observations
k = 1 + 3.322 log 80 = 7.32 ≈ 7
Class Size
i = R/k = 25.6/7 = 3.66 ≈ 4
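The arithmetic for the sulfur oxides example can be reproduced with a short Python sketch. It assumes the logarithm in Sturges' formula is base 10, which is what yields 7.32 for n = 80, and that the class size is rounded up so the classes cover the whole range. The function name `sturges_class_count` is chosen here for illustration.

```python
import math


def sturges_class_count(n: int) -> float:
    """Unrounded number of class intervals, k = 1 + 3.322 log10(n)."""
    return 1 + 3.322 * math.log10(n)


data_range = 31.8 - 6.2              # R = highest score - lowest score
k_exact = sturges_class_count(80)    # about 7.32
k = round(k_exact)                   # 7 class intervals

class_size_exact = data_range / k    # about 3.66
# Assumption: round up so that k classes of this size span the full range.
class_size = math.ceil(class_size_exact)

print(f"R = {data_range:.1f}")
print(f"k = {k_exact:.2f}, use {k}")
print(f"i = {class_size_exact:.2f}, use {class_size}")
```

Seven classes of size 4 span 28 units, which is enough to cover the range of 25.6.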
Frequency Distribution Table
EXAMPLES
1. Determine whether in the given situation, descriptive statistics or inferential statistics have been used.
a. A bowler finds his bowling average for the past 12 games.
b. A store manager predicts, based on previous years' sales, the sales performance of a company for the next five years.
c. A teacher determines the percentage of students who passed the examination.
d. A student computes his average monthly expenditure on school supplies for the past five months.
e. A politician estimates, based on an opinion poll, his chance of winning in the upcoming election.
2. Classify each variable as qualitative or quantitative.
a. outcome in tossing a coin
b. monthly salary of an employee
c. height of trees
d. subjects enrolled this term
e. year level
f. hourly output of a machine
g. speed of a car
h. student number
The results of an IQ test given to 48 students are shown in the table below.
Stud. Score Stud. Score Stud. Score Stud. Score Stud. Score Stud. Score Stud. Score Stud. Score
1 80 7 75 13 64 19 66 25 70 31 78 37 73 43 73
2 67 8 76 14 101 20 73 26 75 32 77 38 84 44 80
3 97 9 62 15 76 21 78 27 82 33 75 39 84 45 77
4 68 10 73 16 70 22 78 28 85 34 81 40 78 46 87
5 83 11 78 17 108 23 86 29 77 35 76 41 118 47 117
6 78 12 67 18 84 24 79 30 73 36 86 42 75 48 79
Raw data (scores only):
80 75 64 66 70 78 73 73
67 76 101 73 75 77 84 80
97 62 76 78 82 75 84 77
68 73 70 78 85 81 78 87
83 78 108 86 77 76 118 117
78 67 84 79 73 86 75 79
Array (scores in ascending order, reading down each column):
62 70 73 76 78 79 84 87
64 70 75 76 78 80 84 97
66 73 75 77 78 80 84 101
67 73 75 77 78 81 85 108
67 73 75 77 78 82 86 117
68 73 76 78 79 83 86 118
Range R = 118 – 62 = 56
No. of Class Intervals k = 1 + 3.322 log 48 = 6.6 ≈ 7
Size of Class Interval i = R/k = 56/6.6 = 8.5 ≈ 9
FREQUENCY DISTRIBUTION TABLE

Interval       f     x     rf (%)    lb       ub
 60 -  68      6     64    12.50     59.5     68.5
 69 -  77     17     73    35.42     68.5     77.5
 78 -  86     19     82    39.58     77.5     86.5
 87 -  95      1     91     2.08     86.5     95.5
 96 - 104      2    100     4.17     95.5    104.5
105 - 113      1    109     2.08    104.5    113.5
114 - 122      2    118     4.17    113.5    122.5
Total         48           100.00

<cf (using ub)        >cf (using lb)
<  59.5     0         >  59.5    48
<  68.5     6         >  68.5    42
<  77.5    23         >  77.5    25
<  86.5    42         >  86.5     6
<  95.5    43         >  95.5     5
< 104.5    45         > 104.5     3
< 113.5    46         > 113.5     2
< 122.5    48         > 122.5     0
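The columns of the frequency distribution table can be recomputed from the 48 raw scores. The sketch below is one possible way to do it, not the lecture's own procedure. It assumes integer scores, so each class runs from its lower limit to lower limit + size - 1 and the boundaries lie 0.5 beyond the limits. The function name `frequency_table` and the dictionary keys are chosen here for illustration.

```python
scores = [
    80, 75, 64, 66, 70, 78, 73, 73,
    67, 76, 101, 73, 75, 77, 84, 80,
    97, 62, 76, 78, 82, 75, 84, 77,
    68, 73, 70, 78, 85, 81, 78, 87,
    83, 78, 108, 86, 77, 76, 118, 117,
    78, 67, 84, 79, 73, 86, 75, 79,
]


def frequency_table(data, first_lower, class_size, class_count):
    """Build one row per class interval (assumes integer-valued data)."""
    n = len(data)
    rows = []
    counted_so_far = 0  # observations in all earlier classes
    for j in range(class_count):
        lower = first_lower + j * class_size
        upper = lower + class_size - 1
        f = sum(1 for value in data if lower <= value <= upper)
        rows.append({
            "interval": (lower, upper),
            "f": f,                              # frequency
            "x": (lower + upper) / 2,            # class mark
            "rf": round(100 * f / n, 2),         # relative frequency in %
            "lb": lower - 0.5,                   # lower boundary
            "ub": upper + 0.5,                   # upper boundary
            "less_cf": counted_so_far + f,       # observations below ub
            "greater_cf": n - counted_so_far,    # observations above lb
        })
        counted_so_far += f
    return rows


table = frequency_table(scores, first_lower=60, class_size=9, class_count=7)
for row in table:
    lower, upper = row["interval"]
    print(f"{lower:>3} - {upper:<3}  f={row['f']:>2}  x={row['x']:>5.1f}  "
          f"rf={row['rf']:>5.2f}  <cf={row['less_cf']:>2}  >cf={row['greater_cf']:>2}")
```

The less than cumulative frequency of a row is paired with its upper boundary and the greater than cumulative frequency with its lower boundary, matching the two cumulative tables above.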
[Figure: BAR GRAPH. Frequency by class interval: 60 - 68, 6; 69 - 77, 17; 78 - 86, 19; 87 - 95, 1; 96 - 104, 2; 105 - 113, 1; 114 - 122, 2]
[Figure: HISTOGRAM. Frequency (0 to 20) against class interval, 60 - 68 through 114 - 122]
[Figure: FREQUENCY POLYGON. Frequency (0 to 20) against class mark, 55 through 127]
[Figure: PIE GRAPH. 60 - 68, 13%; 69 - 77, 35%; 78 - 86, 40%; 87 - 95, 2%; 96 - 104, 4%; 105 - 113, 2%; 114 - 122, 4%]
OGIVE
[Figure: GREATER THAN - CUMULATIVE FREQUENCY. Cumulative frequency (0 to 60) against lower boundary of class interval, 59.5 through 122.5]
[Figure: LESS THAN - CUMULATIVE FREQUENCY. Cumulative frequency (0 to 60) against upper boundary of class interval, 59.5 through 122.5]
SUMMARY OF TERMS
1. Qualitative or categorical FDT – an FDT where the data are grouped according to some qualitative characteristics or into non-numerical categories.
2. Quantitative FDT – an FDT where the data are grouped according to some numerical or quantitative characteristics.
3. Class Boundaries – the average of the upper limit of one class and the lower limit of the next class; for example, the lower boundary of a class is the average of its lower limit and the upper limit of the previous class.
4. Class Marks (xi) – the midpoint of the class interval, about which the observations tend to cluster.
5. Relative Frequency – the proportion of observations falling in a class, expressed as a percentage.
6. Cumulative Frequency
a. less than cumulative frequency (<cf) – total number of observations less than the upper boundary of a class interval
b. greater than cumulative frequency (>cf) – total number of observations greater than the lower boundary of a class interval