Statistics Chapter 2 Organizing Data


Chapter 2 Organizing Data

Quick Talk
Think of a situation where you need to organize data (any kind of data).

What can you do after you have collected and organized the data?

Answer
You can graph it, calculate the range and midpoint, find frequencies, and then analyze the data.

Please draw this frequency table in your notebook; we will be filling it out.

Frequency Table
A frequency table partitions data into intervals and shows how many data values are in each interval. The intervals are constructed so that each data value falls into exactly one interval.

Note: intervals are also known as classes. The book uses the word classes, but I use intervals because it makes more sense.

How do you create a frequency table?
Consider this situation: You are recording how many minutes each student studies for a particular class. You interviewed 50 students, and here is the chart.

15471053020184031124518415101925133416177161713101735762527183141248146214137154612918

How do you create a frequency table?
1) Determine how many intervals you want.
Between 5 and 15 is usually preferred.
With fewer than 5, you risk losing information.
With more than 15, the data might not be sufficiently summarized.
Let's use 6 intervals for this case. (Remember, you can use any number between 5 and 15.)
With this, you can find the width of each interval.

Finding the width of the interval
The lower interval limit is the lowest data value that can fit in an interval.

The upper interval limit is the highest data value that can fit in an interval.

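The interval width is the difference between the lower interval limit of one interval and the lower interval limit of the next interval. To choose the width, divide the spread of the data (largest value minus smallest value) by the number of intervals and round up to the next whole number.

In our case, the width is 8 and our lowest number is 1, so 1 + 8 = 9. Therefore, 9 is the start of the next interval. (Remember, we will have 6 intervals total.)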
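As a rough check of the arithmetic above, the sketch below computes the interval width and the starting number of each interval. The function names are illustrative, the highest data value of 47 is an assumption (only the lowest value, 1, is stated), and rounding up with `math.ceil` is one common convention rather than the only one.

```python
import math

def interval_width(lowest, highest, num_intervals):
    """Spread of the data divided by the number of intervals, rounded up."""
    return math.ceil((highest - lowest) / num_intervals)

def interval_starts(lowest, width, num_intervals):
    """Lower limit of each interval: lowest, lowest + width, and so on."""
    return [lowest + i * width for i in range(num_intervals)]

# lowest = 1 comes from the example; highest = 47 is an assumed value.
width = interval_width(1, 47, 6)
starts = interval_starts(1, width, 6)

print("width:", width)
print("interval starts:", starts)
```

Each upper interval limit is then one less than the start of the next interval.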

Activity
Find the starting number of each interval.

Answer
Start of 1st interval = 1
Start of 2nd interval = 9
Start of 3rd interval = 17
Start of 4th interval = 25
Start of 5th interval = 33
Start of 6th interval = 41
Start of 7th interval = 49 (used only to find the upper limit of the 6th interval)

Therefore, the interval limits are:

Lower interval limit    Upper interval limit
1                       8
9                       16
17                      24
25                      32
33                      40
41                      48

Now tally all the numbers that fall in each interval

Activity
Now tally up all the numbers that fall in each interval. Find the frequency also.

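Answer

Midpoint (within the interval)
The midpoint of an interval is the average of its lower and upper interval limits: (lower limit + upper limit) / 2.

Activity
Find the midpoint of each interval.

Answer
Lower interval limit    Upper interval limit    Interval midpoint
1                       8                       4.5
9                       16                      12.5
17                      24                      20.5
25                      32                      28.5
33                      40                      36.5
41                      48                      44.5

Finding interval boundaries
To find the upper interval boundary, add 0.5 to the upper interval limit.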

To find the lower interval boundary, subtract 0.5 from the lower interval limit.

Activity
Find the interval boundaries for all intervals.

Answer
Lower interval limit    Upper interval limit    Lower interval boundary    Upper interval boundary
1                       8                       0.5                        8.5
9                       16                      8.5                        16.5
17                      24                      16.5                       24.5
25                      32                      24.5                       32.5
33                      40                      32.5                       40.5
41                      48                      40.5                       48.5

Relative Frequency
The relative frequency of an interval is its frequency divided by the total number of data values.

Activity
Find the relative frequency of each interval.

Review
How to create a frequency table:
Determine how many intervals you want.
Find the interval width.
Determine the lower/upper interval limit for each interval.
Determine the lower/upper interval boundaries for each interval.
Do the tally and find the frequency (they are the same number).
Find the midpoint.
Find the relative frequency.
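The midpoint, boundary, and relative-frequency steps in the review can be sketched in a few lines of Python. The interval limits are the ones from the study-minutes example; the function names are illustrative, and the counts passed to `relative_frequencies` are made-up numbers used only to show the calculation, not the class results.

```python
# Interval limits (lower, upper) from the study-minutes example.
limits = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]

def midpoint(lower, upper):
    """Average of the lower and upper interval limits."""
    return (lower + upper) / 2

def boundaries(lower, upper):
    """Subtract 0.5 from the lower limit and add 0.5 to the upper limit."""
    return (lower - 0.5, upper + 0.5)

def relative_frequencies(frequencies):
    """Each frequency divided by the total number of data values."""
    total = sum(frequencies)
    return [f / total for f in frequencies]

for lower, upper in limits:
    print(lower, upper, midpoint(lower, upper), boundaries(lower, upper))

# Made-up counts, only to illustrate the division by the total.
print(relative_frequencies([1, 3]))
```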

Group activity: Now try to do this by yourself or with a partner.

These data represent blood glucose levels after a 12-hour fast for a random sample of 70 women. Use 6 intervals (classes).

45   66   83   71   76   64   59   59   76   82
80   81   85   77   82   90   87   72   79   69
83   71   87   69   81   76   96   83   67   94
101  94   89   94   73   99   93   85   83   80
78   80   85   83   84   74   81   70   65   89
70   80   84   77   65   46   80   70   75   45
101  71   109  73   73   80   72   81   63   74

Answer

Homework practice
Pg 46-47 #1-4 all, 5-10 (only do frequency table) (Will start in class if time permits)
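One way to check your work on the group activity is the sketch below, which builds the frequency table for the 70 glucose values. It assumes the first interval starts at the lowest value and that the width is the spread divided by 6, rounded up; the function name `frequency_table` is illustrative, and other starting points or widths are equally acceptable.

```python
import math

glucose = [
    45, 66, 83, 71, 76, 64, 59, 59, 76, 82,
    80, 81, 85, 77, 82, 90, 87, 72, 79, 69,
    83, 71, 87, 69, 81, 76, 96, 83, 67, 94,
    101, 94, 89, 94, 73, 99, 93, 85, 83, 80,
    78, 80, 85, 83, 84, 74, 81, 70, 65, 89,
    70, 80, 84, 77, 65, 46, 80, 70, 75, 45,
    101, 71, 109, 73, 73, 80, 72, 81, 63, 74,
]

def frequency_table(data, num_intervals):
    """Rows of (lower limit, upper limit, frequency, relative frequency).

    Assumes whole-number data and that the rounded-up width is wide
    enough for the last interval to reach the largest value.
    """
    lowest, highest = min(data), max(data)
    width = math.ceil((highest - lowest) / num_intervals)
    rows = []
    for i in range(num_intervals):
        lower = lowest + i * width
        upper = lower + width - 1
        freq = sum(1 for x in data if lower <= x <= upper)
        rows.append((lower, upper, freq, freq / len(data)))
    return rows

table = frequency_table(glucose, 6)
for lower, upper, freq, rel in table:
    print(f"{lower:>3}-{upper:<3}  f = {freq:>2}  relative f = {rel:.3f}")
```

Under these assumptions the width is 11 and the intervals run from 45-55 up to 100-110.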

Before we talk about how to graph a histogram, let's talk about the different shapes of a distribution.

Different distribution shapes

Distribution definitions
Mound-shaped symmetrical: This term refers to a histogram in which both sides are the same when the graph is folded vertically down the middle. (Normal curve)
Uniform or rectangular: These terms refer to a histogram in which every interval has equal frequency. From one point of view, a uniform distribution is symmetrical with the added property that the bars are of the same height.
Skewed left or skewed right: These terms refer to a histogram in which one tail is stretched out longer than the other. The direction of skewness is the side of the longer tail.
Bimodal: This term refers to a histogram in which the two classes with the largest frequencies are separated by at least one interval. The top two frequencies may have slightly different values.

Graphing a histogram
You use the frequency table to graph a histogram. (Use the example we did together in class about study minutes with 50 students.)

You use the lower/upper interval boundaries for the x-axis because you don't want any gaps.

Let's graph both a frequency histogram and a relative-frequency histogram.

This is what a frequency histogram looks like
This is what a relative-frequency histogram looks like

Activity
Compare the two graphs. What do you guys notice? What can you say about the distribution of the data?

Quick talk
If we were to construct a normal distribution curve or mound-shaped symmetrical histogram for IQ, Newton and Einstein would be considered outliers. What do you guys think outlier means?

What is an outlier?
Outliers are data values that are very different from other measurements in the data set.

Two types:

[figure] or [figure]

Cumulative Frequency
The cumulative frequency for an interval is the sum of the frequencies for that interval and all the previous intervals.
Example: Let's take a look at the class example again.

Ogive Graph
An ogive is a graph that displays cumulative frequencies.

Ogive graph of the example
So then what does this graph tell us?
Example: I can say that 31 students studied no more than 16 minutes, because it is cumulative.

Activity
Find the cumulative frequency and draw an ogive graph.

Answer
Ogive graph

Quick Talk
What can you conclude about 88 minutes?

Homework Practice
Pg 46-47 #6-10 (do cumulative frequency and draw ogive graph) (Will start in class if time permits)
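The running total behind an ogive can be sketched with `itertools.accumulate`. The frequencies below are an assumption: they come from tallying the 70 glucose values into six intervals of width 11 starting at 45, which is only one possible choice of intervals. An ogive plots each cumulative frequency against the upper boundary of its interval.

```python
from itertools import accumulate

# Assumed intervals 45-55, 56-66, ..., 100-110 for the glucose sample.
upper_boundaries = [55.5, 66.5, 77.5, 88.5, 99.5, 110.5]
frequencies = [3, 7, 22, 26, 9, 3]

# Each entry is the sum of that frequency and all previous ones.
cumulative = list(accumulate(frequencies))

for boundary, total in zip(upper_boundaries, cumulative):
    print(f"up to {boundary:>5}: {total} values")
```

The last cumulative frequency always equals the sample size, which is a quick way to catch a tally mistake.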

Are there other types of graphs?
Yes! There are bar graphs, circle graphs, and time-series graphs.

Bar Graph
Bars can be vertical or horizontal.
Bars are of uniform width and uniformly spaced.
The lengths of the bars represent values of the variable being displayed, the frequency of occurrence, or the percentage of occurrence. The same measurement scale is used for the length of each bar.
The graph is well annotated with a title, labels for each bar, and a vertical scale or actual value for the length of each bar.

Examples of bar graphs
Note: Look at the number where the y-axis starts.

You might see the graph with a squiggle on the changed axis. Sometimes, if a single bar is unusually long, the bar length is compressed with a squiggle in the bar itself. (Look at pg 51, example 2-11b, with the graph.)

Another example of bar graph

Activity
Use the info below to create a bar graph.

Average annual income (in thousands) of a household headed by a person with the stated education level is as follows: 16.1 for high school, 34.1 for high school graduates, 48.6 for associate degrees, 62.1 for bachelor's degrees, 71.0 for master's degrees, and 84.1 for doctoral degrees.

What can you conclude?

Pareto Chart
A Pareto chart is a bar graph in which the bar height represents the frequency of an event. In addition, the bars are arranged from left to right according to decreasing height.

Example of Pareto chart
Consider this situation:
Causes for lack of sleep (two-month study, 61 days)

Cause                     Frequency
Playing x-box or ps3      14
Texting                   9
Watching movie/TV         5
Talking on the phone      10
Doing homework/project    20
Other                     3
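The ordering step that turns this table into a Pareto chart can be sketched as follows. The text bars made of `#` characters are only a stand-in for a real graph, and the variable names are illustrative.

```python
# Causes for lack of sleep, from the table above (61 days total).
causes = {
    "Playing x-box or ps3": 14,
    "Texting": 9,
    "Watching movie/TV": 5,
    "Talking on the phone": 10,
    "Doing homework/project": 20,
    "Other": 3,
}

# A Pareto chart arranges the bars by decreasing frequency.
pareto = sorted(causes.items(), key=lambda item: item[1], reverse=True)

for cause, freq in pareto:
    print(f"{cause:<24} {'#' * freq} {freq}")
```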

Pareto Chart

Activity
Use the info below to create a Pareto chart.

Here is a list of the most commonly stolen items per 100,000 cases: 10.1 electronics; 15.6 jewelry; 7.3 cars; 20.4 cash; 26.7 identity.

What can you conclude?

Circle graph or Pie chart
In a circle graph or pie chart, wedges of a circle visually display proportional parts of the total population that share a common characteristic.

Example of Circle graph or pie chart
Consider the situation:
Monthly Financial Budget (based on $4000 monthly)

Categories    Amount spent    Fraction     Percentage    Degree of the pie
Food          800             800/4000     20%           0.2 * 360 = 72
Investment    500             500/4000     12.5%         0.125 * 360 = 45
Bills/debt    1750            1750/4000    43.75%        0.4375 * 360 = 157.5
Rent          950             950/4000     23.75%        0.2375 * 360 = 85.5
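The same conversion from amount spent to percentage and degrees can be sketched in Python, using the budget figures from the table. The variable names are illustrative; multiplying before dividing is just a choice that keeps the arithmetic exact for these numbers.

```python
# Monthly budget from the table above (total $4000).
budget = {"Food": 800, "Investment": 500, "Bills/debt": 1750, "Rent": 950}
total = sum(budget.values())

wedges = {}
for category, amount in budget.items():
    percent = amount * 100 / total   # share of the total, in percent
    degrees = amount * 360 / total   # size of the wedge, in degrees
    wedges[category] = (percent, degrees)
    print(f"{category:<12} {percent:>6}%  {degrees:>6} degrees")

# The wedges of a full circle should add up to 360 degrees.
print(sum(deg for _, deg in wedges.values()))
```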

Circle Graph or Pie Chart

Quick Talk
Is the chart consistent with our data?

Activity
Create a circle graph with the following info:

GameStop surveyed its first 500 customers to see what genre of games they bought: 70 fighting, 123 shooter, 150 action-adventure, 53 role-playing, 12 strategy, and 92 other.

What can you conclude?

Time-Series Graph
In a time-series graph, data are plotted in order of occurrence at regular intervals over a period of time.

Example of Time-series graph
Consider this situation:
Points Scored in a game (49er 2012)

Week      1    2    3    4    5    6    7    8    9
Points    30   27   13   34   45   3    13   24   0

Week      10   11   12   13   14   15   16   17
Points    24   32   31   13   27   41   13   27

Time Series Graph
What can you conclude about the graph?
Is there a pattern? Is there anything you can conclude?

Activity
Create a time-series graph from the following data. What can you conclude?

Week        1     2     3     4     5     6     7     8     9
Distance    1.5   1.4   1.7   1.6   1.9   2.0   1.8   2.0   1.9

Week        10    11    12    13    14    15    16    17    18
Distance    2.0   2.1   2.1   2.3   2.3   2.2   2.4   2.5   2.6

Determine Which Type of Graph to Use
Bar graphs are useful for quantitative or qualitative data.
Pareto charts identify the frequency of events or categories in decreasing order of frequency of occurrence.
Circle graphs display how a total is dispersed into several categories.
Time-series graphs display how data change over time.
Note: Make sure you provide a title, label the axes, and identify units of measure in all types of graphs!

Technology
You can create bar graphs, Pareto charts, circle graphs, and time-series graphs in PowerPoint or Word.
First open PowerPoint or Word.
At the top, press Insert and click on Chart.
Choose the chart you want and input the data.
TI-83/TI-84: You can create a time-series graph. Place consecutive values 1 through the number of time segments in list L1 and the corresponding data in L2. Press Stat Plot and highlight an xy line plot. (We will try this in class.)

Homework Practice
Pg 55-57 #1-12 (Will start in class if time permits)

Stem-and-leaf display
A stem-and-leaf display is a method of exploratory data analysis that is used to rank-order and arrange data into groups.

Why do we use stem-and-leaf instead of histogram?
Similarity: Both display frequency distributions.

Difference: In a histogram, we lose most of the specific data values (because of the intervals). A stem-and-leaf display is a device that organizes and groups data but allows us to recover the original data if desired.

Stem-and-leaf example
Write out all the numbers
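A stem-and-leaf display can be sketched in Python by splitting each value into a tens digit (the stem) and a ones digit (the leaf). The values below are the first 16 glucose readings from the earlier group activity, chosen only for illustration; the function name `stem_and_leaf` is an assumption, and the sketch handles whole numbers only.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens) to its sorted list of leaves (ones)."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    # Keep stems with no leaves so gaps in the data stay visible.
    return {s: groups.get(s, []) for s in range(min(groups), max(groups) + 1)}

values = [45, 66, 83, 71, 76, 64, 59, 59, 76, 82, 80, 81, 85, 77, 82, 90]
display = stem_and_leaf(values)

for stem, leaves in display.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Reading a row back, stem 7 with leaves 1 6 6 7 recovers the original values 71, 76, 76, and 77, which is exactly what a histogram cannot do.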

Activity
Put this chart into a stem-and-leaf display.
302712423547383627352217293210383241332645184318323132192133312829511232182126

Homework Practice
Pg 63-65 #1-9 even

Group Project (2 in a group)
Situation: You are to conduct a short survey or poll (school appropriate, and you have to interview at least 50 students) and represent your survey in a graph that we have learned. You are then to type a short 1-page report on the following:

What is the variable?
What method did you use to conduct your survey?
What are the advantages and disadvantages of your collection method?
Are there potential biases?
How did you try to create randomness?
What is your sample size (how many people total)?
What kind of sampling did you use?
Explain and label your graph.
What conclusion can you make about your result?
Can you apply your result to the entire population? Why or why not?

Sample survey/polling topics: movies, music, war, politics, clothing, pets, celebrities