Study vs. Experiment Observational Study Based on data in which no manipulation of factors has been...

Data Collection

Transcript of Study vs. Experiment Observational Study Based on data in which no manipulation of factors has been...

  • Slide 1
  • Slide 2
  • Study vs. Experiment. Observational study: based on data in which no manipulation of factors has been employed. Experiment: manipulates factors to create treatments, randomly assigns subjects to the treatments, and compares the responses of the subjects across treatment levels.
  • Slide 3
  • Study or Experiment? Researchers have linked an increase in the incidence of breast cancer in Italy to dioxin released by an industrial accident in 1976. The study identified 981 women who lived near the site of the accident and were under age 40 at the time. Fifteen of the women had developed breast cancer at an unusually young average age of 45. Medical records showed that they had heightened concentrations of dioxin in their blood and that each tenfold increase in dioxin level was associated with a doubling of the risk of breast cancer. Answer: observational study.
  • Slide 4
  • Study or Experiment? Is diet or exercise effective in combating insomnia? Some believe that cutting out desserts can help alleviate the problem, while others recommend exercise. Forty volunteers suffering from insomnia agreed to participate in a month-long test. Half were randomly assigned to a special no-desserts diet; the others continued desserts as usual. Half of the people in each of these groups were randomly assigned to an exercise program, while the others did not exercise. Those who ate no desserts and engaged in exercise showed the most improvement. Answer: experiment.
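The two-stage random assignment in the insomnia example can be sketched in a few lines. This is only an illustration: the volunteer labels and the function name `assign_two_factor` are made up, and the fixed seed is there just to make the run repeatable.

```python
import random

def assign_two_factor(subjects, seed=None):
    """Randomly split subjects in half by diet, then split each diet group in half by exercise."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)                      # chance alone decides the order
    half = len(shuffled) // 2
    diet_groups = {
        "no desserts": shuffled[:half],
        "desserts as usual": shuffled[half:],
    }
    groups = {}
    for diet, members in diet_groups.items():
        mid = len(members) // 2
        groups[(diet, "exercise")] = members[:mid]
        groups[(diet, "no exercise")] = members[mid:]
    return groups

# Hypothetical labels standing in for the 40 volunteers
volunteers = [f"volunteer_{i:02d}" for i in range(1, 41)]
groups = assign_two_factor(volunteers, seed=1)
for (diet, exercise), members in groups.items():
    print(diet, "/", exercise, ":", len(members), "subjects")
```

Each of the four treatment combinations ends up with 10 subjects, so every treatment is replicated 10 times.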
  • Slide 5
  • The Cycle of Statistics: Population, Sample, Statistic, Parameter.
  • Slide 6
  • Principles of Experimental Design. Control: control aspects of the experiment that we know may have an effect on the response, but that are not the factors being studied. Randomize: randomize to even out effects that we cannot control. Replicate: replicate over as many subjects as possible.
  • Slide 7
  • Types of Sampling. Random sample: simple random sample, stratified random sample, probability random sample.
  • Slide 8
  • Random Sampling: Simple Random Sample (SRS). Every member of the population has an equal chance of being chosen for the sample (more precisely, every possible sample of the given size is equally likely). Method: assign a random number to each individual in the sampling frame, then select only those whose random numbers satisfy some rule.
  • Slide 9
  • Simple Random Sample Example. There are 80 students enrolled in an introductory Statistics course; you are to select a sample of 5. Sampling frame: the roster of all students enrolled in the course. Label each student 01-80. Use a random number generator and choose the first 5 students from the list that match the random numbers. Ignore numbers not on the list and repeats.
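The labeling procedure in this example can be imitated as follows. This is a rough sketch, assuming the random number generator produces two-digit numbers from 00 to 99; the function name `srs_by_labels` and the seed are illustrative choices, not part of the slide.

```python
import random

def srs_by_labels(n_students=80, sample_size=5, seed=None):
    """Draw two-digit random numbers, ignoring labels not on the roster and repeats."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < sample_size:
        label = rng.randint(0, 99)             # a random two-digit number, 00-99
        if 1 <= label <= n_students and label not in chosen:
            chosen.append(label)               # keep only valid, new labels
    return chosen

sample = srs_by_labels(seed=2024)
print([f"{label:02d}" for label in sample])
```

In practice `random.sample(range(1, 81), 5)` gives the same kind of sample in one call; the loop above just mirrors the "ignore numbers not on the list and repeats" rule step by step.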
  • Slide 10
  • Stratified Random Sample. The population is divided into groups of similar individuals, called strata. An SRS is then taken within each stratum, and these are combined to form the overall sample.
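A stratified sample can be sketched as one SRS per stratum, combined at the end. The strata below (class years with made-up student labels) and the function name `stratified_sample` are hypothetical, chosen only to make the idea concrete.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Take an SRS of size per_stratum inside each stratum and combine the results."""
    rng = random.Random(seed)
    combined = []
    for name, members in strata.items():
        combined.extend(rng.sample(members, per_stratum))   # SRS within this stratum
    return combined

# Hypothetical strata: students grouped by class year
strata = {
    "sophomores": [f"So{i}" for i in range(1, 31)],
    "juniors": [f"Jr{i}" for i in range(1, 26)],
    "seniors": [f"Sr{i}" for i in range(1, 21)],
}
overall = stratified_sample(strata, per_stratum=2, seed=7)
print(overall)
```

Taking the same number from each stratum is only one option; sampling in proportion to stratum size is another common design.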
  • Slide 11
  • Probability Random Sample. A sample is chosen by chance. Each possible sample has a probability of being chosen, and we have to know this probability.
  • Slide 12
  • What is the population? What is the sample? Which random sample was used? A company packaging snack foods maintains quality control by randomly selecting 10 cases from each day's production and weighing the bags. Then they open one bag from each case and inspect the contents. Population: all snack foods produced at the company. Sample: 10 cases from each day's production. Random sample: SRS.
  • Slide 13
  • What is the population? What is the sample? Which random sample was used? Dairy inspectors visit farms unannounced and take samples of the milk to test for contamination. If the milk is found to contain dirt, antibiotics, or other foreign matter, the milk will be destroyed and the farm re-inspected until purity is restored. Population: all milk at the dairy (in the tank). Sample: a sample from the milk tank. Random sample: SRS.
  • Slide 14
  • Terminology. Experimental units: individuals on which the experiment is done. Subjects: human experimental units. Treatment: specific experimental condition applied. Control group: group that receives no treatment or a placebo. Placebo: a treatment known to have no effect. Factor: what is being manipulated. Response: what is being measured.
  • Slide 15
  • Analyzing Experiments: Aspirin Study. Subjects/Units: 1000 male volunteers. Factor: aspirin. Levels: low dose and none (placebo). Response: number of heart attacks. Treatment: aspirin. Control: a group will take a placebo pill. Blinding: patients do not know which pill they are taking. Randomization: the men will be randomly assigned to either the treatment group or the placebo group. Replication: each treatment will be replicated 500 times.
  • Slide 16
  • Slide 17
  • Types of Variables. Categorical: places an individual into one of several groups or categories. Ex: eye color, favorite food. Quantitative: takes a range of numeric values. Ex: height, weight, income. A quantitative variable is either discrete (a countable set of possible values; ex: number of goals in soccer) or continuous (any value in an interval; ex: height of males at Enloe).
  • Slide 18
  • What kind of variable? Does it make sense to average the values? Variables: gender; telephone area code; amount of electricity used; zip code; ticket sales at a Miley Cyrus concert; number of chicken eggs hatched on Nov. 17, 2006 at 3:00 am. Answers shown on the slide: Categorical; Quantitative (C); Categorical; Quantitative (D).
  • Slide 19
  • Every graph I ever make will always: have a title, have labeled axes, identify units, and include a legend (for categorical data).
  • Slide 20
  • Graphs for categorical data: Bar Chart. Bars never touch!
  • Slide 21
  • Graphs for categorical data: Pie Chart. Used for comparing parts to a whole.
  • Slide 22
  • Create a graph for the following: A survey was conducted of 1000 individuals regarding their favorite color. The results are as follows: Red 367, Yellow 100, Green 68, Blue 159, Purple 200, Grey 26, Pink 80.
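Before drawing a bar or pie chart of the survey, it can help to turn the counts into relative frequencies. The sketch below uses the counts from the slide; the text "bar chart" (one `#` per 10 responses) is only a stand-in for a real graph, which would still need a title, labeled axes, and units.

```python
# Counts from the favorite-color survey on the slide
counts = {"Red": 367, "Yellow": 100, "Green": 68, "Blue": 159,
          "Purple": 200, "Grey": 26, "Pink": 80}

total = sum(counts.values())
relative = {color: n / total for color, n in counts.items()}

# Print categories from most to least popular with a crude text bar
for color, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    bar = "#" * round(n / 10)
    print(f"{color:<7}{n:>5} {relative[color]:6.1%} {bar}")
```

The relative frequencies are the slice sizes a pie chart would show, and the counts are the bar heights for a bar chart.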
  • Slide 23
  • Data Representation of Favorite color survey
  • Slide 24
  • Graphs for Quantitative Variables. Dot plot: useful for small sets of data. Stem and leaf plot: useful for small sets of data; gives more information than a dot plot. Histograms. Box plots. More about these tomorrow!
  • Slide 25
  • Creating a Stem and Leaf Plot. Sort the data. Identify the min and max values to establish what kind of stems and leaves to use. If the rows of leaves become too long, split the stems. Create a legend.
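The steps for a stem and leaf plot can be sketched as below, borrowing the beginning-of-class pulse readings that appear later in this deck. The choice of tens as stems and units as leaves is an assumption that suits two- and three-digit data; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]}, using tens (and above) as stems and units as leaves."""
    plot = defaultdict(list)
    for v in sorted(values):                   # step 1: sort the data
        plot[v // 10].append(v % 10)
    lo, hi = min(plot), max(plot)              # step 2: min and max decide the stems
    return {stem: plot.get(stem, []) for stem in range(lo, hi + 1)}

# Beginning pulse readings from the class data (n = 23)
pulse = [50, 50, 55, 60, 62, 66, 72, 72, 76, 76, 76, 79,
         80, 80, 81, 82, 85, 87, 91, 96, 108, 110, 110]

rows = stem_and_leaf(pulse)
for stem, leaves in rows.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 7 | 2 means 72")                   # step 4: the legend
```

Stems with no observations are kept as empty rows so that gaps in the data stay visible.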
  • Slide 26
  • Back-to-back stem and leaf plots. Used for comparing two similar sets of data. Stems are in the middle; the leaves extend to the left for one data set and to the right for the other.
  • Slide 27
  • Histograms. Group nearby values and display frequencies. Example: National SAT scores, 2007.
  • Slide 28
  • How to Construct Histograms. Determine the bin size: divide the range into equal sections (minimum of 5 bins). Create a frequency table. Draw the graph.
  • Slide 29
  • Wake County 2008 SAT scores:
      1633 1590 1607 1622 1304 1394 1324
      1766 1514 1412 1680 1544 1378 1662
      1531 1646 1604 1472 1568 1541
    1. Sort the data. 2. Identify the range of the data. 3. Identify a bin size that makes sense and will produce at least 5 bins.
  • Slide 30
  • Relative Frequency Table. Use this table to help draw your histogram!
      Score         Count
      1300 - 1399
      1400 - 1499
      1500 - 1599
      1600 - 1699
      1700 - 1799
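One way to fill in the frequency table is to let a short script do the binning. The sketch assumes the run-together digits on the Wake County slide are twenty four-digit scores and uses the bin width of 100 implied by the table rows.

```python
from collections import Counter

# Wake County 2008 SAT scores, read as four-digit values (an assumption about the slide)
scores = [1633, 1590, 1607, 1622, 1304, 1394, 1324,
          1766, 1514, 1412, 1680, 1544, 1378, 1662,
          1531, 1646, 1604, 1472, 1568, 1541]

bin_width = 100
# Map each score to the lower edge of its bin, e.g. 1633 -> 1600
bin_counts = Counter((s // bin_width) * bin_width for s in scores)

table = []
for start in range(min(bin_counts), max(bin_counts) + bin_width, bin_width):
    count = bin_counts.get(start, 0)
    table.append((start, start + bin_width - 1, count, count / len(scores)))

print("Score        Count  Rel. freq.")
for low, high, count, rel in table:
    print(f"{low} - {high}  {count:>5}  {rel:>9.0%}")
```

The counts are the bar heights of the histogram; the relative frequencies give the same shape on a percentage scale.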
  • Slide 31
  • Slide 32
  • Graphs can be MISLEADING! Example: number of deaths in Iraq, as published by AOL News in March 2006.
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Describing Data. Shape: mound, symmetrical, skewed, single peak, multiple peaks. Outlier: any observation that appears not to belong with the others. Center: the middle of the data. Spread: min value to max value (including or excluding outliers).
  • Slide 37
  • Describing Graphs (Shape). Symmetric: the right and left sides of the histogram are approximately mirror images. Skewed right: the right side has a long tail or outliers. Skewed left: the left side has a long tail or outliers. Bimodal: there are 2 peaks. Uniform: there are about the same number of observations for each value.
  • Slide 38
  • Measures of Center. Median: the exact middle of an ordered set of data. Mean: the arithmetic average of all of the observations in a data set.
  • Slide 39
  • Ex: 1, 2, 3, 4, 5, 6, 7, 8, 9. What is the median? 5. What is the mean? What if 10 is added to the data set? What are the median and mean?
  • Slide 40
  • Resistant Measures. Def: a measure is resistant if it is not easily influenced by extreme observations. Is the median a resistant measure? Yes. Is the mean a resistant measure? No.
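The example data set 1 through 9 makes it easy to check these claims with Python's `statistics` module. The extreme value 1000 in the last step is a made-up observation, added only to show how differently the two measures react.

```python
from statistics import mean, median

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print("original:      median =", median(data), " mean =", mean(data))

with_ten = data + [10]                         # the slide's follow-up question
print("with 10 added: median =", median(with_ten), " mean =", mean(with_ten))

with_extreme = data + [1000]                   # hypothetical extreme observation
print("with 1000:     median =", median(with_extreme), " mean =", mean(with_extreme))
```

Swapping 10 for 1000 leaves the median where it was but drags the mean far above almost every observation, which is what "not resistant" means.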
  • Slide 41
  • Measures of Spread. Standard deviation: find this in your calculator under 1-variable stats! Quartiles and IQR (interquartile range) = Q3 - Q1: these are found in your 5-number summary! Range = max - min.
  • Slide 42
  • 5-Number Summary. Min. Q1: quartile 1, the median of the lower half. Median (Q2). Q3: quartile 3, the median of the upper half. Max.
  • Slide 43
  • Components of a Box Plot. 5-number summary: Min, Q1, Median, Q3, Max. Outliers: observations below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR).
  • Slide 44
  • [Figure: box plot labeled Min, Q1, Med, Q3, Max]
  • Slide 45
  • Where's the data? [Figure: box plot marked 25%, 50%, 25%]
  • Slide 46
  • What about outliers? [Figure: box plot labeled Min, Q1, Med, Q3, Max; the whiskers end at the smallest observation that is not an outlier and the largest observation that is not an outlier]
  • Slide 47
  • Slide 48
  • Standard Deviation. Gives a measure of how far the data vary from the mean on average. It is only used if the mean is the chosen measure of center. Is the standard deviation a resistant measure? No!
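To see what the calculator is doing, the standard deviation can be computed by hand and compared with Python's `statistics.stdev`. The sketch assumes the sample standard deviation (dividing by n - 1) is the one intended, and reuses the data set 1 through 9 from the measures-of-center example.

```python
import math
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation: square root of the sum of squared deviations over n - 1."""
    n = len(values)
    m = sum(values) / n                        # the mean is the center being measured from
    squared_deviations = sum((x - m) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
by_hand = sample_sd(data)
print(round(by_hand, 4), round(stdev(data), 4))
```

Because every deviation is measured from the mean and then squared, a single extreme observation inflates the result, which is why the standard deviation is not resistant.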
  • Slide 49
  • Beginning pulse in class (n = 23): 50 50 55 60 62 66 72 72 76 76 76 79 80 80 81 82 85 87 91 96 108 110 110. Min = 50, Q1 = 66, Median = 79, Q3 = 87, Max = 110.
  • Slide 50
  • End pulse in class (n = 24): 54 55 58 58 63 64 65 67 68 68 69 70 70 70 72 74 76 76 79 80 80 85 87 109. Min = 54, Q1 = 64.5, Median = 70, Q3 = 77.5, Max = 109.
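The summaries on the two pulse slides can be reproduced with the "median of the lower half / median of the upper half" rule from the 5-number summary slide. The sketch assumes that, when n is odd, the overall median is left out of both halves; other quartile conventions exist and can give slightly different values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For odd n, skip the middle observation so it belongs to neither half
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

begin_pulse = [50, 50, 55, 60, 62, 66, 72, 72, 76, 76, 76, 79,
               80, 80, 81, 82, 85, 87, 91, 96, 108, 110, 110]
end_pulse = [54, 55, 58, 58, 63, 64, 65, 67, 68, 68, 69, 70,
             70, 70, 72, 74, 76, 76, 79, 80, 80, 85, 87, 109]

begin_summary = five_number_summary(begin_pulse)
end_summary = five_number_summary(end_pulse)
print("beginning:", begin_summary)
print("end:      ", end_summary)
```

Both results agree with the Min, Q1, Median, Q3, and Max values listed on the slides.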
  • Slide 51
  • Outliers. Interquartile range (IQR) = Q3 - Q1. An observation is an outlier if it lies more than 1.5(IQR) above Q3 or more than 1.5(IQR) below Q1. End-of-class data: Q1 = 64.5, Q3 = 77.5, IQR = 77.5 - 64.5 = 13, 1.5(13) = 19.5. Q1 - 1.5(IQR) = 64.5 - 19.5 = 45. Q3 + 1.5(IQR) = 77.5 + 19.5 = 97.
  • Slide 52
  • Outliers. Any observation below 45 or above 97 will be an outlier, so 109 is an outlier. Data: 54 55 58 58 63 64 65 67 68 68 69 70 70 70 72 74 76 76 79 80 80 85 87 109.
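The 1.5(IQR) rule is simple enough to check by script. The sketch takes Q1 = 64.5 and Q3 = 77.5 from the end-of-class slide rather than recomputing them; the function name `outlier_fences` is illustrative.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the lower and upper fences Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

end_pulse = [54, 55, 58, 58, 63, 64, 65, 67, 68, 68, 69, 70,
             70, 70, 72, 74, 76, 76, 79, 80, 80, 85, 87, 109]

low, high = outlier_fences(64.5, 77.5)
outliers = [x for x in end_pulse if x < low or x > high]
inside = [x for x in end_pulse if low <= x <= high]

print("fences:", low, high)
print("outliers:", outliers)
print("whiskers end at:", min(inside), "and", max(inside))
```

The last line gives the smallest and largest observations that are not outliers, which is where the box plot whiskers stop.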
  • Slide 53
  • What is Normal? A bell-shaped curve. The Standard Normal distribution has mean = 0 and standard deviation = 1.
  • Slide 54
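  • 68-95-99.7 Rule. The normal curve can give us an idea of how extreme a value is based on how far away from the mean it is. [Figure: normal curve centered at the mean, with the horizontal axis in standard deviations from -3 to 3; 68% of values lie within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3]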
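The three percentages in the rule are rounded areas under the standard normal curve, which can be checked with `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is a quick numerical check, not a derivation.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)    # mean 0, standard deviation 1

# Area between -k and +k standard deviations, for k = 1, 2, 3
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} standard deviation(s) of the mean: {area:.1%}")
```

The same areas apply to any normal distribution once values are measured in standard deviations from the mean.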
  • Slide 55
  • Homework: P. 65 #12, 13. Make a graph (box plot for #12 and histogram for #13). Describe the shape. Find any outliers. Find the mean and median. Find the range, standard deviation, and IQR.