MATH 2311 Introduction to Probability & Statistics ...


Page 1: MATH 2311 Introduction to Probability & Statistics ...

MATH 2311 Introduction to Probability & Statistics

Introduction

Cathy Poliak, Ph.D. [email protected]

Department of Mathematics, University of Houston

August 23, 2016

Cathy Poliak, Ph.D. [email protected] (Department of Mathematics, University of Houston) First Class August 23, 2016 1 / 32

Page 2: MATH 2311 Introduction to Probability & Statistics ...

Outline

1 Course Information and Introduction

2 Types of Data: Section 1.1

3 Types of Variables

4 Describing Data By Graphs


Page 3: MATH 2311 Introduction to Probability & Statistics ...

Course Information

Instructor: Dr. Cathy Poliak

Webpage: http://www.math.uh.edu/~cathy/

Course webpage: https://www.casa.uh.edu

Office: Fleming 11C

Office Hours: Tuesdays and Thursdays 4:00 - 6:30 pm; otherwise by appointment with 24 hours' advance notice

Email: [email protected]


Page 4: MATH 2311 Introduction to Probability & Statistics ...

Relevance of statistics

Statistics is used to gather and analyze data for any discipline. (This is Statistics: http://thisisstatistics.org)

Statistics is used to analyze surveys: http://www.gallup.com/home.aspx

Page 5: MATH 2311 Introduction to Probability & Statistics ...

What is Statistics?

Statistics is used to make intelligent decisions in a world full of uncertainty. "A knowledge of statistics provides the necessary tool to differentiate between sound statistical conclusions and questionable conclusions." (Business Statistics: Communicating with Numbers, Jaggia and Kelly, 2013, pg 4)

Statistics is the science of collecting, organizing, and interpreting numerical facts which we call data.

Page 6: MATH 2311 Introduction to Probability & Statistics ...

A young fellow from [Texas] had committed a grievous crime.

Page 7: MATH 2311 Introduction to Probability & Statistics ...

He had murdered his [horse].

In Texas there is no excuse for murdering a horse. If the jury finds you guilty there is only one punishment. Hanging.


Page 8: MATH 2311 Introduction to Probability & Statistics ...

On the day of his hanging the warden talked to the prisoner and said, “I am one of the few wardens that follows the law of 1889, and the law requires that I gather a random sample of 100 Texans.”

The warden dragged the prisoner to the auditorium and sure enough there were 100 Texans sitting there. The warden said, “The law requires that I now give you an hour to speak to these people.”

The prisoner said, “I have nothing to say.”


Page 9: MATH 2311 Introduction to Probability & Statistics ...

Whereupon a professor in the audience stood up and said, “I don’t think the people of Texas know enough about statistics. If you are not going to use the hour, would you mind if I use it to educate these people about statistics?”

The prisoner said, “No, go ahead.”

But the prisoner turned to the warden and said,

“BUT HANG ME FIRST!”


Page 10: MATH 2311 Introduction to Probability & Statistics ...

What Will Be Taught In This Course?

1. Statistical methods can be used to summarize or describe a collection of data; this is called descriptive statistics. This is useful in research, when communicating the results of experiments.

2. Some tools in statistics require the "chance" of an event happening. Thus we will also study probability.

3. In addition, patterns in the data may be modeled in a way that accounts for randomness and uncertainty in the observations, and are then used to draw inferences about the process or population being studied; this is called inferential statistics. Inference is a vital element of scientific advance, since it provides a prediction.

See calendar online.


Page 11: MATH 2311 Introduction to Probability & Statistics ...

Learning Objectives

The student will be able to:

Demonstrate the ability to compute basic descriptive statistics.

Interpret statistical data.

Understand statistical inference and interpretation.

Apply statistical concepts to actual data.


Page 12: MATH 2311 Introduction to Probability & Statistics ...

Assessments

Poppers 10%

Online Quizzes 10%

Homework 10%

3 Exams 45%

Final Exam 25%
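As a rough illustration of how these weights combine, here is a minimal R sketch. The weights come from the list above; the category averages for the student are hypothetical, and the sketch assumes each category is already expressed on a 0-100 scale.

```r
# Category weights from the Assessments slide (they sum to 100%).
weights <- c(poppers = 0.10, quizzes = 0.10, homework = 0.10,
             exams = 0.45, final = 0.25)

# Hypothetical category averages for one student, on a 0-100 scale.
scores <- c(poppers = 95, quizzes = 88, homework = 90,
            exams = 80, final = 84)

# Weighted course score: sum of weight * score, matched by name.
course_score <- sum(weights * scores[names(weights)])
print(course_score)
```

Matching the two vectors by name avoids errors if the categories are listed in a different order.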


Page 13: MATH 2311 Introduction to Probability & Statistics ...

Poppers

Starting in the third week of classes, each lecture will include a series of questions about that lecture. This requires buying a poppers package from the bookstore. Make sure that the package is for lecture 05 (section 15758).

You are required to fill in your ID number and popper number and blacken the correct circles. Make sure that your ID number and popper number are correct before turning in the popper at the end of the lecture. If these are not filled out correctly, or if the darkened circles are too light, you will not get credit for that day’s lecture even if you attended.

The total number of questions for the course will be counted, and 85% of that total will count as 100%. For example, if there are 5 questions each class for 24 classes, there are 120 questions in total, and your grade will be calculated out of 120(0.85) = 102 points.
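The popper arithmetic can be checked with a short R sketch. The 5 questions, 24 classes, and 85% figure come from the example above; the number of correct answers is hypothetical, and the slide does not say whether a grade above 100% is capped, so no cap is applied here.

```r
questions_per_class <- 5
classes <- 24

total_questions <- questions_per_class * classes   # 120 questions
full_credit <- 0.85 * total_questions              # 102 points count as 100%

# Hypothetical number of questions one student answered correctly.
correct <- 96

# Popper grade as a percentage of the 85% threshold (no cap assumed).
popper_grade <- 100 * correct / full_credit
print(round(popper_grade, 1))
```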

Page 14: MATH 2311 Introduction to Probability & Statistics ...

Quizzes

The quizzes are taken through the CourseWare website, under "online assignments".

All of the quizzes are open and will close every Saturday at 11:59 pm starting on September 3rd.

The lowest quiz grade will be dropped; if you miss a quiz, that will be your dropped quiz.

You may take each quiz up to 20 times.

There is a 90-minute time limit for each quiz.

Taking the quizzes until you get them right is essential to helping you do well on the exams.

There is a required Course Policy Quiz; you must score 100% on it before taking any other quizzes.

Page 15: MATH 2311 Introduction to Probability & Statistics ...

Homework

There are weekly assignments due every Wednesday starting on September 7th.

There are two parts to the weekly homework, and each will be graded separately at 15 points each.

- One part is written homework: problems from the textbook or assigned by the instructor.

- The second part is electronic multiple choice problems.

Both parts will be submitted on the CASA CourseWare website. See instructions on the course web page for how to upload the homework.

The two lowest homework scores will be dropped.

Page 16: MATH 2311 Introduction to Probability & Statistics ...

Exams

Exam        Chapters Covered   Dates
Exam 1      1, 2 and 3         September 17, 19, & 20
Exam 2      4, 5 and 6         October 27 - 29
Exam 3      7 and 8            December 1 - 3
Final Exam  Comprehensive      December 12 - 14

Page 17: MATH 2311 Introduction to Probability & Statistics ...

Instructions for Exams

All sections of Math 2311 take common exams.

All exams will be given in CASA, located on the second floor of Garrison, or in CBB; see the exam scheduler for details.

You can access the scheduler for these exams by logging into CourseWare.

The scheduler will be available approximately 2 weeks prior to the start of the exam cycle.

You must schedule a time; CASA will not allow you in without a scheduled time.

No make-up exams are given. If you miss an exam or score lower than you would like, the final exam score can replace that score.

There are practice exams available.

Page 18: MATH 2311 Introduction to Probability & Statistics ...

Textbook

The textbook, online quizzes, and additional help materials will be made available by logging into CourseWare at http://www.casa.uh.edu.

The first portion of these materials is freely available for the first two weeks of class.

All students must purchase a Course Access Code and enter it on CourseWare no later than September 6th to continue accessing the course learning materials.

A Course Access Code must be purchased for $55 from the University Bookstore.

Page 19: MATH 2311 Introduction to Probability & Statistics ...

Computer Software

Knowledge of a statistical package is an indispensable part of modern statistics. The class presentations, some homework assignments, and the exams are computer based.

The statistical package R-studio is used in this class for exploring statistical concepts and demonstrating statistical analysis of actual data useful for decisions. No previous knowledge of this software is assumed.

This software is a free package that you can download onto your personal computer.

If you want to put this on your computer, you will need to download R and R-studio.

- R: https://cran.cnr.berkeley.edu/

- R-studio: https://www.rstudio.com/

This will be available to you for your exams in CASA.

Page 20: MATH 2311 Introduction to Probability & Statistics ...

Other Information

This is a challenging course. Each student is responsible for his/her learning. If a section of the textbook and/or homework problem is puzzling you, it is your responsibility to make an appointment to see the instructor or a tutor as soon as possible. You are encouraged to ask questions during lectures and office hours. The following are the resources available to you for help in this course.

Instructor: You are always welcome in the instructor’s office for help. If the office hours are not convenient for you, just email the instructor to set up an appointment. Allow 24 hours in advance for appointments.

CASA Tutoring (http://www.casa.uh.edu/casa/): Garrison Gym 222; see the course webpage for tutoring times for 2311.

Page 21: MATH 2311 Introduction to Probability & Statistics ...

What is “Data”?

Statistics is the science of collecting, organizing, and interpreting numerical facts which we call data.

The facts and figures collected, analyzed, and summarized for presentation and interpretation.

Amount of your last purchase at a grocery store.

The number of times that you access a certain website.

Your name.


Page 22: MATH 2311 Introduction to Probability & Statistics ...

A Data Set: Course Grades From Last Year

Student  Score    Grade  Tests   Quiz    HW       Opt-out  Session
1        100.707  A      99.233  87.308  101.270  yes      Sp16
2        81.310   B      75      98.231  64.444   yes      Sp16
3        8.194    F      14.667  12.769  3.175    no       Sp16
4        90.449   A      91.533  77.231  82.222   yes      Sp16
5        68.461   D      65.783  81.769  68.571   no       Sp16
6        103.955  A      103.32  97.923  101.905  yes      Sp16
7        92.889   A      95.6    85.923  75.556   no       Sp16
8        84.805   B      83.2    79.385  75.238   yes      Sp16
9        91.640   A      89.967  91.231  85.079   yes      Sp16
10       22.316   F      17.433  40.615  44.444   no       Sp16
11       98.363   A      94.167  99.231  101.587  yes      Sp16
12       49.250   F      43.917  73.077  78.095   no       Sp16
13       16.967   F      15.5    20.077  29.841   no       Sp16
14       50.747   F      45.533  67.385  57.460   no       Sp16
15       43.184   F      72.983  47.462  38.413   no       Sp16
16       100.845  A      98.667  96.231  100.317  yes      Sp16
17       84.195   B      77.5    87.154  95.556   yes      Sp16
18       84.400   B      78.733  78.615  82.540   yes      Sp16
19       67.170   D      74.3    68.538  72.063   no       Fal15
20       87.413   B      92      82.077  77.778   yes      Fal15
21       67.899   D      71.8    71.077  84.127   no       Fal15
22       74.676   C      70.083  83.308  73.016   no       Fal15
23       40.054   F      44.133  21.308  33.333   no       Fal15
24       101.014  A      101.08  98.923  95.873   no       Fal15
25       11.972   F      17.1    10.385  3.810    no       Fal15
26       79.831   B      86.233  71.923  46.667   no       Fal15
27       83.301   B      94.6    69.692  60.317   no       Fal15
28       72.299   C      64.967  67.615  99.394   no       Sum16
29       83.821   B      77.2    80.923  83.030   yes      Sum16
30       90.703   A      83.617  87.923  80.000   no       Sum16

Page 23: MATH 2311 Introduction to Probability & Statistics ...

Types of data

Population Data is everything or everyone we want information about. It is a set of data that consists of all possible values pertaining to a certain set of observations or an investigation.

Sample Data is a subset of the population that we have information from. It is just a small section of the population taken for the purpose of investigation.
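The relationship between a population and a sample can be sketched in R with the base function sample(). The population below is a made-up roster of 40,000 student IDs (both the size and the ID format are assumptions for illustration); the sketch draws a simple random sample of 100 without replacement.

```r
set.seed(2311)  # arbitrary seed so the draw is reproducible

# Hypothetical population: a roster of 40,000 student IDs.
population <- sprintf("student_%05d", 1:40000)

# Sample: 100 IDs drawn at random, without replacement.
smp <- sample(population, size = 100)

length(smp)               # number of sampled students
all(smp %in% population)  # every sampled ID belongs to the population
```

Every member of the sample comes from the population, but the sample holds only a small fraction of it.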


Page 24: MATH 2311 Introduction to Probability & Statistics ...

Examples of Types of Data

Identify the population and the sample for each of the following:

University of Houston is interested in how many students buy used books as opposed to new ones. They randomly choose 100 students at the student center to interview.

- Population:

- Sample:

An elementary school is creating a new lunch menu. They send questionnaires to students with last names that begin with the letters M through R.

- Population:

- Sample:

Page 25: MATH 2311 Introduction to Probability & Statistics ...

Two Types of Variables

Go back to the example of grades. We have several variables: score, grade, tests, quiz, hw, opt-out, & session.

The variables grade, opt-out, & session are categorical variables. Categorical variables place a case into one of several groups or categories.

The variables score, tests, quiz & hw are quantitative variables. Quantitative variables take numerical values for which arithmetic operations such as adding and averaging make sense.
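One way to see this distinction in R is to store categorical variables as factors and quantitative variables as numeric vectors. The sketch below uses only the first five rows of the course-grades data set and a subset of its columns; the data frame name grades and the column name OptOut are choices made here for illustration.

```r
# First five students from the course-grades data set (subset of columns).
grades <- data.frame(
  Student = 1:5,
  Score   = c(100.707, 81.310, 8.194, 90.449, 68.461),
  Grade   = factor(c("A", "B", "F", "A", "D"),
                   levels = c("A", "B", "C", "D", "F")),
  Tests   = c(99.233, 75, 14.667, 91.533, 65.783),
  OptOut  = factor(c("yes", "yes", "no", "yes", "no")),
  Session = factor(rep("Sp16", 5))
)

str(grades)  # shows which columns are numeric and which are factors

# TRUE for columns stored as numbers, FALSE for factors.
is_quantitative <- sapply(grades, is.numeric)
print(is_quantitative)

mean(grades$Score)   # averaging makes sense for a quantitative variable
table(grades$Grade)  # counting per category suits a categorical variable
```

Note that Student is stored as a number but is only an identifier, so the storage type alone does not decide whether a variable is quantitative.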


Page 26: MATH 2311 Introduction to Probability & Statistics ...

Two Types of Quantitative Variables

Quantitative variables can be classified as either discrete or continuous.

Discrete quantitative variables - a countable set of values.

Continuous quantitative variables - data that can take on any values within some interval.

Page 27: MATH 2311 Introduction to Probability & Statistics ...

Examples of Variables

Classify the following variables as categorical or quantitative. If quantitative, state whether the variable is discrete or continuous.

Political preference.

Number of siblings.


Page 28: MATH 2311 Introduction to Probability & Statistics ...

Examples of Variables Part 2

Classify the following variables as categorical or quantitative. If quantitative, state whether the variable is discrete or continuous.

Blood type.

Height of men on a professional basketball team.

Time it takes to be on hold when calling the IRS at tax time.


Page 29: MATH 2311 Introduction to Probability & Statistics ...

Describing Data By Graphs

Graphs are an easy and quick way to describe the data.

The type of graph that we use depends on the type of data that we have.

Graphs for categorical variables.

- Bar graphs: Each bar represents a category, and the height of each bar represents either the count or the percent.

- Pie charts: Help us see what part of the whole each group forms.

Graphs for quantitative variables.

- Dotplot

- Stemplot

- Histogram

- Boxplot
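The four graphs for quantitative variables can each be drawn with a base R function. The sketch below is one possible way to do it, not necessarily the commands used in class; it uses the Score values of the first ten students in the course-grades data set.

```r
# Score values for the first ten students in the course-grades data set.
score <- c(100.707, 81.310, 8.194, 90.449, 68.461,
           103.955, 92.889, 84.805, 91.640, 22.316)

# Dotplot: one point per observation, stacked when values coincide.
stripchart(score, method = "stack", pch = 19, main = "Dotplot of Score")

# Stemplot: printed as text in the console rather than drawn.
stem(score)

# Histogram: counts of observations falling in each interval.
hist(score, main = "Histogram of Score", xlab = "Score")

# Boxplot: median, quartiles, and any outlying points.
boxplot(score, horizontal = TRUE, main = "Boxplot of Score")
```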


Page 30: MATH 2311 Introduction to Probability & Statistics ...

Bar Graph of Letter Grades

[Figure: bar graph of letter-grade counts. Categories on the horizontal axis: A, B, C, D, F. Vertical axis ticks: 0, 2, 4, 6, 8.]

Page 31: MATH 2311 Introduction to Probability & Statistics ...

Pie Chart of Letter Grades

[Figure: pie chart of letter grades with slices labeled A, B, C, D, F.]

Page 32: MATH 2311 Introduction to Probability & Statistics ...

R code

For bar graph:

> plot(datasetname$variablename)

For pie chart:

> counts <- table(grades$Grade)

> pie(counts)
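Calling plot() on a variable draws a bar graph only when that variable is stored as a factor. Since R 4.0, read.csv() keeps text columns as character vectors by default, so a sketch that tabulates the categories first may be more robust. The vector below holds the letter grades of the first ten students in the course-grades data set; the variable names are chosen here for illustration.

```r
# Letter grades for the first ten students in the course-grades data set.
grade <- c("A", "B", "F", "A", "D", "A", "A", "B", "A", "F")

# Count each category; fixing the levels keeps C in the table with count 0.
counts <- table(factor(grade, levels = c("A", "B", "C", "D", "F")))
print(counts)

# Bar graph of the counts; works whether the data are character or factor.
barplot(counts, main = "Bar Graph of Letter Grades",
        xlab = "Grade", ylab = "Count")

# Pie chart; empty categories are dropped so they do not add stray labels.
pie(counts[counts > 0], main = "Pie Chart of Letter Grades")
```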