Introductory Statistics with R

UCLA Department of Statistics
Statistical Consulting Center

Presented by Kekona Sorenson [email protected]
Prepared by: Mine Çetinkaya [email protected]

November 9, 2010


Outline

1 Preliminaries

2 Data sets

3 Descriptive Statistics

4 Probability Models

5 Hypothesis Testing and Confidence Intervals

6 Linear Regression

7 Online Resources for R

8 Upcoming Mini-Courses

9 Exercises

1 Preliminaries

Software Installation

Installing R on a Mac

1 Go to http://cran.r-project.org/ and select MacOS X.

2 Select to download the latest version: 2.11.0 (2010-04-22).

3 Install and open. The R window should look like this:

[Figure: screenshot of the R window]


R Help

For help with any function in R, put a question mark before the function name to find out which arguments it takes and to see examples and background information.

?plot

2 Data sets

Loading data into R

Loading a data set into R:

survey = read.table("http://www.stat.ucla.edu/~mine/students_survey_2008.txt", header = TRUE, sep = "\t")

Displaying the dimensions of the data set:

dim(survey)
[1] 1325   29
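
The role of header and sep can be tried without a network connection. The sketch below is only an illustration: the three-column table and its values are made up, and toy is a hypothetical name, not part of the survey data.

```r
# A tiny tab-separated table standing in for the survey file (made-up values)
txt <- "gender\thand\tageinmonths\nfemale\tleft\t228\nmale\tright\t235\n"

# header = TRUE takes the column names from the first line;
# sep = "\t" splits each line on tab characters
toy <- read.table(text = txt, header = TRUE, sep = "\t")

dim(toy)   # number of rows, then number of columns
str(toy)   # column types at a glance
```

Here dim(toy) reports 2 rows and 3 columns, in the same order as the dimensions of the survey data above.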


Viewing data sets in R

Displaying the first 3 rows and 5 columns of the data set:

survey[1:3, 1:5]
  gender  hand eyecolor glasses california
1 female  left    hazel     yes        yes
2   male right    brown      no         no
3 female right    brown     yes        yes

Displaying the variable names in the data set:

names(survey)
 [1] "gender"     "hand"       "eyecolor"    "glasses"     "california"
 [6] "birthmonth" "birthday"   "birthyear"   "ageinmonths" "height"
[11] "graduate"   "oncampus"   "time"        "walk"        "hsclass"
...


Attaching / detaching data frames in R

Attaching the variables in a data set:

attach(survey)
The following object(s) are masked from package:datasets :

    sleep

The warning tells us that we have attached a data frame that contains a column whose name is sleep. If you type:

sleep

the object with that name in the data frame will be seen before another object with the same name that is lower in the search() path. Thus, your object is “masking” the other.

To detach a data frame, i.e. remove it from the search() path of available R objects, use detach(), but we won’t do that now:

detach(survey)
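
Masking can be avoided altogether by not attaching. The following is a minimal sketch on a made-up two-row data frame (toy is a hypothetical stand-in for survey); it refers to columns with $ and with(), and shows how attach() and detach() change the search() path.

```r
# Hypothetical data frame standing in for survey (values are made up)
toy <- data.frame(gender = c("female", "male"), sleep = c(7, 6))

# Refer to a column without attaching anything
mean(toy$sleep)
with(toy, mean(sleep))

# attach() adds the data frame to the search path; detach() removes it again
attach(toy)              # prints a masking message for sleep
"toy" %in% search()      # TRUE while attached
detach(toy)
"toy" %in% search()      # FALSE after detaching
```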

3 Descriptive Statistics

Variable classes

Displaying the class of a variable:

class(instructor)
[1] "factor"

Changing the class of a variable:

instructor = as.character(instructor)
class(instructor)
[1] "character"
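
The same conversion can be tried on a small vector. This is a sketch with made-up instructor names; instructor_chr is a hypothetical name used so that the original factor is not overwritten.

```r
# Made-up instructor names stored as a factor
instructor <- factor(c("Smith", "Lee", "Smith"))
class(instructor)
levels(instructor)      # the distinct categories, in alphabetical order

# Convert to plain character strings
instructor_chr <- as.character(instructor)
class(instructor_chr)
```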


Displaying categorical data

Tables

Tables are useful for displaying the distribution of categorical variables.

table(gender)
gender
female   male
   882    443


Contingency tables

Contingency tables display two categorical variables at a time.

table(gender, hand)
        hand
gender   ambidextrous left right
  female            9   67   806
  male             11   45   387
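
Counts are often easier to compare as totals and row proportions. The sketch below rebuilds the table from the counts printed above (so it runs without the survey file) and applies addmargins() and prop.table(); the object name tab is an assumption made for illustration.

```r
# Rebuild the gender-by-hand table from the counts shown above
tab <- matrix(c(9, 67, 806,
                11, 45, 387),
              nrow = 2, byrow = TRUE,
              dimnames = list(gender = c("female", "male"),
                              hand = c("ambidextrous", "left", "right")))
tab <- as.table(tab)

addmargins(tab)                         # add row and column totals
round(prop.table(tab, margin = 1), 3)   # proportions within each gender
```

With margin = 1 each row sums to 1, so the handedness distributions of females and males can be compared directly.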


Frequency bar plots

Display counts of each category next to each other for easy comparison.

barplot(table(gender), main = "Barplot of Gender")

[Figure: bar plot titled "Barplot of Gender" with bars for female and male; the vertical axis shows counts.]


Relative frequency bar plots

Display relative proportions of each category.

barplot(table(gender)/length(gender), main = "Relative Frequency \n Barplot of Gender")

[Figure: bar plot titled "Relative Frequency Barplot of Gender" with bars for female and male; the vertical axis shows proportions from 0.0 to 0.6.]


Segmented bar charts

Display two categorical variables at a time.

barplot(table(gender, hand), col = c("skyblue", "blue"), main = "Segmented Bar Plot \n of Gender")
legend("topleft", c("females", "males"), col = c("skyblue", "blue"), pch = 16, inset = 0.05)

[Figure: segmented bar plot titled "Segmented Bar Plot of Gender" with one bar each for ambidextrous, left and right, split into females and males; the vertical axis shows counts from 0 to 800.]


Pie charts

Pie charts display counts as percentages of individuals in each category.

pct = round(table(gender) / length(gender) * 100)
lbls = paste(names(table(gender)), "\n", "%", pct)
pie(table(gender), labels = lbls)

[Figure: pie chart with two slices labeled "female % 67" and "male % 33".]


Displaying quantitative data

Histograms

Display the number of cases in each bin.

hist(ageinmonths, main = "Histogram of Age in Months")

[Figure: histogram titled "Histogram of Age in Months"; horizontal axis ageinmonths from 200 to 350, vertical axis Frequency from 0 to 400.]


Relative frequency histograms

Display the proportion of cases in each bin, scaled as a density so that the total area of the bars is 1.

hist(ageinmonths, freq = FALSE, main = "Relative Frequency \n Histogram of Age in Months", xlab = "Age in Months")

[Figure: histogram titled "Relative Frequency Histogram of Age in Months"; horizontal axis Age in Months from 200 to 350, vertical axis Density from 0.000 to 0.030.]
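
The vertical axis of this plot is labeled Density rather than proportion. A minimal sketch of the difference, on simulated values (not the survey data), uses the object that hist() returns when plot = FALSE:

```r
set.seed(1)                            # arbitrary seed, for repeatability
x <- rnorm(200, mean = 238, sd = 16)   # simulated ages in months

h <- hist(x, plot = FALSE)             # compute the bins without drawing

# Density times bin width gives the proportion of cases in each bin
prop_from_density <- h$density * diff(h$breaks)
prop_from_counts <- h$counts / sum(h$counts)

sum(prop_from_density)                 # total area of the bars
all.equal(prop_from_density, prop_from_counts)
```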


Stem-and-Leaf Plots

Preserve individual data values.

stem(ageinmonths)

The decimal point is 1 digit(s) to the right of the |

20 | 48

21 | 004444555566666666666666666777777777778888888888889999999999999999

22 | 00000000000000000000000000111111111111111122222222222222222222333333+258

23 | 00000000000000000000000000000000000000000000001111111111111111111111+379

24 | 00000000000000000000000000000000000000000000111111111111111111111111+170

25 | 00000000000001111111111111112222222222222222222223333333344444444445+24

26 | 000000000001111111111222222333334444444444556666778889

27 | 00111222222344566789

28 | 01334558888

29 | 0004569

30 | 267

31 | 02257

32 | 44

33 | 5

34 | 89

35 | 3


Boxplots

boxplot(ageinmonths, main = "Boxplot of Age in Months")

[Figure: boxplot titled "Boxplot of Age in Months"; the axis runs from 200 to 350.]

Five Number Summary (Min, Q1, Median, Q3, Max):

fivenum(ageinmonths)
[1] 204 228 235 243 353


Describing distributions numerically

Summary

Categorical variables:

summary(hand)
ambidextrous         left        right
          20          112         1193

Quantitative variables:

summary(ageinmonths)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  204.0   228.0   235.0   237.8   243.0   353.0


Measures of center

Mean (arithmetic average):

mean(ageinmonths)
[1] 237.8309

Median (value that divides the histogram into two equal areas):

median(ageinmonths)
[1] 235

Mode (the most frequent value), for discrete data:

as.numeric(names(sort(table(ageinmonths), decreasing = TRUE))[1])
[1] 228
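
The one-liner above reports a single value even when several values are tied for the highest count. The helper below is a hypothetical sketch (modes is not a base R function) that returns every tied value; like names(table(x)), it returns them as character strings.

```r
# Hypothetical helper: return every value tied for the highest count
modes <- function(x) {
  counts <- table(x)                      # frequency of each distinct value
  names(counts)[counts == max(counts)]    # keep the most frequent ones
}

modes(c(228, 230, 228, 235, 230))   # "228" and "230" each occur twice
```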


Mode (alternative)

To find the mode, you may also use the Mode function in the prettyR package.

install.packages("prettyR")
library(prettyR)
Mode(ageinmonths)
[1] "228"


Adding measures to plots

Adding mean and median to a histogram.

hist(ageinmonths, main = "Histogram of Age in Months")
abline(v = mean(ageinmonths), col = "blue")
abline(v = median(ageinmonths), col = "green")
legend("topright", c("Mean", "Median"), pch = 16, col = c("blue", "green"))

[Figure: histogram titled "Histogram of Age in Months" with vertical lines at the mean (blue) and the median (green); horizontal axis ageinmonths from 200 to 350, vertical axis Frequency from 0 to 400.]


Measures of spread

Range (Min, Max):

range(ageinmonths)
[1] 204 353

IQR:

IQR(ageinmonths)
[1] 15

Standard deviation:

sd(ageinmonths)
[1] 16.03965
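
How these measures relate to other base R functions can be checked on a handful of numbers. The values below are made up for illustration and are not the survey data.

```r
x <- c(204, 228, 235, 243, 353)   # made-up values

diff(range(x))                    # the range as a single number: max - min

# IQR is the distance between the 75th and 25th percentiles
IQR(x)
unname(diff(quantile(x, c(0.25, 0.75))))

# The standard deviation is the square root of the variance
sd(x)
sqrt(var(x))
```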

4 Probability Models

Geometric distribution

If the probability of success is 0.35, what is the probability that the first success will be on the 5th trial?

dgeom(4, 0.35)
[1] 0.06247719

The first argument is 4 because dgeom counts the number of failures before the first success, and a first success on the 5th trial means 4 failures.

Note: dgeom gives the density (or probability mass function for discrete variables), pgeom gives the distribution function, qgeom gives the quantile function, and rgeom generates random deviates. The same d, p, q, r naming holds for the functions used for Binomial, Poisson and Normal calculations as well.
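
The four function families can be compared side by side. This is a small sketch for the geometric case with p = 0.35; the seed is arbitrary and only makes the random draws repeatable.

```r
p <- 0.35

# dgeom(k, p) is P(k failures before the first success) = (1 - p)^k * p
dgeom(4, p)
(1 - p)^4 * p            # same value by hand

# pgeom(k, p) accumulates the mass function: P(at most k failures)
pgeom(4, p)
sum(dgeom(0:4, p))       # same value

# qgeom inverts pgeom; rgeom draws random values
qgeom(0.5, p)
set.seed(1)
rgeom(5, p)
```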


Binomial distribution

If the probability of success is 0.35, what is the probability of

3 successes in 5 trials?

dbinom(3, 5, 0.35)
[1] 0.1811469

at least 3 successes in 5 trials?

sum(dbinom(3:5, 5, 0.35))
[1] 0.2351694
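
The "at least 3" probability can also be obtained from the cumulative distribution function, which avoids summing individual terms. A minimal sketch:

```r
# P(X >= 3) = 1 - P(X <= 2)
1 - pbinom(2, 5, 0.35)

# Same probability, asking pbinom for the upper tail directly
pbinom(2, 5, 0.35, lower.tail = FALSE)
```

Both lines agree with sum(dbinom(3:5, 5, 0.35)).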


Poisson distribution

The number of traffic accidents per week in a small city has a Poisson distribution with mean equal to 3. What is the probability of

two accidents in a week?

dpois(2, 3)
[1] 0.2240418

at most one accident in a week?

sum(dpois(0:1, 3))
[1] 0.1991483
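
As with the binomial, the cumulative function gives "at most" probabilities in one call. The sketch below also writes out the Poisson formula for the two terms involved; lambda is just a name chosen here for the mean.

```r
lambda <- 3

ppois(1, lambda)               # P(X <= 1)
exp(-lambda) * (1 + lambda)    # same: e^(-3) * (3^0/0! + 3^1/1!)

1 - ppois(1, lambda)           # P(at least two accidents in a week)
```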


Normal distribution

Scores on an exam are distributed normally with a mean of 65 and a standard deviation of 12. What percentage of the students have scores

below 50?

pnorm(50, 65, 12)
[1] 0.1056498

between 50 and 70?

pnorm(70, 65, 12) - pnorm(50, 65, 12)
[1] 0.5558891
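
The same answers follow from standardizing the scores first. A short sketch, where z is the z-score of 50 under the stated mean and standard deviation:

```r
z <- (50 - 65) / 12               # z-score of 50, which is -1.25
pnorm(z)                          # same as pnorm(50, 65, 12)

# The "between" probability as one vectorized call
diff(pnorm(c(50, 70), 65, 12))
```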


Normal distribution (cont.)

What is the 90th percentile of the score distribution?

qnorm(0.90, 65, 12)
[1] 80.37862
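
qnorm is the inverse of pnorm, which can be checked directly. In this sketch q90 is a name introduced only for illustration.

```r
q90 <- qnorm(0.90, mean = 65, sd = 12)
q90

65 + 12 * qnorm(0.90)    # same value from the standard normal quantile
pnorm(q90, 65, 12)       # recovers 0.90
```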

5 Hypothesis Testing and Confidence Intervals

One sample means

Hypothesis testing for one sample means

Is there evidence to suggest that the average age in months for Stats 10 students is more than 235 months? Use α = 0.05.

sample100 = sample(1:1325, 100, replace = FALSE)
survey.sub = survey[sample100, ]
t.test(survey.sub$ageinmonths, alternative = "greater", mu = 235, conf.level = 0.95)

        One Sample t-test

data:  survey.sub$ageinmonths
t = 1.5922, df = 99, p-value = 0.05726
alternative hypothesis: true mean is greater than 235
95 percent confidence interval:
 234.9118      Inf
sample estimates:
mean of x
   237.06

Since the p-value (0.05726) is greater than α = 0.05, we fail to reject the null hypothesis at this level. Because sample() draws at random, the numbers will differ from run to run.
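
The t statistic and p-value reported by t.test can be reproduced by hand. The sketch below uses simulated ages rather than the survey sample (the seed, mean and standard deviation are arbitrary assumptions), so its numbers will not match the output above.

```r
set.seed(42)                           # arbitrary seed, for repeatability
x <- rnorm(100, mean = 237, sd = 16)   # simulated ages, NOT the survey sample
mu0 <- 235                             # hypothesized mean

# t = (sample mean - hypothesized mean) / standard error
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(length(x)))

# Upper-tail probability, matching alternative = "greater"
p_val <- pt(t_stat, df = length(x) - 1, lower.tail = FALSE)

res <- t.test(x, alternative = "greater", mu = mu0)
c(by_hand = t_stat, t.test = unname(res$statistic))
c(by_hand = p_val, t.test = res$p.value)
```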


Confidence intervals for one sample means

The t.test function prints out a confidence interval as well.

However, this function returns a one-sided interval when the alternative is "greater" or "less".

When alternative = "greater" is chosen, the lower confidence bound is calculated and the upper bound is given as Inf by default.

When alternative = "less" is chosen, the upper confidence bound is calculated and the lower bound is given as -Inf by default.

When alternative = "two.sided" is chosen, both the upper and the lower confidence bounds are calculated.


Confidence intervals for one sample means (cont.)

t.test(survey.sub$ageinmonths, alternative = "two.sided", mu = 235, conf.level = 0.90)

        One Sample t-test

data:  survey.sub$ageinmonths
t = 1.5922, df = 99, p-value = 0.1145
alternative hypothesis: true mean is not equal to 235
90 percent confidence interval:
 234.9118 239.2082
sample estimates:
mean of x
   237.06

Note that we changed the confidence level to 0.90 in order to correspond to a one-sided hypothesis test with α = 0.05.
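
The correspondence between the two settings can be seen by comparing the bounds directly. This sketch again uses simulated data (arbitrary seed and parameters, not the survey sample); one_sided and two_sided are names chosen for illustration.

```r
set.seed(42)
x <- rnorm(100, mean = 237, sd = 16)   # simulated data

one_sided <- t.test(x, alternative = "greater", mu = 235, conf.level = 0.95)
two_sided <- t.test(x, alternative = "two.sided", mu = 235, conf.level = 0.90)

one_sided$conf.int[1]   # lower bound of the one-sided 95% interval
two_sided$conf.int[1]   # lower bound of the two-sided 90% interval
one_sided$conf.int[2]   # Inf: no upper bound is computed
```

Both lower bounds use the same critical value, the 0.95 quantile of the t distribution, which is why they coincide.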


Confidence intervals for one sample means (cont.)

Alternative calculation of confidence interval:

onesample.mean.ci = function(x, conf.level) {
  # critical value t* for a two-sided interval
  tstar = -qt(p = (1 - conf.level) / 2, df = length(x) - 1)
  xbar = mean(x)
  # standard error of the sample mean
  sexbar = sd(x) / sqrt(length(x))
  cilower = xbar - tstar * sexbar
  ciupper = xbar + tstar * sexbar
  return(c(cilower, ciupper))
}
onesample.mean.ci(survey.sub$ageinmonths, 0.90)
[1] 234.9118 239.2082


Two sample means

Hypothesis testing and CI for two sample means

Is there a difference between the ages of females and males? Construct a 95% confidence interval for the difference between the average ages of females and males.

t.test(survey.sub$ageinmonths[survey.sub$gender == "female"], survey.sub$ageinmonths[survey.sub$gender == "male"], alternative = "two.sided", conf.level = 0.95)

        Welch Two Sample t-test

data:  survey.sub$ageinmonths[survey.sub$gender == "female"] and survey.sub$ageinmonths[survey.sub$gender == "male"]
t = 1.25, df = 95.736, p-value = 0.2143
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.765100  7.768572
sample estimates:
mean of x mean of y
 238.1406  235.1389
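
When both variables live in one data frame, t.test also accepts a formula, which is shorter than subsetting twice. The sketch below runs both forms on a made-up data frame (toy, with simulated ages and an arbitrary 60/40 split) and shows that they agree.

```r
set.seed(1)
# Hypothetical data: 60 females and 40 males with simulated ages
toy <- data.frame(
  gender = rep(c("female", "male"), times = c(60, 40)),
  ageinmonths = round(rnorm(100, mean = 237, sd = 16))
)

# Two vectors, as on the slide
by_vectors <- t.test(toy$ageinmonths[toy$gender == "female"],
                     toy$ageinmonths[toy$gender == "male"])

# Formula interface: response ~ grouping variable
by_formula <- t.test(ageinmonths ~ gender, data = toy)

c(by_vectors$p.value, by_formula$p.value)
```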


One sample proportions

Hypothesis testing for one sample proportions

64 out of 100 students in a random sample are females. Is there evidence to suggest that the population proportion of females is less than 65%? Use a 90% confidence level.

prop.test(64, 100, p = 0.65, alternative = "less", conf.level = 0.90)

        1-sample proportions test with continuity correction

data:  64 out of 100, null probability 0.65
X-squared = 0.011, df = 1, p-value = 0.4583
alternative hypothesis: true p is less than 0.65
90 percent confidence interval:
 0.0000000 0.7035286
sample estimates:
   p
0.64
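
The X-squared value and the one-sided p-value can be reproduced by hand. This is a sketch of the continuity-corrected calculation for a single proportion; the variable names are chosen here for illustration.

```r
x <- 64; n <- 100; p0 <- 0.65

# The continuity correction subtracts 0.5 from |observed - expected|
chisq <- (abs(x - n * p0) - 0.5)^2 / (n * p0 * (1 - p0))

# Signed square root, then the lower tail for alternative = "less"
z <- sign(x / n - p0) * sqrt(chisq)
p_less <- pnorm(z)

res <- prop.test(x, n, p = p0, alternative = "less", conf.level = 0.90)
c(by_hand = chisq, prop.test = unname(res$statistic))
c(by_hand = p_less, prop.test = res$p.value)
```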


Confidence intervals for one sample proportions

Just like the t.test, the prop.test function will calculate both the upper and the lower bounds of the confidence interval only when alternative = "two.sided" is chosen. Otherwise a lower bound of 0 or an upper bound of 1 is produced.

prop.test(64, 100, p = 0.65, alternative = "two.sided", conf.level = 0.80)

        1-sample proportions test with continuity correction

data:  64 out of 100, null probability 0.65
X-squared = 0.011, df = 1, p-value = 0.9165
alternative hypothesis: true p is not equal to 0.65
80 percent confidence interval:
 0.5715825 0.7035286
sample estimates:
   p
0.64


Two sample proportions

Hypothesis testing and CI for two sample proportions

54 out of 64 females and 32 out of 36 males are right-handed. Is there evidence to suggest that the proportions of males and females who are right-handed are different?

prop.test(c(54, 32), c(64, 36))

2-sample test for equality of proportions with continuity correction

data: c(54, 32) out of c(64, 36)

X-squared = 0.1051, df = 1, p-value = 0.7458

alternative hypothesis: two.sided

95 percent confidence interval:

-0.2026789 0.1124012

sample estimates:

prop 1 prop 2

0.8437500 0.8888889
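prop.test also accepts a two-column matrix of successes and failures, which can be easier to read when the counts come from a table. The sketch below assumes the same handedness counts; the matrix name `handed` and its labels are illustrative.

```r
# Rows are groups; columns are successes (right-handed) and failures.
handed <- matrix(c(54, 10,
                   32,  4),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("female", "male"), c("right", "not_right")))

from_matrix <- prop.test(handed)
from_counts <- prop.test(c(54, 32), c(64, 36))

# Both calls run the same test.
all.equal(from_matrix$p.value, from_counts$p.value)

# The confidence interval is for prop 1 minus prop 2 (female minus male here).
unname(from_counts$estimate[1] - from_counts$estimate[2])
from_counts$conf.int
```

Since the interval contains 0 and the p-value is 0.7458, the data give no evidence of a difference between the two proportions.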


1 Preliminaries

2 Data sets

3 Descriptive Statistics

4 Probability Models

5 Hypothesis Testing and Confidence Intervals

6 Linear Regression

   Scatterplots, Association, and Correlation

   Simple Linear Regression

7 Online Resources for R

8 Upcoming Mini-Courses

9 Exercises

Scatterplots, Association, and Correlation

Scatterplots

Is there an association between the amount of alcohol consumed and maximum speed?

plot(speed ~ alcohol, main = "Scatterplot of Speed vs. Alcohol", pch = 20, cex = 0.5)

[Figure: "Scatterplot of Speed vs. Alcohol"; x-axis: alcohol (0 to 80), y-axis: speed (0 to 150)]


Scatterplots, Association, and Correlation

Correlation

cor(alcohol, speed, use = "pairwise.complete.obs")

[1] 0.2309745
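The use argument matters because the survey variables contain missing values. The toy vectors below are made up for illustration (they are not the course data) and show what happens with and without it.

```r
# Toy vectors with one missing value (hypothetical, not the survey data).
x <- c(1, 2, 3, 4, NA)
y <- c(2, 1, 4, 3, 5)

cor(x, y)                                  # NA: the missing value propagates
cor(x, y, use = "pairwise.complete.obs")   # 0.6: the incomplete pair is dropped
```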


Simple Linear Regression

Simple Linear Regression

Build a linear regression model predicting speed from alcohol.

summary(lm(speed ~ alcohol))

Call:

lm(formula = speed ~ alcohol)

Residuals:

Min 1Q Median 3Q Max

-90.769 -8.725 1.275 11.275 91.541

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 88.7248 0.6511 136.261 <2e-16 ***

alcohol 0.9469 0.1108 8.549 <2e-16 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.83 on 1297 degrees of freedom

(26 observations deleted due to missingness)

Multiple R-squared: 0.05335, Adjusted R-squared: 0.05262

F-statistic: 73.09 on 1 and 1297 DF, p-value: < 2.2e-16
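In simple linear regression, Multiple R-squared is the square of the correlation between the two variables: here 0.2309745^2 is approximately 0.05335, which agrees with the output above. The sketch below checks this identity, and the related slope identity, on simulated data. The simulated `alcohol` and `speed` vectors are hypothetical stand-ins, since the survey data are not reproduced in this section.

```r
# Simulated stand-ins for the survey variables (hypothetical values).
set.seed(1)
alcohol <- rexp(200, rate = 1 / 5)
speed   <- 90 + 0.9 * alcohol + rnorm(200, sd = 20)

fit <- lm(speed ~ alcohol)
r   <- cor(alcohol, speed)

# Multiple R-squared equals the squared correlation.
all.equal(summary(fit)$r.squared, r^2)

# The slope equals r * sd(y) / sd(x).
all.equal(unname(coef(fit)["alcohol"]), r * sd(speed) / sd(alcohol))
```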


Online Resources for R

Download R: http://cran.stat.ucla.edu/

Search Engine for R: rseek.org

R Reference Card: http://cran.r-project.org/doc/contrib/Short-refcard.pdf

UCLA Statistics Information Portal: http://info.stat.ucla.edu/grad/

UCLA Statistical Consulting Center: http://scc.stat.ucla.edu


Upcoming Mini-Courses

November 16th, R Stats II: Linear Regression

November 23rd, R Stats III: Generalized Linear Models

For a schedule of all mini-courses offered, please visit http://scc.stat.ucla.edu/mini-courses.


Thank you

Any questions?


Exercises

1 Construct side-by-side box plots for the distribution of the amount of time it takes students to get to class (time) by their means of transportation (walk).

2 Usually younger students live on campus and older students live off campus. Is there evidence to suggest this trend in this data set? (Use a random sample of 100 students and α = 0.05.)

3 Calculate a 90% confidence interval for the difference between the average ages of students who live on campus and off campus.
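The solutions to Exercises 2 and 3 refer to a data frame named survey.sub, whose construction is not shown in this section. One plausible way to draw such a random sample of 100 students is sketched below. The `survey` data frame here is simulated and purely hypothetical; only the column names ageinmonths and oncampus are taken from the solutions.

```r
# Hypothetical stand-in for the survey data; values are simulated.
set.seed(2010)
survey <- data.frame(
  ageinmonths = round(rnorm(1000, mean = 245, sd = 15)),
  oncampus    = sample(c("yes", "no"), 1000, replace = TRUE)
)

# Simple random sample of 100 rows, drawn without replacement.
survey.sub <- survey[sample(nrow(survey), 100), ]

nrow(survey.sub)
table(survey.sub$oncampus)
```

Because the sample is random, results computed from survey.sub change from draw to draw unless a seed is set.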


Solution to Exercise 1

boxplot(time ~ walk, main = "Time to get to class \n by type of transportation")

[Figure: side-by-side box plots titled "Time to get to class by type of transportation"; y-axis: Minutes (0 to 150); groups: bicycle, bus, car (by yourself), carpool, motorcycle, other, segway, skateboard, walk]


Solution to Exercise 2

t.test(survey.sub$ageinmonths[survey.sub$oncampus == "yes"],
       survey.sub$ageinmonths[survey.sub$oncampus == "no"],
       alternative = "less", conf.level = 0.95)

Welch Two Sample t-test

data: survey.sub$ageinmonths[survey.sub$oncampus == "yes"] and

survey.sub$ageinmonths[survey.sub$oncampus == "no"]

t = -5.3322, df = 34.867, p-value = 2.964e-06

alternative hypothesis: true difference in means is less than 0

95 percent confidence interval:

-Inf -10.85376

sample estimates:

mean of x mean of y

232.6111 248.5000


Solution to Exercise 3

t.test(survey.sub$ageinmonths[survey.sub$oncampus == "yes"],
       survey.sub$ageinmonths[survey.sub$oncampus == "no"],
       alternative = "two.sided", conf.level = 0.90)

Welch Two Sample t-test

data: survey.sub$ageinmonths[survey.sub$oncampus == "yes"] and

survey.sub$ageinmonths[survey.sub$oncampus == "no"]

t = -5.3322, df = 34.867, p-value = 5.929e-06

alternative hypothesis: true difference in means is not equal to 0

90 percent confidence interval:

-20.92402 -10.85376

sample estimates:

mean of x mean of y

232.6111 248.5000
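Comparing the two solutions, the 95% one-sided interval and the 90% two-sided interval share the same upper bound (-10.85376), and the two-sided p-value is twice the one-sided one. The sketch below checks both relationships on simulated data. The vectors `on_campus` and `off_campus` are hypothetical and only mimic the shape of the exercise.

```r
# Hypothetical ages in months for two groups (simulated, not the survey data).
set.seed(42)
on_campus  <- rnorm(18, mean = 233, sd = 8)
off_campus <- rnorm(82, mean = 248, sd = 20)

one_sided <- t.test(on_campus, off_campus, alternative = "less",      conf.level = 0.95)
two_sided <- t.test(on_campus, off_campus, alternative = "two.sided", conf.level = 0.90)

# The 95% one-sided upper bound coincides with the 90% two-sided upper bound.
all.equal(one_sided$conf.int[2], two_sided$conf.int[2])

# When the observed difference lies in the direction of the alternative,
# the two-sided p-value is twice the one-sided p-value.
all.equal(two_sided$p.value, 2 * one_sided$p.value)
```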
