
How to Use the R Programming Language for Statistical Analyses

An Introduction to R

(enriched from several R tutorial sources)


What Is R?

• a programming “environment”

• object-oriented

• similar to S-Plus

• free, open-source software (distributed under the GNU GPL)

• provides calculations on matrices

• excellent graphics capabilities

• supported by a large user network


What Is R Not?

• a statistics software package

• menu-driven

• quick to learn

• a program with a complex graphical interface


Installing R

• www.r-project.org/

• download from CRAN

• select a download site

• download the base package at a minimum

• download contributed packages as needed


Tutorials

• From R website under “Documentation”

• “Manuals” lists the official R documentation

• An Introduction to R

• R Language Definition

• Writing R Extensions

• R Data Import/Export

• R Installation and Administration

• The R Reference Index


Tutorials cont.

• “Contributed” documentation consists of tutorials and manuals created by R users

• Simple R

• R for Beginners

• Practical Regression and ANOVA Using R

• R FAQ

• Mailing Lists (listserv)

• r-help


Tutorials cont.

• Textbooks

• Venables & Ripley (2002). Modern Applied Statistics with S. New York: Springer-Verlag.

• Chambers (1998). Programming with Data: A Guide to the S Language. New York: Springer-Verlag.


R Basics

• objects

• naming convention

• assignment

• functions

• workspace

• history


Objects

• names

• types of objects: vector, factor, array, matrix, data.frame, ts, list

• attributes

• mode: numeric, character, complex, logical

• length: number of elements in object

• creation

• assign a value

• create a blank object
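
The points above can be tried out directly at the console. This is a minimal sketch using only base R; the object names (v, f, empty, m) are illustrative and do not come from the original slides.

```r
# A numeric vector created by assigning a value
v <- c(2.5, 3.1, 4.8)
mode(v)      # "numeric"
length(v)    # 3

# A factor stores categorical data; its levels are sorted alphabetically
f <- factor(c("low", "high", "low"))
levels(f)    # "high" "low"

# A "blank" object: a numeric vector of length 5, filled with zeros
empty <- vector("numeric", 5)
empty        # 0 0 0 0 0

# attributes() shows what is attached to an object, e.g. the dim of a matrix
m <- matrix(1:6, nrow = 2)
attributes(m)$dim   # 2 3
```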


Naming Convention

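• must start with a letter (A-Z or a-z), or with a period that is not followed by a digit

• can contain letters, digits (0-9), periods “.”, and underscores “_”

• case-sensitive

• mydata different from MyData

• underscores “_” are allowed in current versions of R; very old releases (before 1.9.0) rejected them, which is why older tutorials advise against them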
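
One way to check whether a name is acceptable is base R's make.names(), which returns its input unchanged when the input is already a valid name. The helper is_valid_name below is a hypothetical convenience wrapper, not part of R or of the original slides.

```r
# make.names() rewrites invalid names, so an unchanged result means "valid"
is_valid_name <- function(x) make.names(x) == x   # hypothetical helper

is_valid_name("mydata")     # TRUE
is_valid_name("my.data2")   # TRUE
is_valid_name("2data")      # FALSE: starts with a digit
is_valid_name("my data")    # FALSE: contains a space

# Names are case-sensitive: these are two different objects
mydata <- 1
MyData <- 2
mydata == MyData            # FALSE
```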


Assignment

• “<-” is used to indicate assignment

• x<-c(1,2,3,4,5,6,7)

• x<-c(1:7)

• x<-1:4
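
The three assignments can be compared side by side. A small sketch in base R; the names y and z are added here only so that the three results can coexist.

```r
x <- c(1, 2, 3, 4, 5, 6, 7)   # combine values explicitly
y <- c(1:7)                    # the colon operator builds the same sequence
z <- 1:4                       # c() is optional around a single sequence

all(x == y)      # TRUE: same values
length(z)        # 4
typeof(x)        # "double"
typeof(y)        # "integer": the colon operator creates integers
```

The values are equal, but the storage type differs, which occasionally matters when comparing objects with identical().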


Functions

• actions can be performed on objects using functions (note: a function is itself an object)

• have arguments and options, many of which have default values

• return a result

• parentheses () are used to specify that a function is being called
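
A short sketch of these ideas, using base R's mean() for the built-in case. The function standardise is a hypothetical example written for this illustration.

```r
x <- c(1, 2, 3, 4, 100)

mean(x)               # default arguments: plain arithmetic mean, 22
mean(x, trim = 0.2)   # optional argument: drop 20% from each end, 3

# A function is itself an object: its name without () does not call it
class(mean)           # "function"

# A user-defined function with a default argument (hypothetical example)
standardise <- function(v, centre = mean(v)) {
  (v - centre) / sd(v)
}
standardise(c(1, 2, 3))   # -1 0 1
```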


Let’s look at R


R Workspace & History


Workspace

• during an R session, all objects are stored in temporary working memory

• list objects: ls()

• remove objects: rm()

• objects that you want to access later must be saved in a “workspace”

• from the menu bar: File -> Save Workspace

• from the command line: save(x, file="MyData.Rdata")
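
The save step pairs with load(), which restores the saved objects in a later session. A minimal sketch of the round trip; writing to tempdir() is an assumption made here so that the example does not touch the working directory.

```r
x <- 1:10
y <- "scratch"

ls()              # lists the objects in the workspace, including "x" and "y"
rm(y)             # remove an object that is no longer needed
"y" %in% ls()     # FALSE

# Save x to a file, remove it, then restore it with load()
path <- file.path(tempdir(), "MyData.Rdata")   # temporary location (assumption)
save(x, file = path)
rm(x)
load(path)
x                 # the saved vector is back
```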


History

• command line history

• can be saved, loaded, or displayed

• savehistory(file="MyData.Rhistory")

• loadhistory(file="MyData.Rhistory")

• history(max.show=Inf)

• during a session you can use the arrow keys to review the command history


Two most common object types for statistics:

• matrix

• data frame


Matrix

• a matrix is a vector with an additional attribute (dim) that defines the number of rows and columns

• only one mode (numeric, character, complex, or logical) allowed

• can be created using matrix()

x<-matrix(data=0,nrow=2,ncol=2)

or

x<-matrix(0,2,2)

x<-matrix(c(1,2,3,4,5,6,7,8,9),nrow=3,ncol=3)
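
It helps to see how matrix() arranges the values. The sketch below uses base R only and shows that the data fill the matrix column by column unless byrow = TRUE is given.

```r
# matrix() fills column by column unless byrow = TRUE is given
x <- matrix(1:9, nrow = 3, ncol = 3)
x
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

dim(x)        # 3 3
x[2, 3]       # row 2, column 3: 8
t(x)          # transpose
x %*% x       # matrix product (x * x would multiply element by element)
```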


Data Elements

• select only one element: x[2]

• select a range of elements: x[1:3]

• select all but one element: x[-3]

• slicing, i.e. including only part of the object: x[c(1,2,5)]

• select elements based on a logical condition: x[x>3]
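
Each selection can be checked on a small vector. The values in x are made up for this sketch; note that selection by a logical condition also uses square brackets.

```r
x <- c(10, 2, 7, 4, 9, 1)

x[2]            # one element: 2
x[1:3]          # a range: 10 2 7
x[-3]           # everything except the third element: 10 2 4 9 1
x[c(1, 2, 5)]   # a chosen subset: 10 2 9
x[x > 3]        # elements that satisfy a condition: 10 7 4 9
```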


Basic Statistics

1. Qualitative Data

2. Quantitative Data

a. Descriptive

b. Classification

c. Plotting

d. Variance

e. Analysis of Variance


Exercise

• Load the dataset: data(airquality)

• Select the Temp column: airquality$Temp

• Several descriptive calculations:

Mean: mean()

Median: median()

Range: max() and min(), or range()

Quartiles: quantile()

Percentiles: quantile(data,c(P1,P2,P3,…))

Standard deviation: sd()

Correlation: cor(x,y)

Summary: summary()
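
One possible run through this exercise is sketched below. The variable name temp, the chosen percentiles, and the pairing of Temp with Ozone for the correlation are assumptions made for illustration.

```r
data(airquality)            # built-in dataset shipped with R
temp <- airquality$Temp     # daily temperature readings

mean(temp)
median(temp)
range(temp)                 # same as c(min(temp), max(temp))
quantile(temp)              # minimum, quartiles, maximum
quantile(temp, c(0.10, 0.50, 0.90))   # chosen percentiles
sd(temp)
summary(temp)

# Ozone contains missing values, so restrict cor() to complete pairs
cor(airquality$Temp, airquality$Ozone, use = "complete.obs")
```

Without the use argument, cor() returns NA as soon as either column has a missing value.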


Exercise

• Load the dataset: data(airquality)

• Select the Temp column: airquality$Temp

• Range: max() and min(), or range()

• class<-seq(a,b,by=..)

• A.cut<-cut(A,class,right=FALSE)
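
A sketch of the classification step on the Temp column. The class limits (50 to 100 in steps of 10) are an assumed choice, and the name breaks is used in place of class to avoid masking the base function class().

```r
data(airquality)
A <- airquality$Temp

range(A)                                  # lowest and highest value
breaks <- seq(50, 100, by = 10)           # assumed class limits and width
A.cut <- cut(A, breaks, right = FALSE)    # intervals closed on the left: [50,60), [60,70), ...
freq <- table(A.cut)                      # frequency of each class
freq
```

With right = FALSE a value equal to a class limit falls into the class that starts at that limit.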


Exercise

• Plotting several kinds of information:

• Histogram: hist()

• Bar plot: barplot()

• color=c("red","yellow","green","blue","cyan")

• barplot(..,col=color)

• Pie chart: pie()

• Scatter plot: plot(x,y)

• Box plot: boxplot(x,y)

• Cumulative plot: plot(x,y); x = class, y=c(0,cumsum())
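
The plotting calls can be tried on the airquality data as follows. This is only a sketch: the class limits, the choice of Wind for the scatter plot, and the grouping by Month in the box plot are assumptions made for illustration.

```r
data(airquality)
temp <- airquality$Temp
breaks <- seq(50, 100, by = 10)                      # assumed class limits
freq <- table(cut(temp, breaks, right = FALSE))      # class frequencies

hist(temp, breaks = breaks, right = FALSE, main = "Temperature")
color <- c("red", "yellow", "green", "blue", "cyan")
barplot(freq, col = color)
pie(freq, col = color)
plot(airquality$Wind, temp)                          # scatter plot
boxplot(Temp ~ Month, data = airquality)             # one box per month

# Cumulative frequency: one point per class limit, starting from zero
cum <- c(0, cumsum(as.vector(freq)))
plot(breaks, cum, type = "b")
```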


INFERENTIAL STATISTICS

Estimation 1

Estimation 2


Exercise – One-tailed Test (Known variance)

> xbar = 9900 # sample mean
> mu0 = 10000 # hypothesized value
> sigma = 120 # population standard deviation
> n = 30 # sample size
> z = (xbar-mu0)/(sigma/sqrt(n))
> z # test statistic
[1] -4.5644

> alpha = .05
> z.alpha = qnorm(1-alpha)
> -z.alpha # critical value
[1] -1.6449
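
The remaining step is the decision itself. The sketch below assumes a lower-tail test (the null hypothesis is rejected when z falls below the negative critical value), which is what the sign of the critical value suggests; the p-value check via pnorm() and the names reject and pval are additions for illustration.

```r
xbar  <- 9900    # sample mean
mu0   <- 10000   # hypothesized value
sigma <- 120     # population standard deviation
n     <- 30      # sample size
alpha <- 0.05

z <- (xbar - mu0) / (sigma / sqrt(n))
reject <- z < -qnorm(1 - alpha)   # lower-tail test: compare with the critical value
reject                             # TRUE: reject the null hypothesis

pval <- pnorm(z)                   # lower-tail p-value
pval < alpha                       # TRUE: same conclusion
```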


Exercise

• Load the airquality data and select Temp

• Compute the mean of Temp and its standard deviation

• Draw a random sample of 100 values from Temp

• Y<- sample(X,100)

• Compute the mean of sample Y

• Compute the z statistic for the sample and test the null hypothesis
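
One possible way to work through this exercise is sketched below. It assumes that the mean and standard deviation of the full Temp column play the role of the hypothesized mean and the known population standard deviation, and that the test is two-tailed at the 5% level; the exercise itself does not state these choices.

```r
data(airquality)
X <- airquality$Temp

mu0   <- mean(X)    # treated as the hypothesized population mean (assumption)
sigma <- sd(X)      # treated as the known population standard deviation (assumption)

set.seed(1)         # only to make the sketch reproducible
Y <- sample(X, 100) # 100 values drawn without replacement
ybar <- mean(Y)

z <- (ybar - mu0) / (sigma / sqrt(length(Y)))
alpha <- 0.05
abs(z) > qnorm(1 - alpha / 2)   # TRUE would mean rejecting H0 at the 5% level
```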


Exercise – Two-tailed Test (Known variance)

> xbar = 14.6 # sample mean
> mu0 = 15.4 # hypothesized value
> sigma = 2.5 # population standard deviation
> n = 35 # sample size
> z = (xbar-mu0)/(sigma/sqrt(n))
> z # test statistic
[1] -1.8931

> alpha = .05
> z.half.alpha = qnorm(1-alpha/2)
> c(-z.half.alpha, z.half.alpha)
[1] -1.9600 1.9600
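
For the two-tailed case the test statistic is compared with both critical values. A sketch of the decision, with a p-value check added for illustration; the names crit and pval are not from the slides.

```r
xbar <- 14.6; mu0 <- 15.4; sigma <- 2.5; n <- 35; alpha <- 0.05
z <- (xbar - mu0) / (sigma / sqrt(n))

crit <- qnorm(1 - alpha / 2)
abs(z) > crit              # FALSE: z lies between -1.96 and 1.96, so do not reject

pval <- 2 * pnorm(-abs(z)) # two-tailed p-value
pval > alpha               # TRUE: same conclusion
```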


Exercise – One-tailed Test (Unknown variance)

> xbar = 9900 # sample mean
> mu0 = 10000 # hypothesized value
> s = 125 # sample standard deviation
> n = 30 # sample size
> t = (xbar-mu0)/(s/sqrt(n))
> t # test statistic
[1] -4.3818

> alpha = .05
> t.alpha = qt(1-alpha, df=n-1)
> -t.alpha # critical value
[1] -1.6991
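
The decision for this test follows the same pattern as in the known-variance case, with the t distribution in place of the normal. The sketch assumes a lower-tail test and uses the name t.stat rather than t, since t is also the name of R's transpose function.

```r
xbar <- 9900; mu0 <- 10000; s <- 125; n <- 30; alpha <- 0.05
t.stat <- (xbar - mu0) / (s / sqrt(n))

t.stat < -qt(1 - alpha, df = n - 1)   # TRUE: reject the null hypothesis
pval <- pt(t.stat, df = n - 1)        # lower-tail p-value
pval < alpha                          # TRUE: same conclusion
```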


Exercise

• Load the airquality data and select Temp

• Compute the mean of Temp and its standard deviation

• Draw a random sample of 100 values from Temp

• Y<- sample(X,100)

• Compute the mean of sample Y

• Compute the z statistic for the sample and test the null hypothesis


Exercise – Two-tailed Test (Unknown variance)

> xbar = 14.6 # sample mean
> mu0 = 15.4 # hypothesized value
> s = 2.5 # sample standard deviation
> n = 35 # sample size
> t = (xbar-mu0)/(s/sqrt(n))
> t # test statistic
[1] -1.8931

> alpha = .05
> t.half.alpha = qt(1-alpha/2, df=n-1)
> c(-t.half.alpha, t.half.alpha)
[1] -2.0322 2.0322


Alternative – Two-tailed Test (Unknown variance)

> pval = 2 * pt(t, df=n-1) # lower tail
> pval # two-tailed p-value
[1] 0.066876

• If pval > 0.05: do not reject the null hypothesis

• If pval <= 0.05: reject the null hypothesis

• t.test(x,y) # t-test for variables x and y
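
When the raw data are available, t.test() carries out all of these steps at once. The sketch below applies it to the airquality temperatures; the hypothesized mean of 75 and the May versus August comparison are arbitrary choices made for illustration.

```r
data(airquality)
temp <- airquality$Temp

# One-sample, two-tailed test of H0: mean temperature equals 75 (illustrative value)
res <- t.test(temp, mu = 75)
res$statistic    # t statistic
res$p.value      # two-tailed p-value
res$conf.int     # 95% confidence interval for the mean

# The same statistic computed by hand
t.manual <- (mean(temp) - 75) / (sd(temp) / sqrt(length(temp)))
all.equal(unname(res$statistic), t.manual)   # TRUE

# Two-sample form t.test(x, y): compare May and August temperatures
may <- temp[airquality$Month == 5]
aug <- temp[airquality$Month == 8]
t.test(may, aug)$p.value
```

By default t.test(x, y) does not assume equal variances (Welch's test); var.equal = TRUE requests the pooled-variance version.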