Introduction to R 2020-21 - people.umass.edupeople.umass.edu/biep540w/pdf/R Handout Fall 2020...

Page 1: Introduction to R 2020-21 - people.umass.edupeople.umass.edu/biep540w/pdf/R Handout Fall 2020 Console R Scri… · 31/08/2020  · Introduction to R 2020-21 Console, R Script, and

R Handout 2020-21 Introduction to the Console, R Script, and R Markdown

R handout Fall 2020 Console R Script R Markdown Page 1 of 18

Introduction to R

2020-21

Console, R Script, and R Markdown Suggested Practices and Tips

Introduction

Console
- Best for use as a calculator
- Okay for setting the working directory
- TIP: Do your work in an R script and/or R Markdown instead

R Script
- Best for writing and debugging commands
- TIP: Consider creating a permanent R script file that contains boilerplate commands

R Markdown
- Essential for best-practice production of reproducible data management and analysis
- SUGGESTION: Produce an R Markdown file for every analysis you do

Table of Contents

1. Suggested Practice: Structure Your R Work
   1.1 How to Set Your Working Directory to Your Desktop
   1.2 Choose a Directory (“file path”, “folder”) for Your R Work
2. Tips: Working Directly in the Console Pane
3. Suggested Practice: Work with R Scripts (Source Pane)
4. Suggested Practice: Archive Your Work with R Markdown (Source Pane)
5. Your Turn: Illustration

Before You Begin
1. Be sure you have downloaded and installed R.
2. Be sure you have downloaded and installed R Studio.
3. Download the data file toy.Rdata from the course website. Place it on your desktop.
4. (One time, if you have not done so already.) Install the package DescTools.
5. (One time, if you have not done so already.) Install the package stargazer.

Recall the “how to” for installing a package. Here I show you how I installed the package swirl. You will need to change “swirl” to “DescTools” and “stargazer”.

Example: I want to install the package called swirl.
- At top right, click on the tab Packages.
- Then click on Install.
- In “Install from”: the default (Repository CRAN) is fine.
- In “Packages (separate multiple with space or comma)”: type in the name of the package (e.g. swirl).
- In “Install to Library”: leave as is.
- Check box for “Install dependencies”: check this.
- At bottom, click Install.
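The same installation can also be typed as a command instead of clicking through the menus. The sketch below is one possible way to do it, assuming an internet connection and the default CRAN repository. The helper name missing_packages is made up for this illustration; it is not part of R or of this handout.

```r
# Hypothetical helper: return the packages in `pkgs` that are not yet installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The two packages this handout asks you to install
to_install <- missing_packages(c("DescTools", "stargazer"))

# Install only in an interactive session, and only what is missing;
# dependencies = TRUE plays the role of the "Install dependencies" check box
if (interactive() && length(to_install) > 0) {
  install.packages(to_install, dependencies = TRUE)
}
```

Installing is a one-time step per computer. Loading a package with library( ) is still needed in every new session.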

1. Suggested Practice: Structure Your R Work

1.1 How to Set Your Working Directory to Your Desktop

What is the working directory in R Studio? The working directory is the file path (spoiler: a file path is the same as a folder) where R Studio will:
a) look to find something (unless you tell it otherwise), and
b) save things that you save (unless you tell R Studio to save to a different location).

Example – Windows: c:\540\mystuff
Example – MAC: /Users/cbigelow/Desktop/1. Teaching/

Recall: how to set the working directory to be your desktop. From the tool bar at top: Session > Set Working Directory > Choose Directory

(Optional – in case you ever need this.) How to Determine the File Path of Your Desktop (Note – This is done outside R Studio.)

Windows Users
__1. Launch Windows Explorer: <windows> - E
__2. Navigate until you see Desktop at left
__3. RIGHT CLICK on “Desktop”. A drop down menu will appear
__4. From this drop down menu, at bottom, click on PROPERTIES
__5. Scroll down to read the desktop path at “Location:”
__6. Tip – Highlight to select and copy, for pasting into your R session later

MAC Users
__1. Launch FINDER
__2. At left, scroll down to find Desktop
__3. RIGHT CLICK on “Desktop”.
__4. Scroll down to read the desktop path at “Where:”
__5. Tip – Highlight to select and copy, for pasting into your R session later

What could go wrong. In R Studio, suppose you want your working directory to be your desktop.

Using the command setwd( )
__1. Outside of R, highlight, select and copy the full path name
__2. Paste the full path into the command setwd("FULLPATHNAME")
__3. Tip – The path MUST be enclosed in quotes, and they must be plain (straight) quotes, not curly ones

Examples – Windows
    setwd("c:/Desktop/bigelow")
    setwd("c:\\Desktop\\bigelow")

What could go wrong – Windows
__#1. Type the leading “c” in lower case, as in the examples above.
__#2. Single backward slashes will NOT work. Use forward slashes or double backward slashes.

Example – MAC
    setwd("/Users/cbigelow/Desktop")
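If you paste a Windows path and would rather not fix the slashes by hand, R can do the swap for you. This is a small sketch, not part of the original handout: the helper to_forward_slashes is a made-up name, and the path is the same illustrative one used in the examples, so it probably does not exist on your computer.

```r
# A Windows path written as an R string: each backslash must be doubled
windows_path <- "c:\\Desktop\\bigelow"

# Hypothetical helper: swap every backslash for a forward slash
to_forward_slashes <- function(path) gsub("\\", "/", path, fixed = TRUE)

clean_path <- to_forward_slashes(windows_path)
clean_path

# Change the working directory only if the folder really exists on this computer
if (dir.exists(clean_path)) {
  setwd(clean_path)
}

# Show the current working directory to verify that all is well
getwd()
```

The dir.exists( ) check avoids the error that setwd( ) gives when the folder cannot be found.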

1.2 Suggestion: Choose a Directory (“file path”, “folder”) for Your R Work

Why am I making this suggestion? This just makes sense, that’s why! Are you smiling? It’s nice to have all the data management and data analysis work associated with a given project all in one place! Oh my goodness. And while we’re at it, think ahead and choose some sub-directories that make sense to you.

Example 1 – MS Thesis/
- Source (Idea here: this would contain source data that is “raw” and should never be edited)
- Data Management (Idea here: this would contain data cleaning, new variable creation, merges, etc.)
- Data Analysis (Idea here: this would contain all your R Studio work to produce descriptives, modeling, etc.)
- Tables
- Figures
- Reports
- Miscellaneous Resources (E.g. – You might keep references that you relied upon for various analyses)

Example 2 – BIOSTATS 540/
- Data Wrangling (“wrangling” is the latest word on this!)
- Summarizing Data
- Data Visualization
- One and Two Sample Inference
- Chi Square Tests
- Linear Regression

…. You get the idea ….
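A folder layout like Example 1 can be created from R itself. The sketch below is only an illustration: it builds the folders inside a temporary directory so that nothing lands on your real disk, and the name project_root is an assumption of this example. To use it for real, you would replace tempdir() with the folder where your project should live.

```r
# Illustrative project root inside a temporary folder (swap in your own location)
project_root <- file.path(tempdir(), "MS Thesis")

# Sub-directories taken from Example 1
subfolders <- c("Source", "Data Management", "Data Analysis",
                "Tables", "Figures", "Reports", "Miscellaneous Resources")

# Create each folder; recursive = TRUE also creates the project root if needed
for (folder in subfolders) {
  dir.create(file.path(project_root, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# List what was created
list.files(project_root)
```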

2. Tips: Working Directly in the Console Pane

Introduction
Sometimes you will want to do some “quick work”. Commands typed into the console pane are executed immediately. The console pane is one of the R Studio windows. It is typically the lower left window:

__1. > is the prompt that tells you R Studio is waiting for you to enter a command (code).
__2. At the upper right, you can minimize or maximize.
__3. Commands typed into the console are executed immediately. R Studio will return one of:
- a result (for example, the result of the calculation 2+2);
- nothing (you will get no response if you simply create an object);
- an error message (join the club); or
- + (the symbol “+” tells you that your command was incomplete. R Studio wants the rest).
__4. Comments (which R Studio ignores) begin with a #. For example: # This is a comment.

Tips
1. Use the console pane for quick tasks only. Quick tasks are things that you do not need to save. Or they might be commands that you want to debug. As a general strategy, it’s a better habit to work from an R Script file or an R Markdown file. More on this later.
2. Use the console pane as a giant calculator.
3. Use the UP and DOWN arrows to cycle backward and forward among the commands. This is especially handy when you want to correct a command that has an error in it.
4. Press Control-l (this is the letter “el”) to clear the console window. Don’t worry. Your history of commands is not lost.
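To make the console behaviors described above concrete, here are a few lines you could type at the prompt. This is a minimal illustration; the object name x is arbitrary.

```r
# This is a comment; R ignores it

# A calculation: the console prints the result
2 + 2

# Creating an object: nothing is printed, the value is simply stored in x
x <- 2 + 2

# Typing the name of the object prints its value
x
```

If you press Enter after typing only `2 +`, the console shows `+` because the command is incomplete. Type the rest, or press Esc to cancel.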

3. Suggested Practice: Work with R Scripts (Source Pane)

Introduction
For the most part, you will be doing work related to a course or a project. Depending on the task, you will want to work with R scripts or with R Markdown files. Here we focus on R scripts. An R script file is one type of source file that is available to you in the source pane.

Heads up - When you launch R Studio for the first time, you may not even see the source pane (window).

How to Access the Source Pane
At the upper left, click on either:
- The “+” symbol > R Script
- FILE > NEW FILE > R Script

Clicking on R Script opens a new R script file. R script files have the extension “.R”.

Suggestion #1 - Create a permanent “boilerplate” R Script file. Handy! This will then be available to you in every R Studio session. For example, mine is organized into sections:

    # R Script File name: R Fragments 640.R
    # Last Update: 8/31/2020
    # -------------------------------------------------------------------------
    # 0. HOW TO - Upgrade R without losing packages
    # 1. Reading in data. Saving to Rdata. List the data.
    # 2. Working w variables, observations, data cleaning, etc.
    # 3. Probability Distribution Calculations
    # 4. Numerical descriptives
    # 5. Graphs
    # 6. One and Two Sample Tests - Continuous Outcomes
    # 7. Contingency Table Tests and Epi Tables
    # 8. Simple Linear Regression
    # 9. Multiple Linear Regression & Regression Diagnostics
    # 10. Logistic Regression
    # 11. Survival Analysis
    # 12. Miscellaneous
    # -------------------------------------------------------------------------

Example, continued - And here is a little snippet from that R Script file:

    # _____________________________________________________________________________
    # 6. One and Two Sample Tests - Continuous Outcomes
    # _____________________________________________________________________________

    ##### Preliminary - Turn off display of scientific notation
    options(scipen=1000)

    ##### 6a) Shapiro-Francia test of normality (a variant of the Shapiro-Wilk test)
    DescTools::ShapiroFranciaTest(DF$VARIABLE)

    ##### 6b) One Sample t-test of Null: mu=950
    t.test(DF$VARIABLE,mu=950)

    ##### 6c) One sample CI for Mean Using t-distribution (3 solutions)
    t.test(DF$VARIABLE, conf.level=0.95)
    DescTools::MeanCI(DF$VARIABLE, conf.level=0.95)
    Rmisc::CI(DF$VARIABLE, ci=0.95)
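The snippet above uses the placeholders DF and VARIABLE, so it will not run until you substitute a real data frame. Below is a hedged, self-contained sketch of the same idea that uses only base R. The data are simulated (made up) purely so that the commands have something to work on, and shapiro.test( ) is the base R Shapiro-Wilk test, swapped in here so that no extra package is needed.

```r
# Simulated data, for illustration only (not from the handout)
set.seed(540)
DF <- data.frame(VARIABLE = rnorm(30, mean = 1000, sd = 100))

# Turn off display of scientific notation
options(scipen = 1000)

# Shapiro-Wilk test of normality (base R)
normality <- shapiro.test(DF$VARIABLE)
normality

# One sample t-test of Null: mu = 950
one_sample <- t.test(DF$VARIABLE, mu = 950)
one_sample

# One sample 95% confidence interval for the mean, using the t-distribution
ci <- t.test(DF$VARIABLE, conf.level = 0.95)$conf.int
ci
```

Storing each result in an object (normality, one_sample, ci) is optional; it just makes the pieces easy to reuse later in the script.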

Suggestion #2 – Depending on the task, you might want to do your work in an R script file.

Step 1 - Open a new R script file
FILE > NEW FILE > R Script
This will create a new script file called Untitled1.

Step 2 – Before typing anything, save to a new name

Step 3 – Type your commands into your R script file
Now you have a repository of your commands which can then be recycled, edited, and re-used. You’ll find this is very handy.
IMPORTANT – Commands typed into an R script file are not executed. You must send your commands to be executed.

Step 4 – Send your commands to be executed! There are multiple ways to do this.
- METHOD I (not recommended): Select > EDIT-COPY > Paste into the console
- METHOD II (best for a single command): Position your cursor anywhere in the command. Then do a <control-enter>
- METHOD III (use for multiple commands): Highlight to select the entire command (or multiple commands). Then at upper right, click on RUN

Step 5 – All done? Clean and save.
- Tidy up your R script file work (get rid of all the commands that produced errors)
- Annotate with lots of comments (recall – comments begin with #)
- Save.

Tip
Make it a habit to annotate with comments! Comments begin with the hashtag character #. R ignores these.

4. Suggested Practice: Archive Your Work with R Markdown (Source Pane)

Introduction
Again, for the most part, you will be doing work related to a course or a project. R Markdown files are a terrific “practice” for archiving text, code, and output. Like R script files, an R Markdown file is one type of source file that is available to you in the source pane. Briefly, here is how your session will go:

1. Begin your R Studio session by opening an empty R Markdown file that will contain your future code and output;
2. Chunk by chunk, navigate between writing code, fixing code, and executing code;
3. All done? Save (archive) your work (nifty either as a record of your work, or for re-use later!)

Step 1 (One time) – Install the package “knitr”. See the package installation instructions in “Before You Begin”.

Step 2 - Begin your R Studio Session by Opening a new (empty) R Markdown file

__1) From the top menu bar: FILE > NEW FILE > R Markdown
__2) You should see something like the following (note – yours won’t say Carol Bigelow of course):

• At top right, at title: Type in a title of your choosing
• Just below, under Default Output Format: choose your output format
  o HTML – This is the default selection. It’s fine to choose this.
• Click OK

Example –

Key:
* R Markdown has provided you with a “shell” with, seemingly, a bunch of stuff!
* Each gray shaded area is called a chunk. A chunk is a set of R commands with a “beginning” and an “end”.

CHUNK BEGINNING: Each “chunk” begins with ```{r} or with ```{r SOMETHING YOU CHOOSE HERE}
- If you choose ```{r include=FALSE} then the code and its output will NOT BE SHOWN, although the code still runs (I do not recommend this)
- If you choose ```{r echo=FALSE} then the code will NOT BE SHOWN (I do not recommend this either)
- Personally, I recommend beginning each “chunk” with ```{r}

CHUNK END: Each “chunk” ends with ```

__3) Clear this “shell”
- Place your cursor at line 7 of the “shell” R Markdown. Drag to highlight and select all below
- Click delete

You should now see something like the following (with your name, not mine, obviously):

__4. Chunk by chunk, navigate between writing code, fixing code, and executing code
What we will do is cycle, as follows:

1st – We open a new blank chunk (to do a specific task that we want to do)
2nd - We type some commands into this chunk and then we run it
3rd - As is typically the case, we EDIT the commands in this chunk until we get what we like, and then re-run it
4th - Once we’re happy with the current chunk and the current task, we move on to the next chunk/next task!

1st – How to Open a New Blank Chunk
Click on the little green “insert a chunk” icon at top (on the right). From the drop down menu, choose R.

You should see the following:
- You should see the gray chunk start ```{r}
- You should see your cursor placed inside
- You should see the chunk end ```
- TIP - Do NOT edit or delete the chunk start or end!

2nd – Type Some Commands in the Chunk and Run it

For example, here I’ve typed some commands.

How to Run/Execute the commands in the chunk
Use your cursor to select the lines or chunk that you want to run/execute. From the Run drop down at top right, make your selection.

You should see your output/results BELOW, in the CONSOLE window. Error messages appear here, too.

3rd - As is typically the case, we EDIT the commands in this chunk until we get what we like, and then re-run it

That’s right. Most of the time, you will tinker with a chunk over and over (either to fix a mistake or to obtain a better solution to the task at hand). To do this:

- Position your cursor in the line of the chunk that you want to edit.
- Edit, and re-execute.

HACK! Is the stuff in the console window getting to be too much? Want to clear it? Position your cursor at the > prompt in the CONSOLE window. Press control-L to clear.

4th - Once we’re happy with the current chunk and the current task, we move on to the next chunk/next task! Repeat 1st, 2nd, and 3rd for each chunk/task you want to do.

__5. All done? Save (archive) your work (nifty either as a record of your work, or for re-use later!)

Saving your work in this way is called knitting: R Studio runs your chunks and writes the text, code, and output to a single document. How to knit:

- At top, click on the drop down menu for the knit icon
- From the drop down menu, I recommend that you choose KNIT TO WORD (Why? So that you can open this file later in Word and perhaps fancy it up a bit)
- Tip: Take care to choose a destination folder that you’ll remember (I always choose DESKTOP)

One more thing. Save the R Markdown file itself! This has the extension “.Rmd” (handy if you want to be able to re-use it).

Huzzah. You will now have TWO files:
- An R Markdown file (“.Rmd”)
- An MS Word file (“.docx”)
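For the curious, the two files can also be produced from a command rather than the knit icon. The sketch below is an illustration under several assumptions: it writes a tiny, made-up R Markdown file to a temporary folder, and it knits to Word only if the rmarkdown package and pandoc happen to be available on your computer. The file name example.Rmd is hypothetical.

```r
# Three backticks mark the start and the end of a chunk
fence <- strrep("`", 3)

# A minimal R Markdown document (made up for this illustration)
rmd_lines <- c(
  "---",
  "title: \"Untitled\"",
  "output: word_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "weight <- c(161.3, 120.1, 223.2, 124.0)",
  "weight",
  fence
)

# Write the .Rmd file to a temporary folder so nothing on your desktop is touched
rmd_path <- file.path(tempdir(), "example.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit to Word only if the rmarkdown package and pandoc are both available
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  docx_path <- rmarkdown::render(rmd_path,
                                 output_format = "word_document",
                                 quiet = TRUE)
  docx_path  # location of the .docx file that was created
}
```

When the knit step runs, the temporary folder holds both a “.Rmd” file and a “.docx” file, which mirrors the pair of files described above.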

Voila! This is what my MS Word file looks like

Untitled

Carol Bigelow

1/29/2020

    # Comments begin with a #
    # Illustration: Create an object named weight that is a column vector with 4 entries
    # KEY
    # objectname <- c(value,value,value,value)
    weight <- c(161.3, 120.1, 223.2, 124.0)
    # Illustration: List (Echo) the object just created by typing the name of the object
    # Key
    # objectname
    weight

    ## [1] 161.3 120.1 223.2 124.0

5. Your Turn: Illustration

__1. Download toy.Rdata and place it on your desktop
The dataset toy.Rdata is a tiny dataset that is on the course website. RIGHT click to download. Place it on your desktop.

__3. Launch R Studio
What could go wrong: You launch R instead of launching R Studio.

__4. Set your working directory to your desktop. Check that all is well.
What could go wrong – All:
(1) You forgot to enclose the file path in quotes.
What could go wrong – WINDOWS Users:
(1) You may need to change the leading “c” to lower case.
(2) You may need to fix the slashes to be either FORWARD or DOUBLE BACKWARD. Single backward will NOT work.

Here is what I have. Your file path will be different from mine:

    > # Set working directory to desktop
    > setwd("/Users/cbigelow/Desktop")
    > # Show working directory to verify that all is well
    > getwd()
    [1] "/Users/cbigelow/Desktop"
    >

__5. Load the dataset toy.Rdata
Type into your R Script file EXACTLY as shown here:

    load(file="toy.Rdata")
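If you want to see what load( ) does before trying it on the course data, here is a small round trip. This is a sketch with made-up values: the data frame toy_example and the file toy_example.Rdata are inventions for this illustration and have nothing to do with the contents of toy.Rdata.

```r
# A made-up data frame, for illustration only
toy_example <- data.frame(temp = c(1, 2, 3), growth = c(2.1, 3.4, 5.6))

# Save it as an .Rdata file in a temporary folder
rdata_path <- file.path(tempdir(), "toy_example.Rdata")
save(toy_example, file = rdata_path)

# Remove the object, so that we can see load() bring it back
rm(toy_example)

# load() restores the saved objects and returns their names
loaded_names <- load(file = rdata_path)
loaded_names

# The data frame is available again under its original name
str(toy_example)
```

Capturing the return value of load( ) is a handy way to learn the name of the object that a file contains.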

__9. Produce a relative frequency table for a discrete variable. Package is DescTools. Command is Freq()
What could go wrong:
(1) You forgot to install the package DescTools. Solution: See the package installation instructions in “Before You Begin”.
(2) You forgot to issue the command library( ). Solution: library(DescTools)
(3) You made a mistake in keeping track of upper and lower case. R is case sensitive!

In your R script file, type and then execute the following:

    # Discrete Variable - Frequency/Relative Frequency Table
    library(DescTools)
    options(scipen=100)
    Freq(toyrdata$temp)

You should now see the following below, in your console pane:

    > # Discrete Variable - Frequency/Relative Frequency Table
    > library(DescTools)
    > options(scipen=100)
    > Freq(toyrdata$temp)
        level  freq   perc  cumfreq  cumperc
    1  [1,1.5]    4  33.3%        4    33.3%
    2  (1.5,2]    4  33.3%        8    66.7%
    3  (2,2.5]    0   0.0%        8    66.7%
    4  (2.5,3]    4  33.3%       12   100.0%
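To see where the columns of such a table come from, the same kind of summary can be built by hand in base R. The sketch below uses made-up values (twelve observations taking the values 1, 2, and 3), not the actual toy.Rdata. Unlike Freq( ), which grouped the numeric variable into intervals in the output above, table( ) counts each distinct value.

```r
# Made-up discrete variable, for illustration only
temp <- rep(c(1, 2, 3), each = 4)

# Counts and proportions for each distinct value
freq <- table(temp)
perc <- prop.table(freq)

# Assemble counts, percents, and their running totals into one table
freq_table <- data.frame(
  level   = names(freq),
  freq    = as.vector(freq),
  perc    = round(100 * as.vector(perc), 1),
  cumfreq = cumsum(as.vector(freq)),
  cumperc = round(100 * cumsum(as.vector(perc)), 1)
)
freq_table
```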

__10. Produce descriptive statistics for a continuous variable. Package is stargazer. Command is stargazer()
What could go wrong: same as above, namely:
(1) You forgot to install the package stargazer. Solution: See the package installation instructions in “Before You Begin”.
(2) You forgot to issue the command library( ). Solution: library(stargazer)
(3) You made a mistake in keeping track of upper and lower case. R is case sensitive!

In your R script file, type and then execute the following:

    # Continuous Variable – Descriptives
    # KEY
    # stargazer(DFNAME[c("varname")],type="text",summary.stat=c("n", "mean", "sd", "min", "p25", "median", "p75", "max"))
    library(stargazer)
    stargazer(toyrdata[c("growth")],type="text",summary.stat=c("n", "mean", "sd", "min", "p25", "median", "p75", "max"))

You should now see below, in your console pane the following