Getting started with R
Jacob van Etten
What is R?
“R is a free software environment for statistical computing and graphics.”
www.r-project.org
Why R?
R takes time to learn and use.
So why should I bother?
There are more user-friendly programmes, right?
12 reasons to learn R
1. Rigour and strategy in data analysis – not “thinking after clicking”.
2. Automating repeated calculations can save time in the end.
3. Many analyses are simply not feasible in menu-driven software.
4. R is not only about software, it is also an online community of user support and collaboration.
5. Scripts make it easy to communicate about your problem. Important for collaborative research!
6. Research becomes replicable when scripts are published.
7. R packages represent state-of-the-art in many academic fields.
8. Graphics in R are very good.
9. R stimulates learning – graduation from user to developer.
10. R is free, which saves you money. Or R may be the only option when budgets are restricted.
11. R encourages you to explore new methods freely and learn about them.
12. Knowing how to work with R is a valuable and transferable skill.
It’s the long way but it’s worth it...
Resources to learn R
My two picks for you
http://pj.freefaculty.org/R/Rtips.html
Some R packages of interest
RNCEP – Reanalysis data
clim.pact – Downscaling climate data
GhcnDaily – daily weather data
weatherData (R-Forge) – Daily weather data and derived bioclimatic variables relevant to plant growth
raster – gives access to WorldClim data
And a lot more here...
http://cran.r-project.org/other-docs.html
Downloading R
Choose a mirror nearby and then...
Binaries
“When downloading, a completely functional program without any installer is also often called program binary, or binaries (as opposed to the source code).” (Wikipedia)
And finally, you can download R...
When downloading has finished, run the installer
The bare R interface
RStudio makes life easier
Rstudio.org
Four parts
Scripts, documentation
Console
Files, plots, packages and help
Workspace/history
Create a new R script: File – New – R Script
Our first code...
Type
1 + 1
into the script area.
Then click “Run”.
What happens?
Exercises: running code
Type a second line with another calculation (use “-”, “/”, or “*”) and click “Run” again.
Select only one line with the mouse or Shift + arrows. Then click “Run”.
Save your first code to a separate folder “Rexercises”.
Following exercises
In the next exercises, we will develop a script.
Just copy every new line and add it to your script, without erasing the previous part.
If you want to make a comment in your script, put a # before that line. Like this:
#important to remember: use # to comment
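As a minimal sketch of how comments and code can sit together in a script (the variable name result is only an illustration, not part of the exercises):

```r
# Lines starting with # are comments; R ignores them when the code runs.
1 + 1              # a comment can also follow code on the same line

# Store the outcome of a calculation so it can be used later.
result <- 10 / 4
print(result)
```

Comments cost nothing at run time, so it is usually worth writing down why a line is there.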
If the exercises are a bit silly...
...that’s because you are learning.
Vector
Type a new line with the expression
1:10
in the script and run this line.
A concatenation of values is called a vector.
Making a new variable
If we send 1:10 into the console it will only print the outcome. To “store” this vector, we need to do the following.
a <- 1:10
Here “a” is the new variable, “<-” assigns, and 1:10 is the vector of values 1 to 10.
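A short sketch of what storing a vector makes possible. The name total is a hypothetical helper chosen for this illustration:

```r
a <- 1:10          # store the vector 1, 2, ..., 10 under the name "a"
a                  # typing the name on its own prints the stored values
length(a)          # number of elements in the vector
total <- sum(a)    # a stored vector can be reused in later calculations
print(total)
```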
Operations with vectors
Try the following and see what happens.
a
a * 2
a * a
b <- a * a
b
print(b)
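To illustrate the idea on a smaller case: arithmetic on a vector works element by element. The vector v and the names doubled and squared below are hypothetical and chosen only for this sketch.

```r
v <- c(2, 4, 6)       # a small example vector
doubled <- v * 2      # each element is multiplied by 2
squared <- v * v      # each element is multiplied by the matching element
print(doubled)
print(squared)
```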
Other ways of making vectors
d <- c(1, 6, 9)
d
class(d)
f <- LETTERS
f
class(f)
What is the difference between d and f?
Functions
Actually, we have already seen functions!
Functions consist of a name followed by one or more arguments. Commas and brackets complete the expression.
class(f)
c(d, f)
In class(f), “class” is the name and “f” is the argument.
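A small sketch of the same pattern with other base R functions. Arguments can also be given by name, which often makes a call easier to read. The variable names s, r and combined are hypothetical.

```r
# name(argument, argument, ...)
s <- seq(from = 1, to = 9, by = 2)   # arguments given by name
r <- round(3.14159, digits = 2)      # first argument by position, second by name
combined <- c(s, 11)                 # c() combines its arguments into one vector
print(s)
print(r)
print(combined)
```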
Cheat sheet
When you use R, you will become familiar with the most common functions.
If you need a less common function, there are ways to discover the right one.
For now, use the cheat sheet to look up the functions you need.
Getting help on functions
These commands open the help pages for the functions in your browser.
?c
?class
The examples are often especially helpful. Just copy and paste the code into the console and see with your own eyes what happens!
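A few other ways of reaching the same help, sketched with functions from base R. The variable name found is hypothetical.

```r
# help("class") does the same as ?class; only open the page when working interactively.
if (interactive()) help("class")

# example() runs the examples from a help page, so there is no need to copy them by hand.
example("c")

# apropos() lists objects whose names match a pattern, which is handy when
# you only remember part of a function name.
found <- apropos("^read")
print(found)
```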
Matrices
We have already met the vector.
If we put two or more vectors together as columns, we get a matrix.
X <- c(1, 2, 3)
Y <- c(8, 9, 7)
Z <- c(4, 2, 8)
M <- cbind(X, Y, Z)
How many columns and rows does M have?
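One way to check the size of a matrix, sketched with the same vectors as above:

```r
X <- c(1, 2, 3)
Y <- c(8, 9, 7)
Z <- c(4, 2, 8)
M <- cbind(X, Y, Z)   # each vector becomes one column

dim(M)                # number of rows, then number of columns
nrow(M)
ncol(M)
colnames(M)           # cbind() keeps the vector names as column names
M[2, "Y"]             # a single value: row 2 of column Y
```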
Data frames
Matrices must consist of values of the same class. But often datasets consist of a mix of different types of variables (real numbers and groups). Handling such mixed data is the job of data frames.
L <- c("a", "b", "c")
Df <- data.frame(X, Y, Z, L)
Inspect the structure of Df like this:
str(Df)
What would happen if you tried to make a matrix out of these same vectors instead? Try and see.
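A sketch of the comparison, using the same vectors. The name M2 is hypothetical. Because a matrix can hold only one class, cbind() converts everything to text as soon as one column is text.

```r
X <- c(1, 2, 3)
Y <- c(8, 9, 7)
Z <- c(4, 2, 8)
L <- c("a", "b", "c")

Df <- data.frame(X, Y, Z, L)   # each column keeps its own class
sapply(Df, class)

M2 <- cbind(X, Y, Z, L)        # one class for the whole matrix
class(M2[, "X"])               # the numbers have become text
```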
Getting data into R
?read.csv
CSV files are a relatively trouble-free way of getting data into R.
It is a fairly common format.
You can make a CSV file in any spreadsheet software.
Create a CSV file
First name   Family name   Sex      Age
John         Travolta      Male     57
Elijah       Wood          Male     30
Nicole       Kidman        Female   44
Keira        Knightley     Female   26
Add your own favorite actor, too.
Open the file with Notepad.
Make sure the values are separated by commas.
Now use R to read it
Now read it into R.
actors <- read.csv("yourfile.csv")
str(actors)
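If you want to try this without a spreadsheet, the following sketch writes the same table to a temporary file and reads it back. The name csv_path and the temporary location are assumptions made for this illustration. In the exercise you would use the path to your own file.

```r
# Write the example table as comma-separated text to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "First name,Family name,Sex,Age",
  "John,Travolta,Male,57",
  "Elijah,Wood,Male,30",
  "Nicole,Kidman,Female,44",
  "Keira,Knightley,Female,26"
), csv_path)

actors <- read.csv(csv_path)   # the first line is used as the header
names(actors)                  # spaces in the header are replaced by dots
str(actors)
```

Note that read.csv() changes column names that are not valid R names, so “First name” becomes First.name.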
Subsetting
There are many ways of selecting only part of a data frame. Observe carefully what happens.
actors[1:2, ]
actors[, 1:2]
actors["Age"]
actors[c("First.name", "Age")]
subset(actors, Age > 40)
(read.csv turns the header “First name” into the column name First.name.)
Now create a new data frame with the actors younger than 45.
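Another common way of subsetting is to put a condition inside the square brackets. The sketch below builds the table directly in R so that it runs on its own. The names women and older are hypothetical.

```r
actors <- data.frame(
  First.name  = c("John", "Elijah", "Nicole", "Keira"),
  Family.name = c("Travolta", "Wood", "Kidman", "Knightley"),
  Sex         = c("Male", "Male", "Female", "Female"),
  Age         = c(57, 30, 44, 26)
)

# Rows are selected before the comma, columns after it.
women <- actors[actors$Sex == "Female", ]              # all columns, only some rows
older <- actors[actors$Age > 40, c("First.name", "Age")]  # some rows, two columns
print(women)
print(older)
```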
Graphics
The plot function makes graphs.
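plot(actors[c("Sex", "Age")])
In R 4.0 and later, text columns are no longer read as groups (factors) by default, so read the file with read.csv("yourfile.csv", stringsAsFactors = TRUE) for this plot to work.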
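A self-contained sketch of the same plot, assuming the Sex column is stored as a factor (R's class for groups). The data are typed in directly here, so no file is needed.

```r
actors <- data.frame(
  Sex = factor(c("Male", "Male", "Female", "Female")),
  Age = c(57, 30, 44, 26)
)

# With a factor first and numbers second, plot() draws one box per group.
plot(actors[c("Sex", "Age")])

# With numbers only, plot() draws points; type = "b" adds connecting lines.
plot(actors$Age, type = "b", xlab = "Row", ylab = "Age")
```

The kind of graph that plot() produces depends on the class of what you give it, which is why the factor matters here.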
Summary
You now know about:
Variables
Functions
Vectors
Matrices
Data frames
Getting tabular data into R
Subsetting
Simple plotting
Time for your first fight...