R Programming for Music Informatics
Donald Byrd
rev. 21 March 2008
Copyright © 2006-08, Donald Byrd
30 Jan. 08 2
Introducing R
• R is very interactive: instead of programming, can use as powerful graphing calculator
  – => easier to experiment with & learn, & useful that way
• R was originally designed for statistics
• Why R?
  – easy to do simple things with it
  – easy to do many fairly complex things, incl. graphs & handling audio files
    • probably not good for really complex programs
  – free, & available for all popular operating systems
  – very interactive => easy to experiment
  – has good documentation
  – In use in other Music Informatics classes, & standardizing is good
30 Jan. 08 3
Getting started with R
• To get R
  – Web site: http://cran.us.r-project.org/
  – Has lots of documentation (tutorials, manuals, etc.), too… though most isn’t for beginners
  – Versions for Linux, Mac OS X, Windows
  – On all(?) STC computers
• Tutorial:
  – http://www.informatics.indiana.edu/donbyrd/Teach/RTools+Docs//R_tutorial_DAB.txt
• Can use R interactively as a powerful graphing, musicing, etc. calculator
  – …but it’s not perfect: sometimes very cryptic
27 Nov. 07 4
Programming in General (1)
• Details are often vital (& errors are costly)
  – A great many details really are. Commonly:
    • Quote marks, including single vs. double
    • Capitalization
      – “Wav” & “wav” are different
  – TIP: “steal” as much as possible!
    • Via Copy & Paste is ideal: avoids typos
• Programs tend to be very hard to understand
  – TIP: include useful, readable comments
  – TIP: choose variable names for clarity
    • “wavdata” isn’t good; how about “samples”?
  – TIP: consistency helps clarity and correctness
    • Don’t mix “v = expr”, “v <- expr”, and “expr -> v”
    • Use the same variable name for something in every prog.
• Program defensively
rev. 3 Apr. 07 5
Programming in General (2)
• Comments
  – Classic example of a bad comment
    • x <- x+1 # add 1 to x
  – Doesn’t explain anything!
• Good commenting style (thanks to Ed Wolf)
# Using the Add Sines Demo, create and play a wave at G3,
# then do the same for a wave at 5/4 this frequency. Finally,
# normalize the sum of the two waves and listen to result.
…
# create and play first sound wave
sndW <- sine(f, duration=secs, samp.rate=sr, bit=16, xunit="time")
play(sndW)
…
3 April 2007 6
Programming in General (3)
• Block comments (w/ overall description) more important than comments on single stmts
• Ideal: say just the right things: not too much or too little
  – Basic principle of all human communication
  – …including this slide show & music notations (CMN, tablature, etc.)
  – …and comments in a program
• Other aspects of formatting & style
  – Variable names
    • Choose variable names for clarity
    • camelCase is helpful
  – Space around operators
    • “v <- f(expr)”, not “v<-f(expr)”
rev. 30 Jan. 08 7
Programming in R (1)
• R offers to save workspace when you quit
  – Are you sure it’s what you want?
  – TIP: Just say no.
    • Can restore original with ‘load(".RData")’ or menu command
  – TIP: Use a text editor & files to save work
    • If real text editor (not word processor) file, can run with R “source” command
    • Regardless, can Copy & Paste, even just part of file
• setwd() to correct path for your computer
  – Depends on where you have files
  – Can be tricky, esp. in Windows
    • Typical Windows ex.: setwd("C:/Documents and Settings/donbyrd.ADS/Teaching/N560")
    • On Mac (& Windows?), can use "~/Teaching/N560"
    • …or drag & drop
    • …or use R GUI “Change Working Directory” menu command!
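A minimal sketch of the "text editor & files" workflow, for illustration only. The directory (tempdir()) and the file name myWork.R are placeholders chosen so the example runs anywhere; in practice the file would be written in a text editor and the directory would be wherever your class files live.

```r
# Hypothetical working directory and script name, for illustration only.
oldDir <- setwd(tempdir())            # setwd() returns the previous directory

# Normally you would type this file in a text editor, not create it from R.
writeLines(c("# Frequency in Hz of MIDI note number 57",
             "freqA3 <- 440 * 2^((57 - 69) / 12)"),
           "myWork.R")

source("myWork.R")                    # runs every statement in the file
print(freqA3)                         # 220

setwd(oldDir)                         # go back to where we started
```

Saving the old directory and restoring it at the end is a defensive habit, not something R requires.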
rev. 14 Jan. 08 8
Programming in R (2)
• R has many useful built-in functions
  – Many of them handle vectors (no loop needed)
    • diff(v): vector of consecutive differences
    • sum(v): sum of vector elements
  – Random numbers with various distributions: runif (uniform), rnorm (normal), etc.
  – read.table, table (and related functions)
  – fft
  – tuneR adds sine, square, noise, bind, mono, etc.
• R (and tuneR) have excellent on-line help
  – Type either ‘help(sine)’ (e.g.) or ‘?sine’
    • …but NB: sometimes need ‘help("sine")’
  – TIP: Copy & Paste from help window!
  – Caveat: terminology is statistics oriented
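A short sketch of a few of these built-ins in use. The data (noteNumV, a made-up melody in MIDI note numbers) and the 8-sample sine are assumptions for illustration; only base R functions are used, so tuneR is not needed here.

```r
# Hypothetical data: MIDI note numbers of a short melody.
noteNumV <- c(60, 62, 64, 65, 67)

print(diff(noteNumV))          # intervals between consecutive notes: 2 2 1 2
print(sum(noteNumV))           # 318
print(table(diff(noteNumV)))   # how often each interval occurs

set.seed(1)                                # make the random numbers repeatable
uniformV <- runif(5, min = 0, max = 1)     # 5 uniformly distributed numbers
normalV  <- rnorm(5, mean = 0, sd = 1)     # 5 normally distributed numbers

# fft of exactly one cycle of a sine wave in 8 samples: all the energy
# lands in element 2 of the result (and its mirror image, element 8).
sineV <- sin(2 * pi * (0:7) / 8)
print(round(Mod(fft(sineV)), 3))
```

None of these calls needs a loop: each one takes the whole vector at once.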
rev. 30 Jan. 08 9
Programming in R (3)
• Besides built-in, functions can be user-written
  – Hard for many beginners; why?
  – Probably mostly confusion about variables (including parameters & return values)
• A simple but realistic example
# Convert MIDI note number to frequency in Hertz.
MIDINum2Freq <- function(noteNum) {
  freq <- 440*2^((noteNum-69)/12)
  return(freq)
}
• Calling it
  – fr <- MIDINum2Freq(57) # Sets fr = 220
  – Inside function, parameter noteNum = 57, freq = 220; fr doesn’t exist (it’s out of scope)
  – Outside function, noteNum & freq don’t exist
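The scope rules can be checked directly. This sketch repeats the function from the slide and then asks R whether the function's variables exist afterwards; it assumes that no variables named noteNum or freq were created earlier in the session.

```r
# Convert MIDI note number to frequency in Hertz (same function as on the slide).
MIDINum2Freq <- function(noteNum) {
  freq <- 440 * 2^((noteNum - 69) / 12)
  return(freq)
}

fr <- MIDINum2Freq(57)
print(fr)                                     # 220

# The parameter and the local variable are gone once the function returns.
print(exists("noteNum", inherits = FALSE))    # FALSE
print(exists("freq", inherits = FALSE))       # FALSE

# Because the arithmetic is vectorized, the function also accepts a vector.
print(MIDINum2Freq(c(57, 69, 81)))            # 220 440 880
```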
rev. 30 Jan. 08 10
Programming in R (4)
• Introducing loops
  – Loops also hard for many beginners
  – Main reason is probably confusion re control variable
  – A very simple (though pointless) example
    • mnnV <- 1:6 # make mnnV a 6-place vector
    • mnnV # see what mnnV is before loop
    • for (n in 1:6) {
    • mnnV[n] <- n+59
    • }
    • mnnV # ...and after
  – Instead of “in 1:6”, can use any vector!
  – n (control variable) still exists after the loop, holding its last value (here 6); don’t rely on it
  – C, Perl, etc. users can put the vector in the “for”
    • for (n in seq(1, 6)) { …
  – Loop is a type of control statement
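The loop from the slide, followed by two variations, as a sketch. The triad of note numbers and the variable name mnnNoLoopV are made up for illustration.

```r
mnnV <- 1:6
for (n in 1:6) {
  mnnV[n] <- n + 59
}
print(mnnV)          # 60 61 62 63 64 65

# "in" accepts any vector: here, a C major triad as MIDI note numbers.
for (noteNum in c(60, 64, 67)) {
  cat("note", noteNum, "=", round(440 * 2^((noteNum - 69) / 12), 1), "Hz\n")
}

# The first loop really is pointless: vector arithmetic gives the same result.
mnnNoLoopV <- (1:6) + 59
print(identical(as.numeric(mnnV), as.numeric(mnnNoLoopV)))   # TRUE
```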
27 Nov. 07 11
Software Engineering & Debugging (1)
• Experience: all complex programs have bugs
  – Judge in Florida e-voting case: claim that voting machine software was buggy is speculation
  – True, but… !
    • Disclaimer: I don’t know any hard evidence
• Expect bugs & program defensively
• True stories
  – The program that failed only on Wednesdays! Why?
    • Hint: “Wednesday” has 9 characters
  – Weeks of debugging to find a “1” that should have been “i”
27 Nov. 07 12
Software Engineering & Debugging (2)
• Good engineering (design, coding, comments, etc.) => less debugging & more robust (reliable & flexible) programs
• Don’t underengineer
• …but don’t overengineer, either!
• Underengineering is much bigger danger for inexperienced programmers
• Main factors
  – Complexity of problem
  – Is program or code it includes likely to be used for very long?
    • If so, how expert are future programmers likely to be?
27 Nov. 07 13
Software Engineering & Debugging (3)
• Standard technique: zero in on problem code
• Debug on short/simple cases, not long/complex ones
  – Makes it practical to look at results of several print statements
  – Reduces or eliminates long delays to see results
  – “short/simple” often means simply not much data
  – Can easily reduce days of debugging to hours
• Usually easy to do by turning lots of data into a little data
  – Real situation: nThemes <- 3500, or 20 sec. audio file
  – For testing: use nThemes <- 4 (say), or 1 sec. audio
  – Caveat! the “little” data may not show the bug
  – …and if bug results from a design problem, fixing it may be very hard
31 Jan. 08 14
Debugging in R (1)
• Basic technique: zero in on bug with print or cat
  – E.g., before & after doing something questionable
    • print(c("max before scaling=", max(notesW@left)))
    • notesW <- notesW*2.5
    • cat("max after scaling=", max(notesW@left), "\n")
  – cat merges its arguments, gets rid of the extra parens
  – …but doesn’t end the line => do it yourself with "\n"
  – If you use “source” (& inside loops?), just naming variable doesn’t work; must use print or cat
• A variation: use plot instead of print/cat
  – The right picture is worth 10,000 words; the wrong one, zero (cf. Tufte on the Challenger disaster)
  – …but the right picture for debugging is often simple & obvious
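A sketch of the print/cat technique that runs without tuneR. The vector sampleV is a made-up stand-in for one channel of a Wave (what notesW@left would hold).

```r
# Hypothetical sample values standing in for one channel of a Wave.
sampleV <- c(0.1, -0.4, 0.25)

print(c("max before scaling=", max(sampleV)))    # prints a character vector, quotes and all
sampleV <- sampleV * 2.5
cat("max after scaling=", max(sampleV), "\n")    # one clean line; "\n" ends it

# Inside a loop, just naming a variable shows nothing,
# so call print or cat explicitly.
for (i in 1:2) {
  sampleV[i]          # evaluated, but nothing is displayed
  print(sampleV[i])   # displayed
}
```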
8 Sept. 07 15
Debugging in R (2)
• More advanced technique: use a good debugger
  – Allows setting breakpoints, looking at variables, etc., while program is running
  – Especially helpful w/ complex programs
  – …or learning a new language
  – To some extent, R’s interactivity accomplishes same thing
• R has a debugger
  – One student (an experienced programmer) tried & liked it! Anyone else?
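The slide does not say which debugger functions were used in class. The sketch below assumes base R's debug(), undebug(), and isdebugged(); the function scaleSamples is hypothetical. It only switches the debugger on and off, because actually stepping through code needs an interactive session.

```r
# Hypothetical function with a suspected bug.
scaleSamples <- function(sampleV, factor) {
  scaledV <- sampleV * factor
  return(scaledV)
}

debug(scaleSamples)                 # flag the function for single-stepping
print(isdebugged(scaleSamples))     # TRUE

# In an interactive session, calling scaleSamples(c(0.1, 0.2), 2.5) now
# would stop at the first statement and show a browser prompt. There,
# "n" runs the next statement, typing a variable's name shows its value,
# "c" continues to the end, and "Q" quits.

undebug(scaleSamples)               # turn single-stepping off again
print(isdebugged(scaleSamples))     # FALSE
```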
1 Feb. 08 16
Dangers of R (1)
• More danger of nasty bugs in R than many programming languages & environments
  – No explicit types => can’t warn of questionable usage
  – No variable declarations => catches fewer typos (only a problem in old versions of R?)
  – Both above like Perl (e.g.), but Java (e.g.) is great on both => Java programmers likely to be careless!
• Defensive programming
  – E.g., add “sanity checks” as you work, use conventions for variable names, etc.
  – Always important: a subtle bug can waste a huge amount of time and/or money
    • Ex: weeks of debugging to find a “1” that should have been “i”
    • Ex: period instead of comma => missile had to be destroyed
  – …but especially in dangerous environments like R
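One possible form a "sanity check" could take, as a sketch. The function normalizeSamples and its error message are hypothetical; stopifnot(), stop(), and try() are base R.

```r
# Hypothetical helper: scale samples so the loudest one has magnitude 1.
normalizeSamples <- function(sampleV) {
  # Sanity checks: fail early, with a message, rather than produce garbage later.
  stopifnot(is.numeric(sampleV), length(sampleV) > 0)
  peak <- max(abs(sampleV))
  if (peak == 0) {
    stop("normalizeSamples: all samples are zero, cannot normalize")
  }
  return(sampleV / peak)
}

print(normalizeSamples(c(0.5, -2, 1)))     # 0.25 -1.00 0.50

# A wrong argument type is caught at once; try() lets the script carry on.
result <- try(normalizeSamples("wavdata"), silent = TRUE)
print(inherits(result, "try-error"))       # TRUE
```

Since R has no type declarations, checks like these are the main way to get the warning a stricter language would give for free.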
rev. 23 Feb. 08 17
Dangers of R (2)
• “Gotchas” in R (all from real life)
  – Surprising operator precedence, esp. in “for” statement
    • In sets, need parentheses to get addition before “:”
    • E.g., say “start:(start+5)”, not “start:start+5” !
  – “;” is usually ignored, but not always
  – Line break sometimes starts a new statement, but not always
    • cf. “LineBreaksInRStatements.r” example
  – Referring to a column of a table different ways gives same data but can behave very differently
    • “noteTbl$Cum.time” & “noteTbl[,1]” are vectors of integers; “noteTbl[1]” is a list
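Three of these gotchas can be reproduced in a few lines. The table contents and the names totalA and totalB are made up for illustration; the column name Cum.time follows the slide.

```r
# 1. Operator precedence: ":" binds tighter than "+".
start <- 3
print(start:start + 5)       # (3:3) + 5, i.e. 8
print(start:(start + 5))     # 3 4 5 6 7 8

# 2. Line breaks: a line that is already a complete statement ends there.
totalA <- 1 +
  2            # operator at end of line, so the statement continues: totalA is 3
totalB <- 1
+ 2            # previous line was complete; this is a separate expression
print(c(totalA, totalB))     # 3 1

# 3. Three ways to refer to a column of a (hypothetical) note table.
noteTbl <- data.frame(Cum.time = c(0L, 480L, 960L), Pitch = c(60L, 62L, 64L))
print(is.list(noteTbl$Cum.time))   # FALSE: a plain integer vector
print(is.list(noteTbl[, 1]))       # FALSE: also a plain integer vector
print(is.list(noteTbl[1]))         # TRUE: a one-column data frame, which is a list
```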
rev. 24 Feb. 08 18
Dangers of R & tuneR
• Other real-life examples from Don’s classes
  – Undeclared variable: “allNotes” vs. “allnotes” (only a problem in old versions of R?)
  – Call a function that returns a value but ignore the value
• Danger much worse because R & tuneR often give lousy feedback for errors or likely errors
  – tuneR square & sawtooth functions fail w/o error message if frequency isn’t an integer, and the manual doesn’t say it has to be an integer!
  – Exception: tuneR play w/ unnormalized values => very helpful error message
  – Nonexistent named params. sometimes give error, not always
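A sketch of the first two mistakes, using base R only. It assumes no variable named allnotes (all lower case) was created earlier in the session.

```r
allNotes <- c(67, 60, 64)

# Capitalization matters: "allnotes" is a different name, and here it doesn't exist.
print(exists("allnotes", inherits = FALSE))   # FALSE

# sort() returns a sorted copy; it does not change its argument.
sort(allNotes)                 # the sorted copy is not stored anywhere
print(allNotes)                # still 67 60 64
allNotes <- sort(allNotes)     # keep the value by assigning it
print(allNotes)                # 60 64 67
```

R gives no warning when a returned value is dropped, so this mistake is easy to miss.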
rev. 30 Jan. 08 19
Programming in R with tuneR
• On OS X (and Linux): play() problem
  – Must say what program to use to play Waves
    • Either setWavPlayer once, or add 2nd param. to each play()
  – OS X can use QuickTime Player
    • It’s on every OS X machine, & it works, but…
      – Usually gives scary error messages; must hit the escape key to get R to continue; leaves open more & more QuickTime Players. A serious nuisance.
  – OS X alternative: playRWave
    • Works fine, but…
      – Not pre-installed; you must get & install it
    • Available (with instructions) at:
      – http://www.informatics.indiana.edu/donbyrd/Teach/Rtools+Docs/
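A sketch of the two ways to name the player. It assumes the tuneR package is installed. The player path is a hypothetical example, not one of the programs named on the slide, and must be replaced by a player that exists on your machine. The normalize() call is an added precaution so that the 16-bit Wave holds values in the 16-bit range before it is played.

```r
# Assumes tuneR is installed. playerPath is a hypothetical example:
# change it to a program that can play .wav files on your machine.
library(tuneR)

playerPath <- "/usr/bin/afplay"

sndW <- sine(440, duration = 1, samp.rate = 44100, bit = 16, xunit = "time")
sndW <- normalize(sndW, unit = "16")    # scale samples to the 16-bit range

if (interactive() && file.exists(playerPath)) {
  # Option 1: set the player once per session, then call play() as usual.
  setWavPlayer(playerPath)
  play(sndW)

  # Option 2: name the player as the 2nd parameter of each play() call.
  play(sndW, playerPath)
}
```

Nothing is played unless the session is interactive and the player file exists, so the script does no harm on a machine without that player.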
rev. 20 Mar. 08 20
Don’s Coding Conventions (1)
• Chris Raphael’s & Don Byrd’s styles are very different
  – Partly a matter of taste, partly reflects goals
  – Consistency & readability are important
  – Consistency helps clarity and correctness
  – But flexibility is important too; these are guidelines only!
• Variable names
  – General: long enough to be clear, but no longer
  – Ex.: use “nNotes”, not “noteCount” or just “notes”
  – “Hungarian notation”: suffix “V” = vector, “W” = wave
  – Common examples: nNotes, sampleV, noteW, sr
• Operators
  – Always use “<-” for assignment
    • Reason: with “=” for named parameters and “==” for tests, using “=” for assignment is too confusing
rev. 4 Feb. 08 21
Don’s Coding Conventions (2)
• Use of whitespace
  – Put space before & after assignment operators
  – Separate parens & curly braces from adjacent things with space
  – Put several spaces before, at least one after “#”
• Program organization
  1. Initial stuff (libraries, etc.), setting “parameters” likely to change
  2. Definitions of functions
  3. Main program (calls the functions, if there are any)
• Specific to audio: creating simple waveforms
  – When possible, use tuneR sine function
  – Create samples directly only when tuneR sine isn’t flexible enough (for glissandi, vibrato, other waveforms, etc.)
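A sketch that follows the three-part organization and creates samples directly for a glissando, the kind of case where a fixed-frequency sine function isn't enough. The function makeGlissV and all parameter values are made up for illustration; only base R is used.

```r
# 1. Initial stuff: "parameters" likely to change
sr        <- 8000      # sampling rate, samples per sec.
secs      <- 0.5       # duration of the glissando
startFreq <- 220       # Hz
endFreq   <- 440       # Hz

# 2. Definitions of functions

# Return samples of a sine wave whose frequency slides linearly from
# startFreq to endFreq. The phase is the running sum of the per-sample
# frequency, which keeps the waveform continuous as the pitch changes.
makeGlissV <- function(startFreq, endFreq, secs, sr) {
  nSamples <- round(secs * sr)
  freqV    <- seq(startFreq, endFreq, length.out = nSamples)
  phaseV   <- 2 * pi * cumsum(freqV) / sr
  return(sin(phaseV))
}

# 3. Main program
sampleV <- makeGlissV(startFreq, endFreq, secs, sr)
print(length(sampleV))     # 4000
```

To hear the result with tuneR, sampleV would still have to be wrapped in a Wave object and normalized; that step is left out here so the sketch runs without the package.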