purrr
DRAFT
these are not slides from a talk!
I refer to them before and during live coding while teaching STAT 545 and DSCI 523
don’t expect them to stand on their own
more material developing here:
https://jennybc.github.io/purrr-tutorial/index.html
what is purrr?
functional programming
blah blah blah
ok I admit it:
FP not actually front of mind when I use purrr
what does purrr help me do?
iterate in a data-structure-informed way
tolerate list-columns in data frames
with consistent UI across a large family of fxns
and return values that are ready for further computation
for every X
do Y
return combined results like Z
X and Z will make reference to actual R data structures
Y will be a function, possibly anonymous
like for i in 1 to n … but much higher level
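A minimal sketch of the "for every X, do Y, return Z" pattern, set next to the equivalent for loop. The list `x` is made-up data for illustration only.

```r
library(purrr)

# X: a list of numeric vectors (made-up data for illustration)
x <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# for-loop version: you manage the index and the output container yourself
out_loop <- numeric(length(x))
for (i in seq_along(x)) {
  out_loop[i] <- mean(x[[i]])
}
names(out_loop) <- names(x)

# purrr version: for every element of x (X), do mean() (Y),
# and return a double vector (Z)
out_map <- map_dbl(x, mean)
```

Both versions compute the same per-element means; the purrr call states X, Y and Z directly and leaves the bookkeeping to the function.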
iterate in a data-structure-informed way
for every GitHub username
do GET https://api.github.com/users/username
and give me HTTP responses in a list
https://jennybc.github.io/purrr-tutorial/ex03_github-api-json.html
iterate in a data-structure-informed way
for every HTTP response
extract the “name” element
and give me a character vector
https://jennybc.github.io/purrr-tutorial/ex03_github-api-json.html
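A small sketch of the "extract the name element" step. The list `gh_users` is a hypothetical stand-in for parsed API responses (real ones carry many more fields), so nothing here touches the network.

```r
library(purrr)

# Hypothetical stand-in for parsed GitHub user responses
gh_users <- list(
  list(login = "user_a", name = "Ada Example", id = 1L, location = "Vancouver"),
  list(login = "user_b", name = "Bo Example", id = 2L, location = "Toronto")
)

# A string given as .f is shorthand for "extract the element with this name";
# map_chr() asks for the result as a character vector
user_names <- map_chr(gh_users, "name")
```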
iterate in a data-structure-informed way
for every HTTP response
extract the elements "login", "name", "id", "location"
and give me a data frame
https://jennybc.github.io/purrr-tutorial/ex03_github-api-json.html
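One possible way to go from a list of records to a data frame, assuming the same kind of hypothetical `gh_users` list (made-up values) and that dplyr is installed, since `map_df()` relies on it for row binding.

```r
library(purrr)

# Hypothetical stand-in for parsed GitHub user responses
gh_users <- list(
  list(login = "user_a", name = "Ada Example", id = 1L, location = "Vancouver"),
  list(login = "user_b", name = "Bo Example", id = 2L, location = "Toronto")
)

# For every record, keep the four elements of interest with `[`,
# then let map_df() bind the pieces into one data frame (needs dplyr)
users_df <- map_df(gh_users, `[`, c("login", "name", "id", "location"))
```

This sketch assumes every record has all four elements and none of them is NULL; real API data often needs more care.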
iterate in a data-structure-informed way
for every row in a data frame
create a MIME object
and give me a list
https://jennybc.github.io/purrr-tutorial/ex20_bulk-gmail.html
iterate in a data-structure-informed way
for every MIME object
send an email
and return send status as a list
https://jennybc.github.io/purrr-tutorial/ex20_bulk-gmail.html
iterate in a data-structure-informed way
for every tuple (string, pos of substring starts, pos of substring ends)
extract the substrings
and give me a list of character vectors
https://jennybc.github.io/purrr-tutorial/ex10_trump-tweets.html
inspect
query
modify
inspect
str()
str(my_list, max.level = 1)
str(my_list[[i]], list.len = 10)
listviewer::jsonedit()
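A quick illustration of the two `str()` calls on a small nested list. `my_list` here is invented for the example; with real data the point of `max.level` and `list.len` is to keep the printout short.

```r
# Made-up nested list, standing in for something like parsed JSON
my_list <- list(
  list(login = "user_a", repos = list("r1", "r2"), meta = list(id = 1L, admin = FALSE)),
  list(login = "user_b", repos = list("r3"), meta = list(id = 2L, admin = TRUE))
)

# Top level only: how many elements, and what kind of thing is each one?
str(my_list, max.level = 1)

# Drill into one element, showing at most the first 2 components
str(my_list[[1]], list.len = 2)
```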
map(.x, .f, ...)
map(.x, .f, ...)
.x is a vector
“for every X” = for every element of .x
remember lists are vectors
remember data frames are lists
map(.x, .f, ...)
.f is a function
possibly specified with shortcuts
all shown in the worked examples
“do Y” = .f(.x[[i]], …)
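A sketch of the main ways to specify `.f`, using a made-up list `x`. It is meant as a reminder of the shortcuts, not a full catalogue.

```r
library(purrr)

# Made-up input: a list of small records
x <- list(list(n = 1, label = "one"), list(n = 2, label = "two"))

# 1. an ordinary (here anonymous) function
a <- map_dbl(x, function(el) el$n * 10)

# 2. a formula, where .x stands for the current element
b <- map_dbl(x, ~ .x$n * 10)

# 3. a string or a position, meaning "extract this element"
by_name <- map_chr(x, "label")
by_pos  <- map_chr(x, 2)

# Extra arguments after .f are passed along to it: mean(.x[[i]], na.rm = TRUE)
d <- map_dbl(list(c(1, NA, 3), c(4, 5)), mean, na.rm = TRUE)
```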
“give me a Z”
map(.x, .f, …) can be thought of as map_list(.x, .f, …)
“give me a Z”
map_lgl(.x, .f, ...)
map_chr(.x, .f, ...)
map_int(.x, .f, ...)
map_dbl(.x, .f, ...)
return an atomic vector of requested type
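A short sketch of the type-specific variants on a made-up list. The last call shows what happens when `.f` returns the wrong type: an error rather than a silent coercion.

```r
library(purrr)

# Made-up input with elements of different types
x <- list(1:3, c(2.5, 3.5), letters[1:4])

lens  <- map_int(x, length)      # integer vector
nums  <- map_lgl(x, is.numeric)  # logical vector
kinds <- map_chr(x, class)       # character vector
avgs  <- map_dbl(x[1:2], mean)   # double vector

# Asking for integers from a function that returns strings fails loudly
res <- tryCatch(map_int(x, class), error = function(e) "error")
```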
“give me a Z”
map_df(.x, .f, ..., .id = NULL)
basically: map() then dplyr::bind_rows()
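A sketch of the "map() then bind_rows()" reading, including the `.id` argument. The data frames and the helper `add_flag()` are invented for the example, and dplyr is assumed to be installed.

```r
library(purrr)
library(dplyr)

# Made-up named list of small data frames
dfs <- list(
  first  = data.frame(x = 1:2, y = c("a", "b"), stringsAsFactors = FALSE),
  second = data.frame(x = 3L, y = "c", stringsAsFactors = FALSE)
)

# Hypothetical per-piece function: add a logical column
add_flag <- function(d) transform(d, big = x > 1)

# One step: .id stores the list names in a new column called "source"
one_step <- map_df(dfs, add_flag, .id = "source")

# Two steps, spelled out
two_step <- bind_rows(map(dfs, add_flag), .id = "source")
```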
“give me a Z”
walk(.x, .f, …) can be thought of as map_nothing(.x, .f, …)
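A tiny sketch of `walk()`, which is called for its side effects. Here the side effect is just a message; the input vector is made up.

```r
library(purrr)

msgs <- c("first", "second")

# .f is run once per element for its side effect (a message here);
# walk() then returns its input invisibly, so it can sit in a pipeline
out <- walk(msgs, ~ message("processing ", .x))
```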
“for every X”
map2(.x, .y, .f, ...)
X = (element i of .x, element i of .y)
pmap(.l, .f, ...)
X = tuple of the i-th elements of the lists in .l
remember a data frame is a list!
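A sketch of iterating over two inputs with `map2()` and over the rows of a data frame with `pmap()`. The inputs are made up; the data frame works as `.l` because it is a list of columns, and its column names are matched to the arguments of `.f`.

```r
library(purrr)

# map2(): walk along two vectors in parallel
pairs <- map2_chr(c("a", "b"), c(1, 2), ~ paste0(.x, .y))

# pmap(): the data frame is a list of columns, so each "tuple" is one row;
# column names x and times are matched to the function's argument names
df <- data.frame(x = c("ab", "cd"), times = c(2L, 3L), stringsAsFactors = FALSE)
repeated <- pmap_chr(df, function(x, times) strrep(x, times))
```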
how might you be doing such things today?
maybe you don’t, because you don’t know how 😔
for loops
apply(), [slvmt]apply(), split(), by()
the plyr package: [adl][adl_]ply()
with dplyr: df %>% group_by() %>% do()
this is not my first R rodeo
I have gone through intense, evangelical phases of iterating with base “apply” functions and plyr
I highly recommend you give purrr a try
relationship to base R approaches
there’s nothing you can do with purrr that you cannot do with base
specifically: map() is basically lapply()
main reasons to use purrr:
- shortcuts facilitate anonymous functions for .f
- greater encouragement for type-safety
- consistent API across large family of functions
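A small sketch of the comparison with base R, on made-up input. The first part shows `map()` and `lapply()` agreeing; the second illustrates the type-safety point with `sapply()`, whose return type depends on the data.

```r
library(purrr)

x <- list(a = 1:3, b = 4:6)

# map() and lapply() give the same list here
same <- identical(map(x, sum), lapply(x, sum))

# With empty input, sapply() falls back to a list ...
base_empty <- sapply(list(), length)

# ... while map_int() still returns an integer vector (of length 0)
purrr_empty <- map_int(list(), length)
```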
tolerate list-columns in data frames
tidyverse lifestyle ~ work in a data frame when possible
what about stuff that can’t be stored as an atomic vector?
- stick it in a list-column
but list-columns are awful!
- get better at inspecting lists
- get better at computing on lists
use purrr::map() and friends
- probably inside dplyr::mutate()
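A minimal sketch of computing on a list-column inside `mutate()`. The tibble is made up, and dplyr and tibble are assumed to be installed.

```r
library(purrr)
library(dplyr)
library(tibble)

# Made-up data frame with a list-column: each row holds a numeric vector
df <- tibble(
  id = c("a", "b"),
  values = list(c(1, 2, 3), c(10, 20))
)

# map_*() inside mutate(): one simple summary per row,
# stored as ordinary atomic columns
df2 <- df %>%
  mutate(
    n   = map_int(values, length),
    avg = map_dbl(values, mean)
  )
```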
tolerate list-columns in data frames
tidyverse lifestyle ~ work in a data frame when possible
ok there’s a whole section I want to write here, with more worked examples on the site, etc.
but that’s not happening this round
what follows are a few hints of what I will say
every time someone asks:
how can I iterate over a list, but also access the index i or the list names at the same time?
they should probably be working inside a data frame, with a list column and a variable for i or the names
use tibble::enframe() on your vexing_list and have at it with mutate(new_var = map_*(vexing_list, f)) or map2() or pmap()
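A sketch of the `enframe()` idea. The contents of `vexing_list` and the label being built are invented; the point is that the names and the index become ordinary variables next to the list-column.

```r
library(purrr)
library(dplyr)
library(tibble)

# Made-up named list
vexing_list <- list(alpha = 1:3, beta = 4:5)

# enframe() gives one row per element: `name` holds the names,
# `value` is a list-column with the elements themselves
df <- enframe(vexing_list) %>%
  mutate(
    i     = row_number(),                                   # the index, as a variable
    label = map2_chr(name, value, ~ paste0(.x, ": ", sum(.y)))
  )
```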
Great example is Gapminder
draw on
http://r4ds.had.co.nz/many-models.html
and
STAT 545 Gapminder materials (translate from plyr and dplyr)
natural to nest at country level and put data in list-column
fit models, etc. by mutating the data list-column
extract model summaries by mutating the fits w broom fxns
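A rough sketch of the nest, fit, summarise workflow. To keep dependencies small it deliberately swaps things in: the built-in `mtcars` data grouped by `cyl` instead of Gapminder by country, base `split()` instead of a nesting function, and `coef()` instead of broom. Treat it as an outline of the pattern, not the worked example from the linked materials.

```r
library(purrr)
library(dplyr)
library(tibble)

# "Nest": one row per group, with that group's data in a list-column
by_cyl <- split(mtcars, mtcars$cyl)
nested <- tibble(cyl = names(by_cyl), data = unname(by_cyl))

# Fit a model per group by mutating the data list-column,
# then pull a simple summary out of each fit
fits <- nested %>%
  mutate(
    fit   = map(data, ~ lm(mpg ~ wt, data = .x)),
    slope = map_dbl(fit, ~ coef(.x)[["wt"]])
  )
```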
more far out example is
https://jennybc.github.io/purrr-tutorial/ex24_xml-wrangling.html
where I put XML nodesets in a data frame
each row is one row of a Google Sheet
I proceed to wrangle it on the way to get cell contents
also, just to be clear:
no one in their right mind enjoys having list-columns in a data frame
but the benefits often outweigh the costs
especially if you have the right tools and a productive mindset
it’s always a temporary state
goal is always to get back to something simpler
ok this is where things just peter out 😬
and we go back to live coding
My economic policy speech will be carried live at 12:15 P.M. Enjoy!
Join me in Fayetteville, North Carolina tomorrow evening at 6pm. Tickets now available at: https://t.co/Z80d4MYIg8
The media is going crazy. They totally distort so many things on purpose. Crimea, nuclear, "the baby" and so much more. Very dishonest!
I see where Mayor Stephanie Rawlings-Blake of Baltimore is pushing Crooked hard. Look at the job she has done in Baltimore. She is a joke!
Bernie Sanders started off strong, but with the selection of Kaine for V.P., is ending really weak. So much for a movement! TOTAL DISRESPECT
Crooked Hillary Clinton is unfit to serve as President of the U.S. Her temperament is weak and her opponents are strong. BAD JUDGEMENT!
The Cruz-Kasich pact is under great strain. This joke of a deal is falling apart, not being honored and almost dead. Very dumb!
substring(text, first, last)
match_first (one element per tweet):
[[1]] -1
[[2]] -1
[[3]] 20
[[4]] 134
[[5]] 28 95
[[6]] 87 114
[[7]] 50 112 123
match_last (one element per tweet):
[[1]] -3
[[2]] -3
[[3]] 24
[[4]] 137
[[5]] 33 98
[[6]] 90 119
[[7]] 53 115 126
https://jennybc.github.io/purrr-tutorial/ex10_trump-tweets.html
pmap(list(text = tweets, first = match_first, last = match_last), substring)
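A self-contained sketch of how `match_first` and `match_last` can be built and fed to `pmap()`. The three strings and the regular expression are made up for the example (they are not the tweets or the pattern from the tutorial); a non-match shows up as -1 and -3 and yields an empty string.

```r
library(purrr)

# Made-up strings and pattern, for illustration only
tweets <- c("no match here", "The media is going crazy", "a joke and very dumb")
regex  <- "crazy|joke|dumb"

# gregexpr() gives start positions, with match lengths as an attribute
matches      <- gregexpr(regex, tweets)
match_first  <- map(matches, as.vector)                    # drop attributes
match_length <- map(matches, attr, which = "match.length")
match_last   <- map2(match_first, match_length, ~ .x + .y - 1)

# For every tuple (text, first, last), extract the substrings
words <- pmap(
  list(text = tweets, first = match_first, last = match_last),
  substring
)
```

The names in the list passed to `pmap()` line up with the arguments of `substring(text, first, last)`, which is why no wrapper function is needed.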