Introduction to Text Analysis
MLA Annual Convention
Getting Started in the Digital Humanities
January 9, 2014
Lauren F. Klein
Georgia Institute of Technology
[email protected]
@laurenfklein
Introduction to Text Analysis
• What is text analysis?
• Why should you use it?
• How do you use it?
– Examples
– Tools
What is Text Analysis?
According to Geoffrey Rockwell:
• “Text analysis systems can search large texts quickly. They do this by preparing electronic indexes to the text so that the computer does not have to read through the entire text. When finding words can be done so quickly that it is "interactive", it changes how you can work with the text - you can serendipitously explore without being frustrated by the slowness of the search process.
• “Text analysis systems can conduct complex searches. Text analysis systems will often allow you to search for lists of words or for complex patterns of words. For example you can search for the co-occurrence of two words.
• “Text analysis systems can present the results in ways that suit the study of texts. Text analysis systems can display the results in a number of ways; for example, a Keyword In Context display shows you all the occurrences of the found word with one line of context.”
http://tada.mcmaster.ca/Main/WhatTA
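The three capabilities Rockwell describes (a prepared index, a co-occurrence search, and a Keyword In Context display) can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular text analysis system is built: the function names, the sample sentence, and the `width` and `window` parameters are all assumptions made for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(tokens):
    """Map each word to the list of token positions where it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(tokens):
        index[word].append(position)
    return index

def kwic(tokens, index, keyword, width=3):
    """Return one line of context for each occurrence of the keyword."""
    lines = []
    for pos in index.get(keyword, []):
        left = " ".join(tokens[max(0, pos - width):pos])
        right = " ".join(tokens[pos + 1:pos + 1 + width])
        lines.append(f"{left} [{keyword}] {right}".strip())
    return lines

def cooccurrences(index, word_a, word_b, window=5):
    """Return position pairs where the two words fall within `window` tokens."""
    return [(a, b)
            for a in index.get(word_a, [])
            for b in index.get(word_b, [])
            if a != b and abs(a - b) <= window]

# Hypothetical sample text, invented for this sketch.
sample = ("The whale surfaced near the ship. "
          "The ship turned, and the whale dove beneath the ship.")
tokens = tokenize(sample)
index = build_index(tokens)   # built once, then reused for every query

for line in kwic(tokens, index, "whale"):
    print(line)
print(cooccurrences(index, "whale", "ship"))
```

Because the index is built once, each later lookup goes straight to the stored positions instead of rereading the text, which is what makes the "interactive" exploration Rockwell mentions feasible on large texts.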
http://www.wordle.net
Mark Hansen and Ben Rubin, Movable Type
Why Use Text Analysis?
Geoff Rockwell, again:
• “Text analysis tools aide the interpreter asking questions of electronic texts.”
• “Text analysis practices encourage reflection on the questions asked and formalization of queries.”
• “Text analysis is a way of targeting rereading that tests intuitions.”
Ted Underwood:
• “Proving a literary thesis with statistical analysis is often like cracking a nut with a jackhammer. You can do it: but the results are not necessarily better than you would get by hand.”
What I think (in the spirit of Movable Type):
• Text analysis as “a way to tell a new story.”
How to Use Text Analysis?
Ben Blatt, http://www.slate.com/articles/arts/culturebox/2013/11/hunger_games_catching_fire_a_textual_analysis_of_suzanne_collins_novels.html
Sarah Lohman, http://www.fourpoundsflour.com/the-gallery-data-visualization-of-a-timeline-of-taste/
Daniel, http://lkleincourses.lmc.gatech.edu/dh12/2012/02/22/the-role-of-senses-in-a-study-in-scarlet/
Ted Underwood and Jordan Sellers, http://journalofdigitalhumanities.org/1-2/the-emergence-of-literary-diction-by-ted-underwood-and-jordan-sellers/
Rob Nelson, http://dsl.richmond.edu/dispatch/
Matt Jockers, http://www.nbcnews.com/technology/data-mining-classics-makes-beautiful-science-954577
Matt Jockers, from Macroanalysis (Univ. of Illinois Press, 2013)
Lauren Klein, from “The Image of Absence” (American Literature 85.4)
Tools for Text Analysis
• Wordle
• Google Ngram Viewer
• IBM Many Eyes
• Voyant
• MONK (requires institutional access)
• MALLET
• Stanford’s Natural Language Processing Toolkit
• R
Google Ngram Viewer
https://books.google.com/ngrams
IBM Many Eyes
http://www-958.ibm.com/software/analytics/manyeyes/
Voyant Tools
http://voyant-tools.org/
MALLET
http://mallet.cs.umass.edu/
Stanford NLP Toolkit
http://nlp.stanford.edu/downloads/
R (programming language)
http://www.r-project.org/
TAPoR (Text Analysis PoRtal)
http://tapor.ca/
More Lists of Tools
• http://toolingup.stanford.edu/?page_id=367
• http://guides.library.upenn.edu/dhtextanalysis
• http://dirt.projectbamboo.org/categories/text-mining
Many Eyes Demo
http://lkle.in/1bTr2eT
Voyant Tools Demo
http://lkle.in/1e186zN