Text analytics for Google Spreadsheets using dataTXT add-on
by spaziodati
Doing text analysis inside Google Spreadsheets,
using the dataTXT add-on
food for thought
wait, wait: what’s text analysis?
turn text into data for analysis
why it’s useful
• Enterprise Business Intelligence/Data Mining, Competitive Intelligence
• E-Discovery, Records Management
• National Security/Intelligence
• Scientific discovery, especially Life Sciences
• Sentiment Analysis Tools, Listening Platforms
• Natural Language/Semantic Toolkit or Service
• Publishing
• Automated ad placement
• Search/Information Access
• Social media monitoring
#textanalysis #dataTXT#gdrive
usually you have to be a developer, but now you can do a lot directly inside Google Spreadsheets, thanks to the dataTXT add-on
http://bit.ly/dataTXT-googleSheets
but why is it useful?
turn text into data for analysis: infographics, tag clouds, mind maps, graphs, charts…
let’s start with an example…
extract useful information from a news article published on CNN:
http://edition.cnn.com/2014/09/10/world/rosetta-philae-landing-site/index.html?hpt=hp_t3
copy & paste this text into a Google Sheet…
this is just text: we call it “unstructured data”
if we select the cell, launch the dataTXT add-on, and click “Analyze text”…
… we are performing named entity extraction with the dataTXT-NEX API,
inside the Google Sheet
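For readers curious about what a call to dataTXT-NEX might look like outside the add-on, here is a minimal sketch that only builds the request URL and sends nothing. The endpoint, the parameter names (`text`, `include`, `token`) and the helper `build_nex_url` are assumptions for illustration, not taken from the slides; check the dandelion.eu documentation before relying on them.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed endpoint of the dataTXT-NEX service (not stated in the slides).
NEX_ENDPOINT = "https://api.dandelion.eu/datatxt/nex/v1"


def build_nex_url(text, token, include=("types", "categories")):
    """Return a GET URL asking for entity extraction on `text`.

    `token` and `include` are assumed parameter names.
    """
    params = {
        "text": text,
        "include": ",".join(include),
        "token": token,
    }
    # urlencode takes care of escaping spaces, commas and slashes.
    return NEX_ENDPOINT + "?" + urlencode(params)


url = build_nex_url("Rosetta will land on comet 67P", token="YOUR_TOKEN")

# Decode the query string again to check that the text survives encoding.
query = parse_qs(urlparse(url).query)
print(query["text"][0])
```

The add-on hides this step: selecting a cell and clicking “Analyze text” does the equivalent work for you.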
now, we find something else: a new sheet titled “Analysis” with
a lot of useful stuff…
TEXT -> the original content
SPOT -> the label of an “entity”, taken from the original text
CONFIDENCE -> a quality score for the match
WIKIPEDIA URL -> the URL of the entity on Wikipedia
TYPES -> the type of the entity, extracted from DBpedia
CATEGORIES -> extracted from DBpedia, useful as tags
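The columns above can be pictured as a flattening of one entity annotation per row. The sketch below assumes a JSON response with an `annotations` list whose items carry `spot`, `confidence`, `uri`, `types` and `categories`; these field names, the helper `to_rows` and every value in the sample are made up for illustration and are not real API output.

```python
import json

# Column order of the "Analysis" sheet, as listed in the slides.
COLUMNS = ["TEXT", "SPOT", "CONFIDENCE", "WIKIPEDIA URL", "TYPES", "CATEGORIES"]

# Made-up response for illustration only: field names and values are assumptions.
SAMPLE_RESPONSE = json.dumps({
    "annotations": [{
        "spot": "67P/Churyumov-Gerasimenko",
        "confidence": 0.9,
        "uri": "http://en.wikipedia.org/wiki/67P/Churyumov-Gerasimenko",
        "types": ["http://dbpedia.org/ontology/CelestialBody"],
        "categories": ["Comets"],
    }]
})


def to_rows(text, response_json):
    """Turn each annotation into one row matching COLUMNS."""
    data = json.loads(response_json)
    rows = []
    for ann in data.get("annotations", []):
        rows.append([
            text,                                 # TEXT
            ann.get("spot", ""),                  # SPOT
            ann.get("confidence", ""),            # CONFIDENCE
            ann.get("uri", ""),                   # WIKIPEDIA URL
            ", ".join(ann.get("types", [])),      # TYPES
            ", ".join(ann.get("categories", [])), # CATEGORIES
        ])
    return rows


rows = to_rows("Rosetta will land on 67P/Churyumov-Gerasimenko", SAMPLE_RESPONSE)
for row in rows:
    print(dict(zip(COLUMNS, row)))
```

Joining multiple types and categories into one comma-separated cell is a choice made here to keep one row per entity; the add-on may lay them out differently.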
so why is it useful?
before: it’s only text
now it’s contextual data
in other words:
the text “67P/Churyumov-Gerasimenko” now has some structured details, like
“categories”: a very useful sort of tag set
you can do a lot of things with the dataTXT add-on for Google Sheets
make a tag cloud using concept labels (typed concepts)
extract the people cited across a lot of content
build graphs/charts from the types found inside the content
extract data from a lot of tweets (useful for social media
consultants, even with not-so-big data)
find useful keywords to enrich your content (a better SEO?)
enrich your content with useful links to contextual Wikipedia pages
and all of this without programming :) and inside your own Google Spreadsheet!
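No programming is needed inside the sheet, but if you do export the “Analysis” sheet, the aggregation behind a tag cloud or a types chart is plain counting. The sketch below assumes each exported row is a dict with a comma-separated `CATEGORIES` cell; the helper `tag_weights` and the two sample rows are hypothetical.

```python
from collections import Counter


def tag_weights(rows, column="CATEGORIES", sep=","):
    """Count how often each tag appears in `column` across all rows."""
    counts = Counter()
    for row in rows:
        for tag in row.get(column, "").split(sep):
            tag = tag.strip()
            if tag:  # skip empty cells and stray separators
                counts[tag] += 1
    return counts


# Hypothetical rows, as if exported from the "Analysis" sheet.
rows = [
    {"SPOT": "Rosetta", "CATEGORIES": "European Space Agency, Space probes"},
    {"SPOT": "Philae", "CATEGORIES": "Space probes"},
]

weights = tag_weights(rows)
print(weights.most_common(1))  # [('Space probes', 2)]
```

The same function works for a chart of entity types by passing `column="TYPES"`; the counts become the font sizes of a tag cloud or the bar heights of a chart.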
democratizing text analytics!
and if you are a smart guy, or a data journalist for example,
you can do something better…
use your Google Spreadsheet as a little database, to build smart interactive web pages
Google Spreadsheet (unstructured data) + dataTXT -> Google Spreadsheet (structured data)
and don’t forget: you are using some data taken from the Linked Open Data Cloud without knowing anything
about it!
How to install the dataTXT add-on for Google Sheets
inside a Google Sheet, look for “dataTXT” in the add-on store…
or use this link:
http://bit.ly/dataTXT-googleSheets
there is a tutorial on dandelion.eu on how to set it up
http://bit.ly/howto-dataTXT-on-google-sheet
Unleash your creativity, give it a try!
http://bit.ly/dataTXT-googleSheets
@SpazioDati