Real-Time Social Analytics
Ian Hersey
CTO and EVP, Products
The Business Challenge
The BIG DATA wave
Driven by conversations on the Internet, Social Media, Mobile Apps
• 300 Million: Tweets per day
• 250 Billion: Emails per day
• 800 Million: Facebook users
• 126 Million: Blogs
• 1.97 Billion: Internet users worldwide
• 5 Trillion: SMS messages annually
• Millions of CRM Records
• 100s of Millions of Survey Verbatims
Social Media: “Human Sensors”
• News, firsthand, secondhand, thirdhand…
• Natural disasters
• Military movements
• Networks, velocity, acceleration
• Opinions
• Products
• Services
• Popular culture (TV, film, music)
• Politics
• Conversations, Comments, Recommendations
• Can sometimes predict (or explain) outcomes
Predictive Power
• Don’t take it to Vegas…
• 90+% success rate if data volumes are sufficient
• Successful business uses involve not just prediction, but engagement
• Product feedback
• Direct customer service
• Analysis of marketing campaign effectiveness (TV, film, music)
• Political outreach/mobilization
• The science (or art) is still in its infancy
• Equally or more important are the “whys” behind the predictions/outcomes
GOP Florida – Newt Gingrich
Selected Newt Gingrich topics were discussed at greater length throughout the day. For example,
his being sued by a member of the band Survivor for using the song “Eye Of The Tiger” since 2009 captured the
imagination of the Twittersphere …
Another topic had mileage throughout the day, particularly around mid-afternoon: a Ron Paul supporter
was somewhat roughed up at a Gingrich rally. Later in the day, Ron Paul’s team
demanded an apology…
There was a sustained campaign to drum up support for Newt Gingrich
GOP Florida – Mitt Romney
One of the key criticisms took the form of jabs at Romney’s wealth, in the sense that money and privilege can win you
leadership…
The most consistent theme throughout the day was Romney being a Populist.
Some Major Technical Challenges
• Data scale and rates
• NLP – no “one size fits all” technology
• Multi-channel content acquisition, coverage and quality
• Domain and customer specificity in the metadata
• Combining structured and full-text queries
• Operation by non-linguists
Data Scale and Rates
• Experience with Hadoop, HBase and Solr
• Biggest issues
• “Enterprise friendliness”
• Cannot support low-latency processing
• No current commercial offerings with both SQL and full-text front ends
• “Real-time” analysis scenario
• Match a tweet according to an initial filter
• Do further analysis to determine whether it is “actionable” vs “just a mention”
• Figure out who to route it to with what kind of priority
• All within a handful of seconds from the time it was tweeted
• 2500 times or more per second
• Required development of real-time ingestion and orchestration framework
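The “real-time” scenario above (filter, decide whether the tweet is actionable, route with a priority, all inside a latency budget) can be sketched in a few lines. This is only an illustration under stated assumptions, not the framework the slide refers to: the substring filter, the `ACTION_CUES` list, the queue names, the priority values, and the 5-second budget are all hypothetical stand-ins for real NLP and routing logic.

```python
import time
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical cue words; a real system would use NLP, not substring checks.
ACTION_CUES = ("refund", "cancel", "broken", "help")
# Assumed reading of "a handful of seconds".
LATENCY_BUDGET_SECONDS = 5.0


@dataclass
class Tweet:
    text: str
    posted_at: float  # Unix timestamp of when it was tweeted


@dataclass
class Routing:
    queue: str           # hypothetical destination name
    priority: int        # 1 = most urgent
    within_budget: bool  # was the decision made inside the latency budget?


def matches_filter(tweet: Tweet, keywords: Tuple[str, ...]) -> bool:
    """Step 1: match the tweet against an initial keyword filter."""
    text = tweet.text.lower()
    return any(keyword in text for keyword in keywords)


def is_actionable(tweet: Tweet) -> bool:
    """Step 2: decide "actionable" vs. "just a mention" (toy heuristic)."""
    text = tweet.text.lower()
    return any(cue in text for cue in ACTION_CUES)


def route(tweet: Tweet, keywords: Tuple[str, ...],
          now: Optional[float] = None) -> Optional[Routing]:
    """Step 3: pick a queue and priority, and record latency compliance."""
    if not matches_filter(tweet, keywords):
        return None  # the tweet never enters the pipeline
    current = time.time() if now is None else now
    within_budget = (current - tweet.posted_at) <= LATENCY_BUDGET_SECONDS
    if is_actionable(tweet):
        return Routing("customer_service", 1, within_budget)
    return Routing("monitoring", 3, within_budget)
```

At 2,500 or more tweets per second, each call to `route` would have well under a millisecond of average CPU time on a single worker, which is one way to see why a dedicated ingestion and orchestration layer is needed rather than batch processing.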
Real-Time Processing Flow
Firehose
Pipeline
Command Center
Custom Apps
Analyze
Respond
Harvest & Harmonize 150+ Million Sources
Sensemaking & Annotation
Real-Time Content Aggregation
• Direct API Access
• Scrapers, Crawlers, RSS Collectors
• Aggregators and Syndicators
• Structured and Unstructured
Social Analytics Application Stack
Natural Language Processing “Reads” Every Communication
I bought an iPad2 for my mom last week. She loves the weight, but
doesn’t like the color. She wishes it came in blue. She says if it came in
blue, then she’d buy one for all her friends.
Entities (brands, people, locations, times, products…)
Events and relationships (purchasing event, my mom…)
Sentiment
Suggestions
Intent (to purchase, to leave)
I:have:mom
I:buy[past]:Product.apple_product.iPad2
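The two extracted facts above appear to be colon-separated subject:predicate:object triples, with an optional tense in square brackets after the predicate. The parser below is a sketch under that assumed reading of the notation; `Fact` and `parse_fact` are hypothetical names for illustration, not part of any Attensity API.

```python
import re
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    subject: str
    predicate: str
    tense: Optional[str]  # e.g. "past", or None when no tense is marked
    obj: str


# A predicate is a verb, optionally followed by a bracketed tense: buy[past]
_PREDICATE = re.compile(r"^(?P<verb>[^\[\]]+)(?:\[(?P<tense>[^\[\]]+)\])?$")


def parse_fact(text: str) -> Fact:
    """Parse 'subject:predicate[tense]:object' into a Fact."""
    parts = text.split(":", 2)
    if len(parts) != 3:
        raise ValueError(f"expected subject:predicate:object, got {text!r}")
    subject, predicate, obj = parts
    match = _PREDICATE.match(predicate)
    if match is None:
        raise ValueError(f"malformed predicate in {text!r}")
    return Fact(subject, match.group("verb"), match.group("tense"), obj)


purchase = parse_fact("I:buy[past]:Product.apple_product.iPad2")
kinship = parse_fact("I:have:mom")
```

Here `purchase.obj.split(".")` yields the dotted category path ending in the product name, which is the kind of structured field a downstream query could filter on alongside full text.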
Limitations of NLP
• Irony, sarcasm
• “slanguage”
• Who’s talking/tweeting?
• Agendas
• Impact (“opinions are like…”)
• Cross-/multi-language
• Single posts vs. “body of work”
Annotated Data Streams Feed Downstream Applications
Real-Time Processing Pipeline
Advanced Topic Creator
Geotagging
Language ID
Reach
Klout
Entity, Event, Sentiment Tagging
Topic Matcher
Message Tracking
Worker Libraries
All standard content (Twitter, Google+, Facebook, Forums, Blogs, Online News)
SDK and API documentation
Advanced Topic Creator https://smcc.attensity.com/ui
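One way to picture the pipeline stages listed above (Language ID, Topic Matcher, and so on) is as a chain of workers, each of which adds annotations to a message and passes it on. The sketch below assumes that design; the worker functions, the dictionary message format, and the keyword-based topic matching are hypothetical and greatly simplified.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Worker = Callable[[Message], Message]


def language_id(message: Message) -> Message:
    """Toy language ID: label pure-ASCII text as English (an assumption)."""
    annotated = dict(message)  # copy so earlier stages are not mutated
    text = str(message["text"])
    annotated["language"] = "en" if text.isascii() else "unknown"
    return annotated


def make_topic_matcher(topics: Dict[str, List[str]]) -> Worker:
    """Build a worker that tags a message with every topic whose keywords match."""
    def topic_matcher(message: Message) -> Message:
        annotated = dict(message)
        text = str(message["text"]).lower()
        annotated["topics"] = sorted(
            name for name, keywords in topics.items()
            if any(keyword in text for keyword in keywords)
        )
        return annotated
    return topic_matcher


def run_pipeline(message: Message, workers: List[Worker]) -> Message:
    """Pass the message through each worker in order."""
    for worker in workers:
        message = worker(message)
    return message


workers = [
    language_id,
    make_topic_matcher({"apple": ["ipad", "iphone"], "android": ["pixel"]}),
]
result = run_pipeline({"text": "Loving my new iPad"}, workers)
```

Because each worker takes and returns a plain message, further stages such as geotagging or sentiment tagging could be appended to the list without changing the others.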
Command Center Concepts and Overview
The Command Center is a highly branded shared experience providing a lens to real-time social media conversations
Command Center screens use a responsive design for the following resolutions
1920x1080 (Most televisions)
1024x768 (Compatible with desktop computers and tablets)
A Command Center implementation is made up of multiple Dashboards
Implementations are hosted by Attensity
Dashboards contain multiple Widgets
Widgets are configurable with lots of options
Endless combinations
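The containment described above (an implementation holds Dashboards, a Dashboard holds Widgets, a Widget carries options) can be modeled with three small types. This is a hypothetical data model for illustration only; the class names, fields, and option keys are assumptions, not the product's actual configuration schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Widget:
    kind: str                                   # e.g. "stream", "map" (made-up kinds)
    options: Dict[str, str] = field(default_factory=dict)


@dataclass
class Dashboard:
    name: str
    widgets: List[Widget] = field(default_factory=list)


@dataclass
class CommandCenter:
    brand: str
    dashboards: List[Dashboard] = field(default_factory=list)

    def widget_count(self) -> int:
        """Total number of widgets across all dashboards."""
        return sum(len(dashboard.widgets) for dashboard in self.dashboards)


center = CommandCenter(
    brand="Example Brand",
    dashboards=[
        Dashboard("Overview", [Widget("stream", {"topic": "product"}),
                               Widget("map")]),
        Dashboard("Sentiment", [Widget("trend", {"window": "24h"})]),
    ],
)
```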
Thank you