Social Sentiment Indices Powered by X-Scores

Brian Davis, Keith Cortis, Laurentiu Vasiliu, Adamantios Koumpis, Ross McDermott, Siegfried Handschuh ssix-project.eu twitter.com/SSIX_project ALLDATA 2016 22nd February 2016 bit.ly/ssix_facebook


Project Inspiration

• Studies that demonstrate the predictive power of Social Networks on Financial Markets

Research conclusion considering the statistical results:

• There is a relationship (therefore predictive power) between the content of posts on social networks and the trend of stock market indices


Project Objectives

1. Classify and score content using a framework of qualitative and quantitative parameters called X-Scores, regardless of language or data architecture

2. Provide European SMEs with a collection of easy to use tools to analyse and interpret attitudes for any given target

3. Enable SMEs to exploit sentiment characteristics to assist them in creating new products and services resulting in increased revenues


Challenges and Contributions of SSIX

1. Aims for an “open source” framework: this must be balanced against commercial partners’ Intellectual Property (IP)

2. Data acquisition, filtering, representative sampling: data ethics, sampling methodology, commoditisation of social media (Facebook closed Public Feed API by April 30th 2015, Twitter dropped Datasift)

3. Global Sentiment Access: deficit of tools for other European languages → SSIX will make a major breakthrough in mining multilingual social media streams

4. Multilingual Cost Saving: where cross-lingual opinion mining cannot be adapted, relevant snippets will be automatically translated to EN


SSIX Templates

• Provide SSIX end users an easy way to personalise and customise specific SSIX platform behaviours without requiring any development effort

• A template is made of both configurable files (e.g. XML) and software that implements a number of variables to allow personalisation for any targeted case study

• Advantage: leverages the massive amount of sentiment data produced and published on social media networks within multiple domains

• Other target domains: Government, Health, Politics
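
As a rough illustration of how a template's configurable file might drive platform behaviour, the sketch below parses a small XML template with Python's standard library. The element and attribute names (`template`, `target`, `language`, `source`, `enabled`) are assumptions made for this example only; the slides do not specify the actual SSIX template schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical template; the real SSIX schema is not given in the slides.
TEMPLATE_XML = """
<template domain="finance">
  <target>ACME Corp</target>
  <language>en</language>
  <language>de</language>
  <source enabled="true">twitter</source>
  <source enabled="false">forums</source>
</template>
"""

def load_template(xml_text):
    """Turn a template document into a plain settings dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "domain": root.get("domain"),
        "target": root.findtext("target"),
        "languages": [el.text for el in root.findall("language")],
        # Keep only the sources explicitly switched on in the template.
        "sources": [el.text for el in root.findall("source")
                    if el.get("enabled") == "true"],
    }

settings = load_template(TEMPLATE_XML)
print(settings)
```

Switching the `domain` attribute or the list of sources would then retarget the same software to another case study without code changes, which is the kind of personalisation the template idea aims at.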


Big Data Challenges in SSIX

1. Collection and handling of multiple kinds of data:

• Public data from social networking and news sources

• Linguistic Linked Data and Linking Open Data cloud datasets

• Language Resources (LRs) e.g., SentiWordNet (LR for opinion mining), EuroSentiment (marketplace for LRs and services dedicated to sentiment analysis)

2. Diversity in nature of gathered data:

• High volume

• High velocity

• High variety

SSIX Workflow


● Twitter

● Facebook

● LinkedIn

● StockTwits

● Google+

● Blogs

● Forums

● News Feeds

● RSS

● Newsletters

Multilingual NLP Pipeline

Data Management

● Bootstrap with Knowledge based IE - custom dictionaries and finite state grammars - hybrid rule/ML based approach

● Adapt shallow NLP tools for social media to other languages/translate relevant snippets to EN for mining
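
The choice between mining a snippet in its own language and translating it to EN first can be sketched as a simple routing step. Everything named here is a placeholder for illustration: `NATIVE_LANGUAGES` is an assumed set of languages with an adapted pipeline, and `translate_to_english` only tags the text rather than calling a real machine translation service.

```python
# Languages assumed (for this example only) to have an adapted native pipeline.
NATIVE_LANGUAGES = {"en", "de"}

def translate_to_english(text, lang):
    """Placeholder for a machine translation step; tags the text instead of translating."""
    return f"[{lang}->en] {text}"

def route_snippet(text, lang):
    """Return (text_to_mine, pipeline_language) for one relevant snippet."""
    if lang in NATIVE_LANGUAGES:
        # An adapted pipeline exists, so mine the snippet as-is.
        return text, lang
    # Otherwise fall back to translation and reuse the EN pipeline.
    return translate_to_english(text, lang), "en"

print(route_snippet("Guter Tag fuer ACME", "de"))
print(route_snippet("Jum tajjeb ghal ACME", "mt"))
```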


SSIX Index

SSIX X-Scores

● Raw Scores - time series data streams direct from NLP

● Statistical Scores for deeper analysis of sentiment behaviour, e.g. Volume, Polarity, Volatility, Averages, etc.

● Influence and reputation scoring

● Custom SSIX Index composed of specific X-Score streams for any topic

● SSIX Index composition using any index formula
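
To make the relationship between raw scores, statistical scores and an index more concrete, here is a minimal sketch. It assumes raw scores are polarity values in [-1, 1], uses the population standard deviation as a stand-in for "volatility", and composes the index as a weighted average of per-stream mean polarity. These are illustrative choices, not the formulas used by SSIX, and the stream names and weights are invented.

```python
from statistics import mean, pstdev

def statistical_scores(raw_scores):
    """Summarise a window of raw polarity scores (assumed to lie in [-1, 1])."""
    return {
        "volume": len(raw_scores),          # number of scored posts
        "polarity": mean(raw_scores),       # average sentiment
        "volatility": pstdev(raw_scores),   # spread of sentiment in the window
    }

def compose_index(streams, weights):
    """Weighted average of per-stream mean polarity; one possible index formula."""
    total = sum(weights[name] for name in streams)
    return sum(weights[name] * mean(scores)
               for name, scores in streams.items()) / total

# Invented example streams for a single topic.
streams = {
    "twitter": [0.5, -0.5, 1.0, 0.0],
    "news": [0.5, 0.5],
}
twitter_stats = statistical_scores(streams["twitter"])
index_value = compose_index(streams, {"twitter": 1.0, "news": 3.0})
print(twitter_stats)
print(index_value)
```

Because `compose_index` is an ordinary function, a different formula (for example, volume-weighted or volatility-adjusted) could be swapped in without touching the score streams themselves.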


What SSIX Brings

• Most services are English only; SSIX will provide multilingual opinion mining with the aim to cover the most widely used European languages

• SSIX aims to provide near real-time sentiment analysis in addition to extensive analysis on delayed time series

• SSIX will provide a fully customisable API for the SSIX Index and X-Scores, allowing SMEs to easily integrate SSIX technology into their own platforms

• SSIX will provide an interactive visual and analysis dashboard allowing users to easily understand sentiment dynamics


NLP in SSIX – Components and Technologies

Open Source Toolkits: Apache Stanbol/OpenNLP, GATE, Stanford NLP, NLTK

EU Projects Results: MONNET, TrendMiner, LIDER, OpeNER, EuroSentiment

● Knowledge engineering approach initially taken – Finite State Transducer (FST) Grammars + Custom Dictionaries and Sentiment Lexica

● Opinion Orientated Information Extraction approach using existing open source tools that are customised/customisable, such as TwitIE in GATE

● Wrap existing ML based approaches to social media sentiment analysis in NLP frameworks and retrain them, e.g., Stanford Twitter Sentiment Analyser

● Adapt existing shallow NLP tools, i.e. tokenisers and POS taggers

● Provide localised language models for the SSIX pipeline

● Provide machine translation for languages that are under-resourced
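
The dictionary and lexicon side of the knowledge engineering approach can be illustrated with a very small scorer. This is a plain-Python stand-in, not GATE, TwitIE or an FST grammar: the lexicon entries, weights and the one-token negation rule are all invented for the example.

```python
import re

# Toy sentiment lexicon; entries and weights are invented for illustration.
LEXICON = {"bullish": 1.0, "gain": 0.5, "bearish": -1.0, "loss": -0.5}
NEGATORS = {"not", "no", "never"}

def tokenise(text):
    """Lowercase word tokeniser that keeps cashtags such as $ACME together."""
    return re.findall(r"\$?[a-z]+", text.lower())

def score(text):
    """Sum lexicon weights, flipping the sign of a term preceded by a negator."""
    total = 0.0
    tokens = tokenise(text)
    for i, tok in enumerate(tokens):
        weight = LEXICON.get(tok, 0.0)
        if i > 0 and tokens[i - 1] in NEGATORS:
            weight = -weight
        total += weight
    return total

print(tokenise("Buy $ACME now"))
print(score("Bullish on $ACME"))
print(score("not bullish, big loss"))
```

A rule-based scorer like this is cheap to bootstrap for a new language, since only the lexicon and negator list need localising; its per-post outputs are the kind of raw scores that a retrained ML component could later refine.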

Use Cases - SSIX Industry Pilot Partners

1) Finance - Peracton

Sharpen the complex investment and trading decision making of Peracton’s MAARS Big Data analytics application by adding custom sentiment metrics

2) Media - 3rdPLACE

Provide deep and reliable information on the finance sector to news providers through SSIX metrics that will empower their Data Management software – 3rdEYE

3) Multilingual Analytics - Lionbridge

Get structured, comprehensive and actionable analytics on competitors, customers and prospects from multilingual sources through the analysis of specific market segments by SSIX

Summary – SSIX Overview

• SSIX Metrics: Raw Scores – time series data streams from opinion mining techniques and Statistical Scores for deeper analysis of sentiment behaviour

• Aims to provide near real-time sentiment analysis in addition to extensive analysis on time frame series with additional data sources

• Multiple custom X-Scores data streams can be used to generate a custom SSIX Index for any target

• Provide multilingual opinion mining with the aim to cover the most widely used European languages

• A fully customisable API for the X-Scores and Index will allow SMEs to easily integrate SSIX technology into their own platforms

• An interactive visual and analysis dashboard allowing users to easily understand sentiment dynamics

Thank You

@kcortis

@SSIX_project ssix-project.eu bit.ly/ssix_facebook