Using emojis as universal sentence representation for social media data
Alexis Dutot
22/05/2019
Paris NLP S3 Meetup #5
● Introduction
● DeepMoji
● Internal challenges
● Our approach: Unimoji
● Conclusion & perspectives
Introduction
Linkfluence
- Social Media Intelligence company
- Activities: software & market research
- 2 products: Radarly & Search
- 250+ employees across 6 offices
Our day-to-day work

Research:
- Read papers
- Technological watch
- Prototype new features
- Train models
- Science popularization

Production:
- Implement new features to fit into the production pipeline (near real-time inference)
- Build batch computations for AI features not computed in real time
- Enhance the processing pipeline
Our day-to-day work

[Diagram: the tools we use, production environment vs research playground: machine learning & NLP toolkits and programming languages]
Our pipeline

Language detection → NER extraction → Categorization → Opinion mining → Location & user inference

Stats:
● ~ 1,200 documents per second
● > 60 languages
● > 10 platforms (social media & web)
● > 65 models in the pipeline
Our pipeline: opinion mining

- Sentiment analysis: document-level sentiment analysis with 4 classes: positive, negative, neutral and mixed
- Emotion detection: document-level multi-emotion detection with 7 classes: anger, disgust, fear, joy, love, sadness and surprise
Opinion mining

● Initial goal: enhance the sentiment analysis algorithm that was already in the production pipeline
● Challenges:
○ Social media posts are noisy user-generated content: spelling mistakes, grammatical errors, contractions, abbreviations, specific terms, ...
○ Very few annotated corpora are available, each with few examples
○ Most of these corpora are in English and domain-specific

→ Sentiment analysis for social media data is limited by the scarcity of manually annotated data.
How can we overcome this "lack" of manually annotated data?

Use distant supervision methods to make models learn useful text representations (such as emotional content) before modeling these tasks directly:
● Use specific hashtags (#good, #bad, #angry, #fml) to automatically label large volumes of data (Mohammad, 2012)
● Use predefined sets of positive and negative emoticons or emojis for automatic data labelling (Deriu et al., 2016; Tang et al., 2014) → our previous sentiment analysis model
● Pre-train a model to predict emojis given a document, to learn a rich emotional text representation, then fine-tune it on a specific opinion mining task: DeepMoji (Felbo et al., 2017)
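As an illustration of the second approach, here is a minimal sketch of emoticon/emoji-based distant labelling; the marker sets and helper name are illustrative assumptions, not the ones used in our previous production model:

```python
# Illustrative marker sets, not the sets used in production.
POSITIVE_MARKERS = {":)", ":D", "😁", "😍", "👍"}
NEGATIVE_MARKERS = {":(", "😢", "😡", "👎"}

def distant_label(text):
    """Return a noisy sentiment label from predefined marker sets."""
    has_pos = any(m in text for m in POSITIVE_MARKERS)
    has_neg = any(m in text for m in NEGATIVE_MARKERS)
    if has_pos and not has_neg:
        return "positive"
    if has_neg and not has_pos:
        return "negative"
    return None  # ambiguous or unmarked: drop from the training set

tweets = ["This was soooo FUN !!! 😁", "Worst update ever 😡", "ok."]
labeled = [(t, distant_label(t)) for t in tweets if distant_label(t)]
# [('This was soooo FUN !!! 😁', 'positive'), ('Worst update ever 😡', 'negative')]
```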
DeepMoji
The power of emoji

DeepMoji: leverage the power of emoji to accurately encode the emotional content of texts.

"Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm" (Felbo et al., 2017)

https://deepmoji.mit.edu/
Example: "This was soooo FUN !!! 😁😁" is tokenized as [this, was, soo, fun, !!] and labeled POSITIVE for the target task, or 😁 for pre-training.

→ Build a training set of 1.2B tweets with emojis as noisy labels
→ Pre-train a model to predict an emoji probability distribution given a text
→ Fine-tune this model on a specific opinion mining task (sentiment analysis, emotion detection & sarcasm detection)
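To make the noisy-labelling step concrete, here is a minimal sketch, assuming a small illustrative emoji vocabulary (Felbo et al., 2017, keep the 64 most frequent emojis and create one training example per distinct emoji in a tweet):

```python
import re

# Illustrative subset; the real vocabulary is the 64 most frequent emojis.
EMOJI_VOCAB = ["😁", "😂", "😢", "😡", "❤"]
EMOJI_RE = re.compile("|".join(map(re.escape, EMOJI_VOCAB)))

def to_training_pairs(tweet):
    """Return one (text-without-emojis, emoji-label) pair per distinct emoji."""
    emojis = set(EMOJI_RE.findall(tweet))
    text = EMOJI_RE.sub("", tweet).strip()
    return [(text, e) for e in emojis]

print(to_training_pairs("This was soooo FUN !!! 😁😁"))
# [('This was soooo FUN !!!', '😁')]
```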
The model

● Pre-training: a 2-layer BiLSTM with attention
● Transfer learning: fine-tuning is done with the chain-thaw approach, sequentially fine-tuning one layer at a time
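A minimal Keras sketch of chain-thaw, assuming `model` is a pre-trained network whose last weighted layer is the new task-specific head; the optimizer settings and step counts are illustrative:

```python
# Chain-thaw (Felbo et al., 2017): first train the new head, then each
# pre-trained layer one at a time (first to last), then the whole network.
def chain_thaw(model, x_train, y_train, epochs_per_step=1):
    layers_with_weights = [l for l in model.layers if l.weights]
    head = layers_with_weights[-1]                 # assumed: new task head
    schedule = ([[head]]
                + [[l] for l in layers_with_weights[:-1]]
                + [layers_with_weights])
    for step in schedule:
        for l in layers_with_weights:
            l.trainable = l in step                # thaw only this step
        # Recompile so the new trainable flags take effect.
        model.compile(optimizer="adam", loss="categorical_crossentropy")
        model.fit(x_train, y_train, epochs=epochs_per_step)
```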
Advantages of DeepMoji

● SoTA on 3 opinion mining tasks (before BERT's arrival)
● A really good fit for our use case: opinion mining on social media posts
● Simple, easy-to-read Keras code for running tests and reproducing results
Internal Challenges
Challenges reminder
1. The multilingual problem: perform opinion mining in many (> 60) languages on every social media platform. DeepMoji requires manually annotated data for each target task and each language.

2. The computational problem: handle at least 1,200 documents per second without making hardware costs skyrocket. We assume a Bi-LSTM would not be an option.
Limitations & resources

Research environment:
- Hardware: 4 GTX 1080 Ti GPUs
- Frameworks: Keras + TensorFlow

Production environment:
- CPU-only production instances
- TensorFlow offers an "almost" stable Java API (ONNX and DeepLearning4J were not mature yet)
- The current processing pipeline, built on Apache Storm (JVM), does not handle batching

→ Not ideal for deep learning models
Our idea
[Diagram: three example documents, "Deep Learning is awesome !" (English), "J'adore mon nouvel iPhone" (French: "I love my new iPhone") and "Detesto el fin de la Casa de Papel…" (Spanish: "I hate the ending of La Casa de Papel…"), each fed to a Doc2Emoji model trained on its own language. Each model outputs an emoji probability distribution (e.g. 👍 0.35, ..., 😔 0.002). A single Emoji2Sentiment model, trained on English annotated corpora only, maps each distribution to POSITIVE or NEGATIVE.]
Emojis are universal across languages and are used more and more on social media platforms.
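A hypothetical sketch of how the two stages compose at inference time; the class, model names and `predict` interface are illustrative, not our production code:

```python
class Unimoji:
    def __init__(self, doc2emoji_by_lang, emoji2sentiment):
        self.doc2emoji_by_lang = doc2emoji_by_lang  # {"en": ..., "fr": ...}
        self.emoji2sentiment = emoji2sentiment      # shared, trained on EN

    def predict(self, text, lang):
        # 1) Language-specific model: text -> emoji probability distribution
        emoji_dist = self.doc2emoji_by_lang[lang].predict(text)
        # 2) Shared model: emoji distribution -> sentiment label
        return self.emoji2sentiment.predict(emoji_dist)

# unimoji = Unimoji({"en": en_d2e, "fr": fr_d2e, "es": es_d2e}, e2s)
# unimoji.predict("J'adore mon nouvel iPhone", lang="fr")  # -> POSITIVE
```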
Proof of concept
● Validating the approach: use pre-trained DeepMoji (predicting an emoji probability distribution) + an MLP (predicting sentiment from that distribution)
→ Small loss of accuracy compared to fine-tuned methods (2-5 points): acceptable
● Reproduce DeepMoji pre-training on our own English data
● Issues:
1. 1 epoch takes 12 days (too long)
2. Inference time in production: 50 ms/input (too slow)
Tackling the computational problem

At this point:
● Impossible to use an RNN architecture in production
● Need an alternative...

1. Can we replace the DeepMoji architecture with a computationally cheaper one while preserving a good emotional context representation?
2. Can this emotional context representation using emojis be used to perform multilingual opinion mining tasks?
Our approach: Unimoji
Doc2Emoji
We tried different CNN architectures. The final architecture is a combination of:
● Attentive convolutions (Yin, 2017)
● The 2-layer CNN architecture used by the SwissCheese team, winners of Task 4-A of SemEval 2016 (Deriu et al., 2016)
A simplified sketch of the resulting model follows the diagram below.
[Diagram: the Doc2Emoji architecture we used, built around a light attentive convolution layer, with separate models trained for EN, FR and ES]
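A simplified Keras sketch of such a Doc2Emoji model: a 2-layer CNN with a light attention pooling layer standing in for the attentive convolutions of Yin (2017). Vocabulary size, filter counts and the 64-emoji output are assumptions, not our production settings:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE, EMB_DIM, MAX_LEN, N_EMOJIS = 50_000, 128, 80, 64

inp = keras.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, EMB_DIM)(inp)                    # token embeddings
x = layers.Conv1D(200, 3, padding="same", activation="relu")(x)   # CNN layer 1
x = layers.Conv1D(200, 3, padding="same", activation="relu")(x)   # CNN layer 2

# Light attention pooling: score each time step, softmax over time,
# then take the attention-weighted sum of the feature vectors.
scores = layers.Dense(1)(x)                       # (batch, time, 1)
weights = layers.Softmax(axis=1)(scores)          # attention over time
pooled = layers.Lambda(
    lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([x, weights])

out = layers.Dense(N_EMOJIS, activation="softmax")(pooled)
doc2emoji = keras.Model(inp, out)
doc2emoji.compile(optimizer="adam", loss="categorical_crossentropy")
```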
Statistics:
● Dataset: 512M tweets
● Training: 44 h/epoch (vs 12 days/epoch)
● Production inference: 5 ms/input (vs 50 ms/input)
Our architecture performed almost as well as DeepMoji.
Is this representation accurate enough to solve opinion mining tasks?
[Table: top-1 and top-5 emoji prediction accuracies for the EN, FR and ES Doc2Emoji models]
Emoji2Task
Architecture: a 2-layer neural network

Comparing the quality of the learnt sentence representations: benchmarking against the DeepMoji approach.

1. Can we replace the DeepMoji architecture with a computationally cheaper one while preserving a good emotional context representation?
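A minimal Keras sketch of such an Emoji2Task head, assuming a 64-dimensional emoji probability distribution as input and the 3 sentiment classes used in training below; the hidden size is illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers

emoji2sentiment = keras.Sequential([
    keras.Input(shape=(64,)),               # emoji probability distribution
    layers.Dense(64, activation="relu"),
    layers.Dense(3, activation="softmax"),  # negative / neutral / positive
])
emoji2sentiment.compile(optimizer="adam",
                        loss="categorical_crossentropy",
                        metrics=["accuracy"])
```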
Emoji2Task
2. Can this emotional context representation using emojis be used to perform multilingual opinion mining tasks?

We trained 3 new Doc2Emoji models: French, German and Simplified Chinese.
Experiments: sentiment analysis & emotion detection.
Multilingual sentiment analysis
Training: SemEval 2016 Task 4-A dataset (3 classes: negative, positive, neutral)
Evaluation: internally annotated data in English, German, French & Chinese
Results (vs our previous algorithm):
● English: ~ +10% accuracy (90%)
● French: ~ +7% accuracy (87%)
● German: ~ +6% accuracy (81%)
● Chinese: ~ -30% accuracy (40%)
The multilingual approach improved results for all languages except Chinese.
→ The emotional context of emojis in Chinese differs from that in English.
Multilingual emotion detection
[Figure: example emoji groupings by emotion: love & sadness, anger & disgust, surprise]
Training: SemEval 2018 Task 1-Ec dataset. We kept only 7 emotions: anger, disgust, fear, joy, love, sadness and surprise (multi-label classification).
Evaluation: internally annotated data in English, German & French
Results:
● English accuracy: 85%
● French accuracy: 80%
● German accuracy: 77%
→ Good enough to validate our approach
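For this multi-label setting, the same Emoji2Task head can be given one sigmoid output per emotion with a binary cross-entropy loss, so several emotions can fire at once; again a sketch under assumed dimensions:

```python
from tensorflow import keras
from tensorflow.keras import layers

emoji2emotion = keras.Sequential([
    keras.Input(shape=(64,)),               # emoji probability distribution
    layers.Dense(64, activation="relu"),
    layers.Dense(7, activation="sigmoid"),  # one independent unit per emotion
])
emoji2emotion.compile(optimizer="adam", loss="binary_crossentropy")
```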
Validating our approach

1. Can we replace the DeepMoji architecture with a computationally cheaper one while preserving a good emotional context representation?
2. Can this emotional context representation using emojis be used to perform multilingual opinion mining tasks?*

* If the emotional context in which emojis are used is not too different from that of the language the Emoji2Task model was trained on.
Conclusion & Perspectives
So far...
● Integrated our Unimoji model for sentiment analysis and emotion detection in 6 languages: French, English, Spanish, Portuguese, German and Italian
● For Simplified Chinese, the Doc2Emoji model was fine-tuned on a Chinese sentiment analysis dataset (improving accuracy by ~20%)
● We plan to add more languages to the model...
Key takeaways
● A 10x faster Doc2Emoji architecture based on CNNs, with a small accuracy loss
● Unimoji is a modular architecture: the Doc2Emoji and Emoji2Task components can be swapped for any model
● 2 opinion mining tasks trained on the same English emoji probability distribution as the emotional representation:
○ Sentiment analysis (improving our inference accuracy)
○ Emotion detection (a new feature!)
● Doc2Emoji can be fine-tuned for any language if a reliable manually annotated dataset is available
● Such a model has limitations: emojis can carry different emotional contexts and different distributions across two languages, ...
What's next?
● Add more languages
● Continue to explore limitations
● Don't focus only on emojis
→ Explore cross-lingual models (LASER, XLM)
● New opinion mining tasks
→ Sarcasm detection, hate detection, optimism/pessimism, ...
Thank you! Questions?

LONDON: 1 Primrose Street, London EC2A 2EX (contact-uk@linkfluence.com)
DÜSSELDORF: Erkrather Straße 234b, 40233 Düsseldorf (kontakt@linkfluence.com)
SHANGHAI: Rm 510-512, 68 Changping Road, near West Suzhou Road, Shanghai (contact-asia@linkfluence.com)
SINGAPORE: Capital Tower #12-01, 168 Robinson Road, 068912 Singapore (contact-asia@linkfluence.com)
PARIS: 5, rue Choron, 75009 Paris (contact@linkfluence.com)
SAN FRANCISCO: 575 Market Street #11, San Francisco CA 94105 (contact@linkfluence.com)