Deep Learning for Natural Language Processing: Word Embeddings


  • 1

    Roelof Pieters (@graphific)

    Deep Learning for Natural Language Processing: Word Embeddings

    3 December 2015 KTH

    www.csc.kth.se/~roelof/ [email protected]


  • Language Understanding

    2

  • Can we understand Language ?

    1. Language is ambiguous: every sentence has many possible interpretations.

    2. Language is productive: we will always encounter new words or new constructions.

    3. Language is culturally specific

    Some of the challenges in Language Understanding:

    3

  • Can we understand Language ?

    1. Language is ambiguous: every sentence has many possible interpretations.

    2. Language is productive: we will always encounter new words or new constructions.

    plays well with others
    VB ADV P NN
    NN NN P DT

    fruit flies like a banana
    NN NN VB DT NN
    NN VB P DT NN
    NN NN P DT NN
    NN VB VB DT NN

    the students went to class
    DT NN VB P NN

    4

    Some of the challenges in Language Understanding:

  • Can we understand Language ?

    1. Language is ambiguous: every sentence has many possible interpretations.

    2. Language is productive: we will always encounter new words or new constructions.

    5

    Some of the challenges in Language Understanding:

  • [Karlgren 2014, NLP Sthlm Meetup]

    6


  • Can we understand Language ?

    1. Language is ambiguous: every sentence has many possible interpretations.

    2. Language is productive: we will always encounter new words or new constructions.

    3. Language is culturally specific

    Some of the challenges in Language Understanding:

    7

  • ML: Traditional Approach

    1. Gather as much LABELED data as you can get

    2. Throw some algorithms at it (mainly put in an SVM and leave it at that)

    3. If you actually tried more algorithms: pick the best

    4. Spend hours hand-engineering features / feature selection / dimensionality reduction (PCA, SVD, etc.)

    5. Repeat

    For each new problem/question:

    8

  • Machine Learning for NLP

    Data

    Classic Approach: Data is fed into a learning algorithm:

    Learning Algorithm

    9

  • Machine Learning for NLP

    some of the (many) treebank datasets

    source: http://www-nlp.stanford.edu/links/statnlp.html#Treebanks


    10


  • Penn Treebank: that's a lot of manual work:

    11

  • the students went to class
    DT NN VB P NN

    plays well with others
    VB ADV P NN
    NN NN P DT

    fruit flies like a banana
    NN NN VB DT NN
    NN VB P DT NN
    NN NN P DT NN
    NN VB VB DT NN

    With a lot of issues:

    Penn Treebank

    12

  • Machine Learning for NLP

    Data

    Learning Algorithm

    Features

    Prediction / Classifier

    train set

    test set

    13

  • Machine Learning for NLP

    Learning Algorithm

    Features

    Prediction / Classifier

    train set

    test set

    14

  • One Model rules them all? DL approaches have been successfully applied to:

    Deep Learning: Why for NLP ?

    Automatic summarization
    Coreference resolution
    Discourse analysis
    Machine translation
    Morphological segmentation
    Named entity recognition (NER)
    Natural language generation
    Natural language understanding
    Optical character recognition (OCR)
    Part-of-speech tagging
    Parsing
    Question answering
    Relationship extraction
    Sentence boundary disambiguation
    Sentiment analysis
    Speech recognition
    Speech segmentation
    Topic segmentation and recognition
    Word segmentation
    Word sense disambiguation
    Information retrieval (IR)
    Information extraction (IE)
    Speech processing

    15

  • Deep Learning: Why for NLP ?

    16

  • What is the meaning of a word? (Lexical semantics)

    What is the meaning of a sentence? ([Compositional] semantics)

    What is the meaning of a longer piece of text?(Discourse semantics)

    Semantics: Meaning

    18

  • NLP (rule-based and statistical approaches, at least) mainly treats words as atomic symbols:

    or, in vector space:

    also known as a one-hot representation.

    Its problem ?

    Word Representation

    Love Candy Store

    [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ]

    Candy [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ] AND
    Store [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 ] = 0 ! (see the sketch below)

    19
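    A minimal NumPy sketch (editor's illustration, not from the slides) of why this is a problem: any two distinct one-hot vectors are orthogonal, so the representation encodes no similarity at all between "Candy" and "Store".

```python
import numpy as np

# Toy vocabulary of 17 words; the indices are arbitrary, chosen to mirror the slide.
VOCAB_SIZE = 17
vocab = {"candy": 5, "store": 15}

def one_hot(word):
    """Return the one-hot vector for a word."""
    v = np.zeros(VOCAB_SIZE)
    v[vocab[word]] = 1.0
    return v

candy, store = one_hot("candy"), one_hot("store")
print(np.dot(candy, store))  # 0.0 -> one-hot vectors say nothing about relatedness
```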

  • Word Representation

    20

  • Structure corresponds to meaning:

    Structure and Meaning

    21

  • Semantics

    Syntax

    22

    NLP: what can we work with?

  • Language models define probability distributions over (natural language) strings or sentences

    Joint and Conditional Probability

    Language Model

    23
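    The formulas shown on this slide did not survive the transcript; for reference, the standard chain-rule factorisation that links the joint probability of a sentence to conditional word probabilities (and the n-gram approximation of it) is:

```latex
P(w_1, w_2, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})
\approx \prod_{i=1}^{n} P(w_i \mid w_{i-n+1}, \dots, w_{i-1})
```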

  • Language models define probability distributions over (natural language) strings or sentences

    Language Model

    24

  • Language models define probability distributions over (natural language) strings or sentences

    Language Model

    25

  • Word senses

    What is the meaning of words?

    Most words have many different senses: dog = animal or sausage?

    How are the meanings of different words related?

    - Specific relations between senses: animal is more general than dog.

    - Semantic fields: money is related to bank

    26

  • Word senses

    Polysemy:

    A lexeme is polysemous if it has different related senses

    bank = financial institution or building

    Homonyms:

    Two lexemes are homonyms if their senses are unrelated, but they happen to have the same spelling and pronunciation

    bank = (financial) bank or (river) bank

    27

  • Word senses: relations

    Symmetric relations:

    Synonyms: couch/sofa. Two lemmas with the same sense

    Antonyms: cold/hot, rise/fall, in/out. Two lemmas with the opposite sense

    Hierarchical relations:

    Hypernyms and hyponyms: pet/dog The hyponym (dog) is more specific than the hypernym (pet)

    Holonyms and meronyms: car/wheel. The meronym (wheel) is a part of the holonym (car)

    28

  • Distributional representations

    "You shall know a word by the company it keeps" (J. R. Firth, 1957)

    One of the most successful ideas of modern statistical NLP!

    these words represent banking

    Hard (class based) clustering models

    Soft clustering models

    29

  • Distributional hypothesis

    He filled the wampimuk, passed it around and we all drank some

    We found a little, hairy wampimuk sleeping behind the tree

    (McDonald & Ramscar 2001)

    30

  • Distributional semantics

    Landauer and Dumais (1997); Turney and Pantel (2010)

    31

  • Distributional semantics: distributional meaning as co-occurrence vector (see the sketch below):

    32
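    A small sketch (editor's illustration with a made-up toy corpus) of how such a co-occurrence vector is built: for every word, count which words appear within a fixed window around it.

```python
from collections import Counter, defaultdict

# Toy corpus; a real co-occurrence matrix is built from millions of sentences.
corpus = [
    "he poured the wine into the glass",
    "she drank a glass of wine",
    "the bank approved the loan",
]
WINDOW = 2  # symmetric context window

cooc = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - WINDOW), min(len(words), i + WINDOW + 1)):
            if j != i:
                cooc[w][words[j]] += 1

print(cooc["wine"])  # the (sparse) co-occurrence vector for "wine"
```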

  • Distributional representations

    Taking it further:

    Continuous word embeddings

    Combine vector space semantics with the prediction of probabilistic models

    Words are represented as a dense vector:

    Candy =

    33
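    A sketch of what the dense representation buys us (the numbers below are made up for illustration; real embeddings have 50-300+ dimensions): similarity between words becomes a simple geometric quantity such as cosine similarity.

```python
import numpy as np

# Illustrative, hand-picked 4-d vectors; real models learn these from data.
candy  = np.array([0.82, 0.10, -0.31, 0.44])
sweets = np.array([0.79, 0.05, -0.28, 0.50])
bank   = np.array([-0.40, 0.91, 0.12, -0.05])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(candy, sweets))  # close to 1: related words end up close together
print(cosine(candy, bank))    # low/negative: unrelated words are far apart
```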

  • Word Embeddings (Socher): Vector Space Model

    adapted from Bengio, Representation Learning and Deep Learning, July 2012, UCLA

    In a perfect world:

    34

  • Word Embeddings (Socher): Vector Space Model

    adapted from Bengio, Representation Learning and Deep Learning, July 2012, UCLA

    In a perfect world:

    the country of my birth / the place where I was born

    35

    Can theoretically (given enough units) approximate any function and fit to any kind of data

    Efficient for NLP: hidden layers can be used as word lookup tables

    Dense distributed word vectors + efficient NN training algorithms:

    Can scale to billions of words !

    Why Neural Networks for NLP?

    36

  • Representation of words as continuous vectors has a long history (Hinton et al. 1986; Rumelhart et al. 1986; Elman 1990)

    First neural network language model: NNLM (Bengio et al. 2001; Bengio et al. 2003) based on earlier ideas of distributed representations for symbols (Hinton 1986)

    How?

    37

    http://www.iro.umontreal.ca/~lisa/publications2/index.php/attachments/single/74
    http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf

  • Word Embeddings (Socher): Vector Space Model

    Figure (edited) from Bengio, Representation Learning and Deep Learning, July 2012, UCLA

    In a perfect world:

    the country of my birth / the place where I was born ?

    38

  • Compositionality

    Principle of compositionality:

    the meaning (vector) of a complex expression (sentence) is determined by:

    Gottlob Frege (1848 - 1925)

    - the meanings of its constituent expressions (words) and

    - the rules (grammar) used to combine them

    39

  • How do we handle the compositionality of language in our models?

    40

    Compositionality

  • How do we handle the compositionality of language in our models?

    Recursion: the same operator (same parameters) is applied repeatedly on different components (one concrete form is sketched below)

    41

    Compositionality
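    One concrete form of this shared operator, used in Socher-style recursive networks, composes two child vectors c1, c2 into a parent vector p with the same weights W, b at every node of the tree:

```latex
p = \tanh\!\left( W \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} + b \right),
\qquad W \in \mathbb{R}^{d \times 2d}, \; c_1, c_2, p \in \mathbb{R}^{d}
```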

  • How do we handle the compositionality of language in our models?

    Option 1: Recurrent Neural Networks (RNN)

    42

    RNN 1: Recurrent Neural Networks

  • How do we handle the compositionality of language in our models?

    Option 2: Recursive Neural Networks (also sometimes called RNN)

    43

    RNN 2: Recursive Neural Networks

  • achieved SOTA in 2011 on Language Modeling (WSJ ASR task) (Mikolov et al., INTERSPEECH 2011):

    and again at ASRU 2011:

    44

    Recurrent Neural Networks

    Comparison to other LMs shows that RNN LMs are state of the art by a large margin. Improvements increase with more training data.

    [RNN LM trained on a] single core on 400M words in a few days, with 1% absolute improvement in WER on a state-of-the-art setup

    Mikolov, T., Karafiát, M., Burget, L., Černocký, J.H., Khudanpur, S. (2011). Recurrent neural network based language model

    http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf

  • 45

    Recurrent Neural Networks

    (simple recurrent neural network for LM)

    input

    hidden layer(s)

    output layer

    + sigmoid activation function + softmax function (written out below):

    Mikolov, T., Karafiát, M., Burget, L., Černocký, J.H., Khudanpur, S. (2011). Recurrent neural network based language model

    http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf
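    Written out (in the notation commonly used for this model; the slide's own figure is not in the transcript), the simple recurrent LM computes a hidden state from the current word vector and the previous hidden state, then a softmax over the vocabulary:

```latex
s(t) = \sigma\big(U\,w(t) + W\,s(t-1)\big), \qquad
y(t) = \operatorname{softmax}\big(V\,s(t)\big),
\qquad \text{with} \quad
\sigma(z) = \frac{1}{1 + e^{-z}}, \quad
\operatorname{softmax}(z)_k = \frac{e^{z_k}}{\sum_j e^{z_j}}
```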

  • 46

    Recurrent Neural Networks

    backpropagation through time

  • 47

    Recurrent Neural Networks

    backpropagation through time

    class-based recurrent NN

    [code (Mikolov's RNNLM Toolkit) and more info: http://rnnlm.org/]


  • Recursive Neural Network for LM (Socher et al. 2011; Socher 2014)

    achieved SOTA on the new Stanford Sentiment Treebank dataset (compared against many other models):

    Recursive Neural Network

    48

    Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013). Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

    info & code: http://nlp.stanford.edu/sentiment/

    http://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf
    http://nlp.stanford.edu/sentiment/

  • Recursive Neural Tensor Network

    49

    code & info: http://www.socher.org/index.php/Main/ParsingNaturalScenesAndNaturalLanguageWithRecursiveNeuralNetworks

    Socher, R., Liu, C.C., Ng, A.Y., Manning, C.D. (2011). Parsing Natural Scenes and Natural Language with Recursive Neural Networks

    http://www.socher.org/index.php/Main/ParsingNaturalScenesAndNaturalLanguageWithRecursiveNeuralNetworks
    http://nlp.stanford.edu/pubs/SocherLinNgManning_ICML2011.pdf

  • Recursive Neural Tensor Network

    50

  • RNN (Socher et al. 2011a)

    Recursive Neural Network

    51

    Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013). Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

    info & code: http://nlp.stanford.edu/sentiment/

    http://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf
    http://nlp.stanford.edu/sentiment/

  • RNN (Socher et al. 2011a)

    Matrix-Vector RNN (MV-RNN) (Socher et al., 2012)

    Recursive Neural Network

    52

    Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013). Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

    info & code: http://nlp.stanford.edu/sentiment/

    http://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf
    http://nlp.stanford.edu/sentiment/

  • RNN (Socher et al. 2011a)

    Matrix-Vector RNN (MV-RNN) (Socher et al., 2012)

    Recursive Neural Tensor Network (RNTN) (Socher et al. 2013)

    Recursive Neural Network

    53

  • negation detection:

    Recursive Neural Network

    54

    Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013). Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

    info & code: http://nlp.stanford.edu/sentiment/

    http://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf
    http://nlp.stanford.edu/sentiment/

  • Recurrent NN for Vector Space: Parse Tree

    [parse-tree figure: NP, PP/IN, NP over the tags DT NN PRP$ NN]

    55

  • Recurrent NN for Vector Space: Compositionality (Parse Tree)

    [parse-tree figure: NP, PP/IN, NP over the tags DT NN PRP$ NN; leaves IN DT NN PRP NN]

    56

  • Recurrent NN for Vector Space: Compositionality (Parse Tree)

    [parse-tree figure: NP, IN, NP; leaves DT NN, PRP NN]

    57

  • Recurrent NN for Vector Space: Compositionality

    [parse-tree figure: NP, IN, NP, PP composed up to NP (S / ROOT); leaves DT NN PRP NN; rules + meanings]

    58

  • Vector Space + Word Embeddings: Socher

    59

    Recurrent NN for Vector Space

  • Vector Space + Word Embeddings: Socher

    60

    Recurrent NN for Vector Space

  • Word Embeddings: Turian (2010)

    Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning

    code & info: http://metaoptimize.com/projects/wordreprs/

    61


  • Word Embeddings: Turian (2010)

    Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning

    code & info: http://metaoptimize.com/projects/wordreprs/

    62


  • Word Embeddings: Collobert & Weston (2011)

    Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P. (2011). Natural Language Processing (almost) from Scratch

    63

    http://arxiv.org/pdf/1103.0398v1.pdf

  • Multi-embeddings: Stanford (2012)

    Eric H. Huang, Richard Socher, Christopher D. Manning, Andrew Y. Ng (2012). Improving Word Representations via Global Context and Multiple Word Prototypes

    64

    http://www.socher.org/index.php/Main/ImprovingWordRepresentationsViaGlobalContextAndMultipleWordPrototypes

  • Linguistic Regularities: Mikolov (2013)

    Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations

    code & info: https://code.google.com/p/word2vec/

    65

    https://code.google.com/p/word2vec/
    http://research.microsoft.com/pubs/189726/rvecs.pdf
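    A hedged sketch of the famous linguistic-regularity result using the gensim Python wrapper (gensim and the pre-trained GoogleNews file are the editor's assumptions; the slide itself points at the original word2vec C code):

```python
from gensim.models import KeyedVectors

# Assumes the pre-trained GoogleNews vectors have been downloaded beforehand.
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

# king - man + woman ~= queen (Mikolov et al. 2013)
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```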

  • Word Embeddings for MT: Mikolov (2013)

    Mikolov, T., Le, Q.V., Sutskever, I. (2013). Exploiting Similarities among Languages for Machine Translation

    66

    http://arxiv.org/pdf/1309.4168.pdf

  • Word Embeddings for MT: Kiros (2014)

    67

  • Recursive Deep Models & Sentiment: Socher (2013)

    Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C., Ng, A., Potts, C. (2013). Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank.

    code & demo: http://nlp.stanford.edu/sentiment/index.html

    68

    http://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf
    http://nlp.stanford.edu/sentiment/index.html

  • Paragraph Vectors: Le & Mikolov (2014)

    Le, Q., Mikolov, T. (2014). Distributed Representations of Sentences and Documents

    69

    add context (sentence, paragraph, document) to word vectors during training


    Results on Stanford Sentiment Treebank dataset:

    http://www-cs.stanford.edu/~quocle/paragraph_vector.pdf
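    A minimal sketch of the paragraph-vector idea using gensim's Doc2Vec (gensim is an editor's assumption, not the paper's own code; API names follow gensim >= 4.0): each document gets its own trainable vector, learned jointly with the word vectors.

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [
    TaggedDocument(words="this movie was wonderful and moving".split(), tags=["d0"]),
    TaggedDocument(words="a dull slow and boring film".split(), tags=["d1"]),
]
model = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)

print(model.dv["d0"][:5])                                   # learned paragraph vector
print(model.infer_vector("great touching movie".split()))   # vector for unseen text
```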

  • Paragraph Vectors: Dai et al. (2014)

    70

  • Paragraph Vectors: Dai et al. (2014)

    71

  • Paragraph Vectors: Dai et al. (2014)

    72

  • Global Vectors, GloVe: Stanford (2014)

    Pennington, J., Socher, R., Manning, C.D. (2014). GloVe: Global Vectors for Word Representation

    code & demo: http://nlp.stanford.edu/projects/glove/

    word2vec vs. GloVe: results on the word analogy task show similar accuracy

    73

    http://nlp.stanford.edu/projects/glove/glove.pdfhttp://nlp.stanford.edu/projects/glove/https://twitter.com/RichardSocher/status/497235079903473664https://docs.google.com/document/d/1ydIujJ7ETSZ688RGfU5IMJJsbxAi-kRl8czSwpti15s/mobilebasic?pli=1
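    For reference, GloVe fits word vectors to the logarithm of the global co-occurrence counts X_ij; the weighted least-squares objective from the paper is:

```latex
J = \sum_{i,j=1}^{V} f(X_{ij})
    \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^{2}
```

    where f is a weighting function that caps the influence of very frequent co-occurrences.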

  • Dependency-based Embeddings: Levy & Goldberg (2014)

    Levy, O., Goldberg, Y. (2014). Dependency-Based Word Embeddings

    code & demo: https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/

    - Syntactic Dependency Context

    Australian scientist discovers star with telescope

    - Bag of Words (BoW) Context

    [precision-recall curve comparing syntactic-dependency contexts with BoW contexts]

    Dependency-based embeddings have more functional similarities

    74

    http://www.cs.bgu.ac.il/~yoavg/publications/acl2014syntemb.pdf
    https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/

  • LSTMs

    Attention

    Wanna Play ?

    Recent breakthroughs

    75

  • LSTMs

    Attention

    Wanna Play ?

    Recent breakthroughs

    76

  • Wanna Play ?

    LSTM

    77
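    The LSTM figure on this slide is not in the transcript; for reference, the standard LSTM cell combines input, forget and output gates with a memory cell c_t:

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```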

  • LSTMs

    Attention

    Wanna Play ?

    Recent breakthroughs

    78

  • Attention

    Gregor et al (2015) DRAW: A Recurrent Neural Network For Image Generation (arxiv) (code)

    http://arxiv.org/abs/1502.04623
    http://gitxiv.com/posts/caZHv4PNrGMZAh6tY/draw-deep-recurrent-attentive-writer-network

  • Question-Answering Systems (& Memory)

    Summarization

    Text Generation

    Dialogue Systems

    Image Captioning & other multimodal tasks

    Wanna Play ?

    Recent breakthroughs

    80

  • Question-Answering Systems (& Memory)

    Summarization

    Text Generation

    Dialogue Systems

    Image Captioning & other multimodal tasks

    Wanna Play ?

    Recent breakthroughs

    81

  • Wanna Play ?

    QA & Memory

    82

    Memory Networks (Weston et al. 2015): Facebook
    Dynamic Memory Network (Kumar et al. 2015): MetaMind
    Neural Turing Machine (Graves et al. 2014): DeepMind

    Weston et al (2015) Memory Networks (arxiv)

    http://arxiv.org/abs/1410.3916
    http://arxiv.org/abs/1506.07285
    http://arxiv.org/abs/1410.5401

  • QA & Memory

    83

    Iyyer et al. (2014) A Neural Network for Factoid Question Answering over Paragraphs (paper)

    https://cs.umd.edu/~miyyer/qblearn/

  • Wanna Play ?

    QA & Memory

    84

    Memory Networks (Weston et al. 2015): Facebook
    Dynamic Memory Network (Kumar et al. 2015): MetaMind
    Neural Turing Machine (Graves et al. 2014): DeepMind

    Zaremba & Sutskever (2015) Learning to Execute (arxiv)

    http://arxiv.org/abs/1410.4615

  • Wanna Play ?

    QA & Memory

    85

    bAbI Dataset

    https://research.facebook.com/researchers/1543934539189348

  • Question-Answering Systems (& Memory)

    Summarization

    Text Generation

    Dialogue Systems

    Image Captioning & other multimodal tasks

    Wanna Play ?

    Recent breakthroughs

    86

  • Wanna Play ?

    Text generation

    87

    Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)

    http://karpathy.github.io/2015/05/21/rnn-effectiveness/


  • Question-Answering Systems (& Memory)

    Summarization

    Text Generation

    Dialogue Systems

    Image Captioning & other multimodal tasks

    Wanna Play ?

    Recent breakthroughs

    91

  • Image-Text Embeddings

    92

    Socher et al (2013) Zero Shot Learning Through Cross-Modal Transfer (info)

    http://www.socher.org/index.php/Main/Zero-ShotLearningThroughCross-ModalTransfer

  • Image-Captioning

    Andrej Karpathy, Li Fei-Fei, 2015. Deep Visual-Semantic Alignments for Generating Image Descriptions (pdf) (info) (code)

    Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, 2015. Show and Tell: A Neural Image Caption Generator (arxiv)

    Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, 2015. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (arxiv) (info) (code)

    http://cs.stanford.edu/people/karpathy/cvpr2015.pdf
    http://cs.stanford.edu/people/karpathy/deepimagesent/
    https://github.com/karpathy/neuraltalk2
    http://arxiv.org/abs/1411.4555
    http://arxiv.org/abs/1502.03044
    http://kelvinxu.github.io/projects/capgen.html
    https://github.com/kelvinxu/arctic-captions

  • A person riding a motorcycle on a dirt road. ???

    Image-Captioning

    Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

  • Two hockey players are fighting over the puck. ???

    Image-Captioning

  • A stop sign is flying in blue skies.

    A herd of elephants flying in the blue skies.

    Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov, 2015. Generating Images from Captions with Attention (arxiv) (examples)

    Image-Captioning

    http://arxiv.org/abs/1511.02793
    http://www.cs.toronto.edu/~emansim/cap2im.html

  • TensorFlow: Recently released library by Google. http://tensorflow.org

    Theano - CPU/GPU symbolic expression compiler in python (from LISA lab at University of Montreal). http://deeplearning.net/software/theano/

    Caffe - Computer Vision oriented Deep Learning framework: caffe.berkeleyvision.org

    Torch - Matlab-like environment for state-of-the-art machine learning algorithms in lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu) http://torch.ch/

    more info: http://deeplearning.net/software_links/

    Wanna Play ? General Deep Learning

    97


  • RNNLM (Mikolov) http://rnnlm.org

    NB-SVM https://github.com/mesnilgr/nbsvm

    Word2Vec (skip-gram/CBOW): https://code.google.com/p/word2vec/ (original), http://radimrehurek.com/gensim/models/word2vec.html (python)

    GloVe: http://nlp.stanford.edu/projects/glove/ (original), https://github.com/maciejkula/glove-python (python)

    Socher et al / Stanford RNN Sentiment code: http://nlp.stanford.edu/sentiment/code.html

    Deep Learning without Magic Tutorial: http://nlp.stanford.edu/courses/NAACL2013/

    Wanna Play ? NLP

    98

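    A hedged end-to-end sketch of training your own skip-gram vectors with the gensim implementation listed above (file name and hyperparameters are illustrative; API names follow gensim >= 4.0):

```python
from gensim.models import Word2Vec

# One whitespace-tokenised sentence per line; corpus.txt is a placeholder path.
sentences = [line.split() for line in open("corpus.txt", encoding="utf-8")]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=5, sg=1)  # sg=1: skip-gram
model.save("word2vec.model")

print(model.wv.most_similar("bank", topn=5))  # nearest neighbours in embedding space
```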

  • Questions?

    [email protected]
    www.csc.kth.se/~roelof/

    99

    Code & Papers:

    Collaborative Open Computer Science: GitXiv.com

    @graphific
