Towards Understanding...Christopher Manning Word distributions word representations 0.286 0.792...
Transcript of Towards Understanding...Christopher Manning Word distributions word representations 0.286 0.792...
Towards UnderstandingChristopher ManningStanford University
Christopher Manning
1980s Natural Language Processing
VP →{ V (NP:(↑ OBJ)=↓ (NP:(↑ OBJ2)=↓) )
(XP:(↑ XCOMP)=↓)
|@(COORD VP VP)}.
salmon N IRR @(CN SALMON)
(↑ PERSON)=3
{ (↑ NUM)=SG|(↑ NUM)=PL}.
Christopher Manning
WRB VBZ DT NN VB TO VB DT
How does a project get to be aNN JJ . : CD NN IN DT NN .
year late ? … One day at a time .
P(late|a, year) = 0.0087P(NN|DT, a, project) = 0.9
Learning language
Christopher Manning
The traditional word representation
motel
[0 0 0 0 0 0 0 0 0 0 1 0 0 0 0]
Dimensionality: 50K (small domain – speech/PTB) – 13M (web – Google 1T)
motel [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0] AND
hotel [0 0 0 0 0 0 0 1 0 0 0 0 0 0 0] = 0
Christopher Manning
Word distributions word representations
0.2860.792
−0.177−0.107
0.109−0.542
0.3490.2710.487
linguistics =
[Bengio et al. 2003, Collobert & Weston 2008, Turian 2010, Mikolov 2013, etc.]
Through corpus linguistics, large chunksthe study of language and linguistics.The field of linguistics is concernedWritten like a linguistics text bookPhonology is the branch of linguistics that
Christopher Manning
Ratios of co-occurrence probabilities can encode meaning components
Crucial insight:
x = solid x = water
large
x = gas
small
x = random
smalllarge
small large large small
~1 ~1large small
Encoding meaning in vector differences[Pennington et al., to appear 2014]
Christopher Manning
Ratios of co-occurrence probabilities can encode meaning components
Crucial insight:
x = solid x = water
1.9 x 10-4
x = gas x = fashion
2.2 x 10-5
1.36 0.96
Encoding meaning in vector differences[Pennington et al., to appear 2014]
8.9
7.8 x 10-4 2.2 x 10-3
3.0 x 10-3 1.7 x 10-5
1.8 x 10-5
6.6 x 10-5
8.5 x 10-2
Christopher Manning
GloVe: A new model for learning word representations [Pennington et al., to appear 2014]
Christopher Manning
Nearest words to frog:
1. frogs2. toad3. litoria4. leptodactylidae5. rana6. lizard7. eleutherodactylus
Word similarities
litoria leptodactylidae
rana eleutherodactylus
Christopher Manning
Model Dimensions Corpus size
Performance(Syn + Sem)
CBOW (Mikolov et al. 2013b) 300 1.6 billion 36.1
CBOW (Mikolov et al. 2013b) 1000 6 billion 63.7
GloVe (this work) 300 6 billion 71.7
Word analogy task [Mikolov, Yih & Zweig 2013a]
Christopher Manning
Machine translation with bilingual neural language models [Devlin et al., ACL 2014]
Christopher Manning
Machine translation with bilingual neural language models [Devlin et al., ACL 2014]
Arabic
1st Place (BBN)
49.5
2nd Place 47.5
… …
9th Place 44.0
10th Place 41.2
NIST 2012 Open MT Arabic Results
Arabic
Previous best BBN system
49.8
+ NNJM 52.8
+ 3.0 BLEU + 6.3 BLEU
“Baseline Hiero” Features: (1) Rule probs, (2) lexical smoothing, (3) KN LM, (4) word penalty, (5) concat penalty
NNJM on best system
Arabic
“Baseline Hiero”
43.4
+ NNJM 49.7
NNJM on “Baseline”
Christopher Manning
Sentence structure: Dependency parsing
Christopher Manning
Universal Stanford Dependencies[de Marneffe et al., LREC 2014]
A common dependency representation and label set applicable across languages – http://universaldependencies.github.io/docs/
Christopher Manning
Sentence structure: Dependency parsing
Christopher Manning
Deep Learning Dependency Parser [Chen & Manning, forthcoming 2014]
Christopher Manning
Deep Learning Dependency Parser [Chen & Manning, forthcoming 2014]
Parser type Parser LAS (Label & Attach)
Sentences / sec
Transition-based
MaltParser(stackproj)
86.9 469
Our parser 89.6 654
Graph-based MSTParser 87.6 10
TurboParser (full) 89.7 8
Christopher Manning
Grounding language meaning with images[Socher, Karpathy, Le, Manning & Ng, TACL 2014]
Christopher Manning
Example dependency tree and image
dep
Christopher Manning
Recursive computation of dependency tree
dep
Christopher Manning
EvaluationData of [Rashtchian, Young, Hodosh & Hockenmaier 2010]
1000 images,5 descriptions each; used as 800 train,100 dev, 100 test
Christopher Manning
Results for image search
Model Mean rank
Random 52.1
Recurrent NN 19.2
Constituency Tree Recursive NN 16.1
kCCA 15.9
Bag of Words 14.6
Dependency Tree Recursive NN 12.5
Lower is better!
Christopher Manning
How to represent the meaning of texts[Le and Mikolov, ICML 2014, Paragraph Vector]
Christopher Manning
Political Ideology Detection Using Recursive Neural Networks[Iyyer, Enns, Boyd-Graber & Resnik 2014]
Christopher Manning
Christopher Manning
Extracting Semantic Relationships[Socher, Huval, Manning & Ng, EMNLP 2012]
My [apartment]e1 has a pretty large [kitchen] e2
component-whole relationship (e2,e1)
Microsoft Privacy Policy statement applies to all information collected. Read at research.microsoft.com
Save the planet and return your name badge before you
leave (on Tuesday)
http://www.freegreatpicture.com/city-impression/trinity-college-dublin-the-old-library-14885
http://nlp.stanford.edu/manning/papers/romance.pdf
http://commons.wikimedia.org/wiki/File:PR2_Robot_reads_the_Mythical_Man-Month_2.jpg
http://en.wikipedia.org/wiki/Litoria#mediaviewer/File:Caerulea_cropped(2).jpg
http://en.wikipedia.org/wiki/Pristimantis_cruentus#mediaviewer/File:Pristimantis_cruentus_studio.jpg
http://en.wikipedia.org/wiki/Edible_frog#mediaviewer/File:Rana_esculenta_on_Nymphaea_edit.JPG
http://en.wikipedia.org/wiki/Eleutherodactylus#mediaviewer/File:Eleutherodactylus_mimus.jpg
http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2008/