Conceptual vectors for NLP Lexical functions
![Page 1: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/1.jpg)
Conceptual vectors for NLP
Lexical functions
MMA 2001
Mathieu Lafourcade
LIRMM - France
www.lirmm.fr/~lafourca
![Page 2: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/2.jpg)
Objectives
• Semantic analysis
  • Word sense disambiguation
  • Text indexing in IR
  • Lexical transfer in MT
• Conceptual vector
  • Reminiscent of vector models (Salton, Sowa, LSI)
  • Applied on pre-selected concepts (not terms)
  • Concepts are not independent
• Propagation on morpho-syntactic tree (no surface analysis)
![Page 3: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/3.jpg)
Conceptual vectors
• An idea = a combination of concepts = a vector
• The idea space = vector space
• A concept = an idea = a vector = combination of itself + neighborhood
• Sense space = vector space + vector set
![Page 4: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/4.jpg)
Conceptual vectors
• Annotations
  • Help building vectors
  • Can take the form of vectors
• Set of k basic concepts: example
  • Thesaurus Larousse = 873 concepts
  • A vector = an 873-tuple
  • Encoding for each dimension: C = 2^15
![Page 5: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/5.jpg)
Vector construction: Concept vectors
• H : thesaurus hierarchy
• V(Ci) : <a1, …, ai, …, an>
• aj = 1 / (2 ** Dum(H, i))
[Figure: thesaurus tree with leaf weights 1, 1/4, 1/16, 1/64 at distances 0, 2, 4, 6 from Ci]
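A minimal Python sketch of this construction. It assumes a toy hierarchy (not the Thesaurus Larousse) and assumes that the distance used in the exponent is the number of edges on the tree path between two leaf concepts. The names `PARENT`, `LEAVES`, `tree_distance` and `concept_vector` are illustrative only.

```python
# Toy hierarchy: child -> parent (hypothetical, for illustration only).
PARENT = {
    "mammals": "animals", "birds": "animals",
    "trees": "plants",
    "animals": "living beings", "plants": "living beings",
}
LEAVES = ["mammals", "birds", "trees"]  # the basic concepts C1..Cn


def ancestors(node):
    """Return the path from node up to the root, node included."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def tree_distance(a, b):
    """Number of edges on the path between a and b (assumed form of the distance)."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = next(n for n in up_a if n in up_b)  # lowest common ancestor
    return up_a.index(common) + up_b.index(common)


def concept_vector(concept):
    """V(Ci) = <a1, ..., an> with aj = 1 / 2 ** distance(Ci, Cj)."""
    return [1 / 2 ** tree_distance(concept, other) for other in LEAVES]
```

In this toy tree the concept itself gets weight 1, a sibling gets 1/4 and a cousin gets 1/16, so the weights fall off with distance in the hierarchy.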
![Page 6: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/6.jpg)
Vector construction: Concept vectors
• C : mammals
• L4 : zoology, mammals, birds, fish, …
• L3 : animals, plants, living beings
• L2 : …, time, movement, matter, life, …
• L1 : the society, the mankind, the world
![Page 7: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/7.jpg)
Vector construction: Concept vectors
[Figure: bar chart of the concept vector "mammals" over the concept dimensions, values from 0 to 30000. Labelled peaks: mammals; zoology, birds, fishes, …; living beings, plants; the world, the mankind, the society.]
![Page 8: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/8.jpg)
Vector construction: Term vectors
• Example: cat
• Kernel
  c:mammal, c:stroke
  mammal + stroke
• Augmented with weights
  c:mammal, c:stroke, 0.75*c:zoology, 0.75*c:love …
  mammal + stroke + 0.75 zoology + 0.75 love …
• Iteration for neighborhood augmentation
![Page 9: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/9.jpg)
Vector construction: Term vectors
[Figure: bar chart of the term vector "cat" over the concept dimensions, values from 0 to 12000. Labelled peaks: mammals; stroke.]
![Page 10: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/10.jpg)
Vector space
• Basic concepts are not independent
• Sense space = generator space of a real k'-dimensional vector space (unknown)
  • Dim k' ≤ k
• Relative position of points
![Page 11: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/11.jpg)
Conceptual vector distance
• Angular distance DA(x, y) = angle(x, y)
• 0 ≤ DA(x, y) ≤ π
  • if 0 then colinear: same idea
  • if π/2 then nothing in common
  • if π then DA(x, -x), with -x as the anti-idea of x
[Figure: vectors x, x' and y]
![Page 12: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/12.jpg)
Conceptual vector distance
• Distance = acos(similarity)
DA(x, y) = acos(x·y / (|x||y|))
DA(x, x) = 0
DA(x, y) = DA(y, x)
DA(x, y) + DA(y, z) ≥ DA(x, z)
DA(0, 0) = 0 and DA(x, 0) = π/2 by definition
DA(λx, μy) = DA(x, y) with λμ > 0
DA(λx, μy) = π − DA(x, y) with λμ < 0
DA(x+x, x+y) = DA(x, x+y) ≤ DA(x, y)
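The angular distance can be sketched in a few lines of Python. This is an illustration under the conventions stated on the slide (acos of the cosine similarity, with the two special cases for the null vector), using plain lists as vectors; the function name `angular_distance` is an assumption of this sketch.

```python
import math


def angular_distance(x, y):
    """DA(x, y) = acos(x.y / (|x||y|)), with the special cases for the null vector."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 and norm_y == 0:
        return 0.0                      # DA(0, 0) = 0 by definition
    if norm_x == 0 or norm_y == 0:
        return math.pi / 2              # DA(x, 0) = pi/2 by definition
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    # Clamp to [-1, 1] to absorb floating-point rounding before acos.
    return math.acos(max(-1.0, min(1.0, cosine)))
```

Because only the angle matters, multiplying either vector by a positive factor leaves the distance unchanged, which is the scaling property listed above.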
![Page 13: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/13.jpg)
Conceptual vector distance
• Example
• DA(tit, tit) = 0
• DA(tit, passerine) = 0.4
• DA(tit, bird) = 0.7
• DA(tit, train) = 1.14
• DA(tit, insect) = 0.62
tit = kind of insectivorous passerine …
![Page 14: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/14.jpg)
Conceptual lexicon
Set of (word, vector) = (w, V)*
Monosemy: word → 1 meaning → 1 vector
(w, V)
Example: tit
![Page 15: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/15.jpg)
Conceptual lexicon: Polyseme building
Polysemy: word → n meanings → n vectors
{(w, V), (w.1, V1) … (w.n, Vn)}
Example: bank
![Page 16: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/16.jpg)
Conceptual lexicon: Polyseme building
V(w) = Σ V(w.i) = Σ V.i
Squashes isolated meanings against numerous close meanings
bank:
bank.1: Mound
bank.3: River border, …
bank.2: Money institution
bank.3: Organ keyboard
bank.4: …
![Page 17: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/17.jpg)
Conceptual lexicon: Polyseme building
V(w) = classification(w.i)
[Figure: binary classification tree for "bank", with node labels as follows]
1: DA(3,4) & (3+2)
2: (bank4)
3: DA(4,5) & (4+5)
4: (bank3)
5: DA(6,7) & (6+7)
6: (bank1)
7: (bank2)
![Page 18: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/18.jpg)
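Lexical scope
• LS(w) = LSt(t(w))
  LSt(t(w)) = 1 if t is a leaf
  LSt(t(w)) = (LS(t1) + LS(t2)) / (2 − sin(D(t(w)))) otherwise
• V(w) = Vt(t(w))
  Vt(t(w)) = V(w) if t is a leaf
  Vt(t(w)) = LS(t1) Vt(t1) + LS(t2) Vt(t2) otherwise
[Figure: the classification tree for "bank"; leaves 2, 4, 6, 7 hold the vectors of senses 4, 3, 1, 2, and internal nodes 1, 3, 5 hold the distance D between their children and the sum of their children]
Can handle duplicated definitions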
(w) =
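The recursion on this slide can be sketched as follows. This is an illustration under assumptions: a leaf is a plain list (one sense vector), an internal node is a pair `(left, right)`, and the distance D at a node is taken to be the angular distance between the vectors of its two subtrees. The name `scope_and_vector` is hypothetical.

```python
import math


def angular_distance(x, y):
    """acos of the cosine similarity; DA(0, 0) = 0 and DA(x, 0) = pi/2."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0 if norm_x == norm_y else math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    return math.acos(max(-1.0, min(1.0, cosine)))


def scope_and_vector(tree):
    """Return (lexical scope, vector) for a tree of sense vectors."""
    if isinstance(tree, list):          # leaf: scope 1, vector unchanged
        return 1.0, tree
    scope1, vector1 = scope_and_vector(tree[0])
    scope2, vector2 = scope_and_vector(tree[1])
    d = angular_distance(vector1, vector2)
    scope = (scope1 + scope2) / (2 - math.sin(d))
    vector = [scope1 * a + scope2 * b for a, b in zip(vector1, vector2)]
    return scope, vector
```

With two identical leaves the scope stays at 1 (a duplicated definition adds nothing), while two orthogonal leaves give a scope of 2, which is one way to read the remark about duplicated definitions.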
![Page 19: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/19.jpg)
Vector Statistics
• Norm
  • In [0, 1] * C (2^15 = 32768)
• Intensity
  • Norm / C
  • Usually = 1
• Standard deviation (SD)
  • SD² = variance
  • variance = 1/n * Σ (xi − μ)², with μ as the arithmetic mean
![Page 20: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/20.jpg)
Vector Statistics
• Variation coefficient (CV)
  CV = SD / mean
  No unit, norm independent
  Pseudo conceptual strength
  If A is a hyperonym of B then CV(A) > CV(B)
  (the converse does not hold)
• vector « fruit juice » (N): MEAN = 527, SD = 973, CV = 1.88
• vector « drink » (N): MEAN = 443, SD = 1014, CV = 2.28
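These statistics are easy to compute directly. The following Python sketch uses the population variance (division by n), as on the previous slide; the function names are illustrative and the vectors are plain lists.

```python
import math


def mean(vector):
    """Arithmetic mean of the components."""
    return sum(vector) / len(vector)


def standard_deviation(vector):
    """Square root of the population variance, 1/n * sum((xi - mean) ** 2)."""
    m = mean(vector)
    return math.sqrt(sum((x - m) ** 2 for x in vector) / len(vector))


def variation_coefficient(vector):
    """CV = SD / mean; unitless and unchanged when the vector is rescaled."""
    return standard_deviation(vector) / mean(vector)
```

Scaling a vector multiplies both SD and mean by the same factor, which is why CV is norm independent: a flat vector has CV = 0, and a vector concentrated on few concepts has a high CV.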
![Page 21: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/21.jpg)
Vector operations
• Sum
  • V = X + Y ⇒ vi = xi + yi
  • Neutral element: 0
  • Generalized to n terms: V = Σ Vi
  • Normalization of sum: vi / |V| * c
Kind of mean
![Page 22: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/22.jpg)
Vector operations
• Term to term product
  • V = X ⊗ Y ⇒ vi = xi * yi
  • Neutral element: 1
  • Generalized to n terms: V = Π Vi
Kind of intersection
![Page 23: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/23.jpg)
Vector operations
• Amplification
  • V = X ^ n ⇒ vi = sg(xi) * |xi| ^ n
  • √V = V ^ 1/2 and ⁿ√V = V ^ 1/n
  • V ⊗ V = V ^ 2 if vi ≥ 0
• Normalization of ttm product to n terms
  • V = ⁿ√(Π Vi)
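The operations introduced so far (normalized sum, term-to-term product, amplification, normalized product) can be sketched in Python as below. Vectors are plain lists, `C = 2 ** 15` is the encoding bound mentioned earlier, and all function names are assumptions of this sketch rather than names from the original system.

```python
import math

C = 2 ** 15  # encoding bound per dimension


def norm(vector):
    """Euclidean norm |V|."""
    return math.sqrt(sum(x * x for x in vector))


def normalized_sum(*vectors):
    """Component-wise sum, rescaled to norm C: vi / |V| * C."""
    total = [sum(column) for column in zip(*vectors)]
    n = norm(total)
    return [x / n * C for x in total] if n else total


def product(*vectors):
    """Term-to-term product: vi = xi * yi * ..."""
    return [math.prod(column) for column in zip(*vectors)]


def amplify(vector, n):
    """V ^ n: vi = sg(xi) * |xi| ** n."""
    return [math.copysign(abs(x) ** n, x) for x in vector]


def normalized_product(*vectors):
    """n-th root of the term-to-term product of n vectors."""
    return amplify(product(*vectors), 1 / len(vectors))
```

The sum behaves like a mean (every concept present in any operand survives), while the product behaves like an intersection (a concept missing from one operand is zeroed out).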
![Page 24: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/24.jpg)
Vector operations
• Product + sum
  • V = √(X ⊗ Y) + X + Y
  • Generalized to n terms: V = ⁿ√(Π Vi) + Σ Vi
• Simplest request vector computation in IR
![Page 25: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/25.jpg)
Vector operations
• Subtraction
  • V = X − Y ⇒ vi = xi − yi
• Dot subtraction
  • V = X ∸ Y ⇒ vi = max(xi − yi, 0)
• Complementary
  • V = C(X) ⇒ vi = (1 − xi/c) * c
• etc.
Set operations
![Page 26: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/26.jpg)
Intensity Distance
• Intensity of normalized ttm product
• 0 ≤ Intensity(√(X ⊗ Y)) ≤ 1 if |x| = |y| = 1
DI(X, Y) = acos(Intensity(√(X ⊗ Y)))
• DI(X, X) = 0 and DI(X, 0) = π/2
DI(tit, tit) = 0 (DA = 0)
DI(tit, passerine) = 0.25 (DA = 0.4)
DI(tit, bird) = 0.58 (DA = 0.7)
DI(tit, train) = 0.89 (DA = 1.14)
DI(tit, insect) = 0.50 (DA = 0.62)
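A hedged Python sketch of the intensity distance. It assumes non-negative components, rescales both vectors to unit norm so that the intensity lies in [0, 1], and takes the intensity of the square root of the term-to-term product. The names `unit` and `intensity_distance` are illustrative.

```python
import math


def unit(vector):
    """Rescale to norm 1; the null vector is returned unchanged."""
    n = math.sqrt(sum(x * x for x in vector))
    return [x / n for x in vector] if n else list(vector)


def intensity_distance(x, y):
    """DI(X, Y) = acos(intensity of sqrt(X ⊗ Y)) for unit, non-negative vectors."""
    x, y = unit(x), unit(y)
    # Square root of the term-to-term product (components assumed >= 0).
    root = [math.sqrt(a * b) for a, b in zip(x, y)]
    intensity = math.sqrt(sum(r * r for r in root))
    # Clamp to 1 to absorb floating-point rounding before acos.
    return math.acos(min(1.0, intensity))
```

Two vectors with no concept in common have a null product and therefore a distance of π/2, the same value the slide gives for DI(X, 0).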
![Page 27: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/27.jpg)
Relative synonymy
• SynR(A, B, C): C as reference feature
SynR(A, B, C) = DA(A ⊗ C, B ⊗ C)
• DA(coal,night) = 0.9
• SynR(coal, night, color) = 0.4
• SynR(coal, night, black) = 0.35
![Page 28: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/28.jpg)
Relative synonymy
• SynR(A, B, C) = SynR(B, A, C)
• SynR(A, A, C) = D(A ⊗ C, A ⊗ C) = 0
• SynR(A, B, 0) = D(0, 0) = 0
• SynR(A, 0, C) = π/2
• SynA(A, B) = SynR(A, B, 1) = D(A ⊗ 1, B ⊗ 1) = D(A, B)
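Relative synonymy can be sketched as the angular distance between the two vectors after each has been filtered by the reference vector through a term-to-term product. The Python below is an illustration with toy three-dimensional vectors; the function names are assumptions of this sketch.

```python
import math


def angular_distance(x, y):
    """acos of the cosine similarity; DA(0, 0) = 0 and DA(x, 0) = pi/2."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0 if norm_x == norm_y else math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    return math.acos(max(-1.0, min(1.0, cosine)))


def relative_synonymy(a, b, c):
    """SynR(A, B, C): distance between A and B restricted to the concepts of C."""
    a_on_c = [x * z for x, z in zip(a, c)]
    b_on_c = [y * z for y, z in zip(b, c)]
    return angular_distance(a_on_c, b_on_c)
```

For instance, with the toy vectors `a = [1, 1, 0]` and `b = [0, 1, 1]`, the plain angular distance is π/3, but relative to `c = [0, 1, 0]` the two become indistinguishable, in the same spirit as coal and night becoming close when compared with respect to black.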
![Page 29: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/29.jpg)
Subjective synonymy
• SynS(A, B, C): C as point of view
SynS(A, B, C) = D(C − A, C − B)
0 ≤ SynS(A, B, C) ≤ π
normalization: 0 ≤ asin(sin(SynS(A, B, C))) ≤ π/2
[Figure: points A and B seen from viewpoints C, C' and C'', with the corresponding angles]
![Page 30: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/30.jpg)
Subjective synonymy
When |C| → ∞ then SynS(A, B, C) → 0
SynS(A, B, 0) = D(-B, -A) = D(A, B)
SynS(A, A, C) = D(C-A, C-A) = 0
SynS(A, B, B) = SynS(A, B, A) = 0
• SynS(tit, swallow, animal) = 0.3
• SynS(tit, swallow, bird) = 0.4
• SynS(tit, swallow, passerine) = 1
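Subjective synonymy measures the angle under which A and B are seen from the point of view C. A minimal Python sketch, assuming plain-list vectors and applying the asin(sin(·)) normalization so that the result lies in [0, π/2]; the function names are illustrative.

```python
import math


def angular_distance(x, y):
    """acos of the cosine similarity; DA(0, 0) = 0 and DA(x, 0) = pi/2."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0 if norm_x == norm_y else math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    return math.acos(max(-1.0, min(1.0, cosine)))


def subjective_synonymy(a, b, c):
    """SynS(A, B, C) = D(C - A, C - B), normalized into [0, pi/2]."""
    c_minus_a = [z - x for x, z in zip(a, c)]
    c_minus_b = [z - y for y, z in zip(b, c)]
    d = angular_distance(c_minus_a, c_minus_b)
    return math.asin(math.sin(d))
```

The further away the viewpoint is, the smaller the angle: seen from a very general concept, two specific terms look alike, which matches the ordering of the tit and swallow examples.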
![Page 31: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/31.jpg)
Semantic analysis
• Vectors propagate on syntactic tree
[Figure: syntactic tree (nodes P, GN, GV, GVA, GNP) of the sentence « Les termites attaquent rapidement les fermes du toit »]
(The termites rapidly attack the trusses of the roof)
![Page 32: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/32.jpg)
Semantic analysis
[Figure: the same syntactic tree, with candidate senses attached to the ambiguous words]
• fermes: farm business, farm building, truss
• toit: roof, anatomy, above
• attaquent: to start, to attack, to criticize
![Page 33: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/33.jpg)
Semantic analysis
• Initialization: attach vectors to nodes
[Figure: the same syntactic tree, with vectors V1 to V5 attached to the leaves]
![Page 34: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/34.jpg)
Semantic analysis
• Propagation (up)
[Figure: the same syntactic tree, with vectors propagated from the leaves up to the root P]
![Page 35: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/35.jpg)
Semantic analysis
• Back propagation (down)
  V(Ni j) = (V(Ni j) ⊗ V(Ni)) + V(Ni j)
[Figure: the same syntactic tree, with updated vectors V1' to V5' at the leaves]
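The two passes can be sketched on a toy tree in Python. This is an illustration under assumptions: the upward pass is taken to be a plain sum of the children's vectors (the slides do not spell out the upward rule), and the downward pass updates each child by adding its term-to-term product with its parent. The class `Node` and both function names are hypothetical.

```python
class Node:
    """A syntactic-tree node holding a label, a vector and child nodes."""

    def __init__(self, label, vector=None, children=()):
        self.label = label
        self.vector = vector
        self.children = list(children)


def propagate_up(node):
    """Leaves keep their vectors; an inner node gets the sum of its children."""
    if node.children:
        vectors = [propagate_up(child) for child in node.children]
        node.vector = [sum(column) for column in zip(*vectors)]
    return node.vector


def propagate_down(node):
    """Each child becomes (child ⊗ parent) + child, then recurses."""
    for child in node.children:
        child.vector = [c * p + c for c, p in zip(child.vector, node.vector)]
        propagate_down(child)
```

After the downward pass, the components of a leaf that agree with the context gathered at its parent are reinforced, while components absent from the leaf stay at zero.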
![Page 36: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/36.jpg)
Semantic analysis
• Sense selection or sorting
[Figure: the same syntactic tree; the candidate senses of fermes (farm business, farm building, truss), toit (roof, anatomy, above) and attaquent (to start, to attack, to criticize) are selected or sorted]
![Page 37: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/37.jpg)
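Sense selection
[Figure: the classification tree for "bank"; leaves 2, 4, 6, 7 hold bank4, bank3, bank1, bank2, and internal nodes 1, 3, 5 hold the distance D between their children and the sum of their children]
• Recursive descent
  • on t(w) as decision tree
  • DA(V', Vi)
• Stop on a leaf
• Stop on an internal node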
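A hedged Python sketch of the recursive descent. The stopping rule on an internal node is not given on the slide; here it is assumed that the descent stops when both children are almost equally close to the context vector (parameter `margin`, hypothetical). A leaf is a `(name, vector)` pair, an internal node is a `(left, right)` pair, and a node's vector is assumed to be the sum of the vectors below it.

```python
import math


def angular_distance(x, y):
    """acos of the cosine similarity; DA(0, 0) = 0 and DA(x, 0) = pi/2."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0 if norm_x == norm_y else math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    return math.acos(max(-1.0, min(1.0, cosine)))


def is_leaf(tree):
    return isinstance(tree[0], str)


def senses(tree):
    """All sense names below a node."""
    return [tree[0]] if is_leaf(tree) else senses(tree[0]) + senses(tree[1])


def vector(tree):
    """Leaf vector, or sum of the two subtree vectors (assumption)."""
    if is_leaf(tree):
        return tree[1]
    return [a + b for a, b in zip(vector(tree[0]), vector(tree[1]))]


def select(tree, context, margin=0.05):
    """Descend toward the closest child; stop on a leaf or an undecided node."""
    if is_leaf(tree):
        return senses(tree)
    d_left = angular_distance(context, vector(tree[0]))
    d_right = angular_distance(context, vector(tree[1]))
    if abs(d_left - d_right) < margin:
        return senses(tree)             # stop on an internal node
    return select(tree[0] if d_left < d_right else tree[1], context, margin)
```

Stopping on an internal node returns several senses at once, which corresponds to sorting or narrowing the candidates rather than selecting a single one.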
![Page 38: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/38.jpg)
Vector syntactic schemas
• S: NP(ART, N) → V(NP) = V(N)
• S: NP1(NP2, N) → V(NP1) = α V(NP2) + V(N), with 0 < α < 1
V(sail boat) = 1/2 V(sail) + V(boat)
V(boat sail) = 1/2 V(boat) + V(sail)
![Page 39: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/39.jpg)
Vector syntactic schemas
• Not necessarily linear
• S: GA(GADV(ADV), ADJ) → V(GA) = V(ADJ) ^ p(ADV)
  • p(very) = 2
  • p(mildly) = 1/2
V(very happy) = V(happy) ^ 2
V(mildly happy) = V(happy) ^ 1/2
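Both kinds of schema can be sketched in Python. This illustration assumes that in a noun phrase the modifier is weighted by α and the head noun by 1, and that an adverb acts on an adjective through amplification with the exponents given above. `ADVERB_POWER`, `noun_phrase` and `adjective_group` are hypothetical names.

```python
import math

# Exponents p(ADV) from the slide; any other adverb would need its own value.
ADVERB_POWER = {"very": 2, "mildly": 0.5}


def noun_phrase(modifier, head, alpha=0.5):
    """Linear schema: alpha * V(modifier) + V(head), with 0 < alpha < 1."""
    return [alpha * m + h for m, h in zip(modifier, head)]


def adjective_group(adverb, adjective):
    """Non-linear schema: V(ADJ) ^ p(ADV), applied component by component."""
    p = ADVERB_POWER[adverb]
    return [math.copysign(abs(x) ** p, x) for x in adjective]
```

Because the head keeps the larger weight, swapping modifier and head gives a different vector, so "sail boat" and "boat sail" are not confused.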
![Page 40: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/40.jpg)
Iteration & convergence
• Iteration with convergence
  • Local: D(Vi, Vi+1) ≤ ε for top
  • Global: D(Vi, Vi+1) ≤ ε for all
• Good results but costly
![Page 41: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/41.jpg)
Lexicon construction
• Manual kernel
• Automatic definition analysis
• Global infinite loop = learning
• Manual adjustments
[Figure: learning loop; labels: kernel, synonyms, unknown words in definitions, iterations]
![Page 42: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/42.jpg)
Application: machine translation
• Lexical transfer: source → target
• Knn search that minimizes DA(V(source), V(target))
• Submeaning selection
  • Direct
  • Transformation matrix
chat → cat
![Page 43: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/43.jpg)
Application: Information Retrieval on Texts
• Textual document indexing
  • Language dependent
• Retrieval
  • Language independent, multilingual
• Domain representation
  • horse → equitation
• Granularity
  • Document, paragraphs, etc.
![Page 44: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/44.jpg)
Application: Information Retrieval on Texts
• Index = Lexicon = (di, Vi)*
Knn search that minimizes DA(V(r), V(di))
[Figure: a request vector among document vectors doc1, doc2, …, docn]
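The nearest-neighbour search can be sketched as a sort of the index by angular distance to the request vector. This brute-force Python version is an illustration only; the index is assumed to be a dictionary from document name to vector, and `nearest` is a hypothetical name.

```python
import math


def angular_distance(x, y):
    """acos of the cosine similarity; DA(0, 0) = 0 and DA(x, 0) = pi/2."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0 if norm_x == norm_y else math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    return math.acos(max(-1.0, min(1.0, cosine)))


def nearest(request, index, k=1):
    """Return the k document names whose vectors are closest to the request."""
    by_distance = sorted(index, key=lambda name: angular_distance(request, index[name]))
    return by_distance[:k]
```

The same function covers lexical transfer in translation if the index holds target-language terms instead of documents.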
![Page 45: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/45.jpg)
Search engine: Distance adjustments
• Min DA(V(r), V(di)) may pose problems
  • Especially with small documents
  • Correlation between CV & conceptual richness
• Pathological cases
  • « plane » and « plane plane plane plane … »
  • « inundation » and « blood »: D = 0.85 (liquid)
![Page 46: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/46.jpg)
Search engine: Distance adjustments
• Correction with relative intensity
  • Request vs retrieved doc (r and d)
D = √(DA(r, d) * DI(r, d))
• 0 ≤ Intensity(r, d) ≤ 1 ⇒ 0 ≤ DI(r, d) ≤ π/2
![Page 47: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/47.jpg)
Conclusion
• Approach
  • statistical (but not probabilistic)
  • thema (and rhema?)
• Combination of
  • Symbolic methods (AI)
  • Transformational systems
• Similarity
  • Neural nets
  • With large Dim (> 50000?)
![Page 48: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/48.jpg)
Conclusion
• Self evaluation
  • Vector quality
  • Tests against corpora
• Unknown words
  • Proper nouns of persons, products, etc.
    • Lionel Jospin, Danone, Air France
  • Automatic learning
• Badly handled phenomena?
  • Negation & lexical functions (Mel'čuk)
![Page 49: Conceptual vectors for NLP Lexical functions](https://reader035.fdocuments.in/reader035/viewer/2022081519/56813b2d550346895da3f565/html5/thumbnails/49.jpg)
End
1. extremity
2. death
3. aim
…