Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.
![Page 1: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/1.jpg)
Survey Analysis
An attempt to develop an Intuition of Semantic Relatedness
![Page 2: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/2.jpg)
Outline
• Motivation
• Survey framework
• Analysis
![Page 3: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/3.jpg)
Motivation
• Semantic Relatedness – broad/subjective concept
• Given a pair of words –
  • Are they related?
  • If so, to what extent?
  • What is the kind of relationship between them?
• Answer varies from person to person – depends on their background, culture, work domain etc.
• Example: Apple - Computer
![Page 4: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/4.jpg)
Existing Datasets
• Rubenstein & Goodenough (1965) – 65 English noun pairs (RG-65)
• Miller and Charles (1991) – subset of RG-65, 30 English noun pairs (MC-30)
• Finkelstein et al. (2002) – 353 word pairs (Fin1-153 and Fin2-200)
• Yang and Powers (2006) – 130 verb pairs (YP-130)
![Page 5: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/5.jpg)
Problems with current datasets
• Part of speech limitation
• Focus on semantic similarity instead of relatedness
• Size of dataset usually very small. Constructed manually. Labor intensive.
• Only general terms are included. Lack of domain specific terms
• Provides no insight into the type of SR
![Page 6: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/6.jpg)
Survey Framework
• Was created using the 30 word pairs from the Miller and Charles (1991) dataset
• Participants were asked to rate the relatedness on a scale of 0 – 4, 0 being not related at all and 4 being highly related
• They were also asked to specify the kind of relationship
• They were made aware of the fact that two words may be related in a variety of ways – synonymy, antonymy, frequent association, is-a, part-of, domain related etc.
![Page 7: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/7.jpg)
Survey Framework
• Was conducted among students of IIT Bombay (particularly with a computer science & linguistics background)
• 55 students participated in the survey
• Was created using Java Servlet and Tomcat container
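The slides do not show how the collected responses were turned into the per-pair scores reported later. A minimal sketch of one plausible aggregation follows, assuming each response is a (word pair, rating, relation type) record; the `responses` list and the `summarize` helper are hypothetical illustrations, not the survey's actual data or code.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical responses, invented for illustration only:
# (word pair, rating on the 0-4 scale, relation type chosen by the participant).
responses = [
    ("car - automobile", 4, "synonymy"),
    ("car - automobile", 3, "synonymy"),
    ("journey - car", 2, "frequent association"),
    ("journey - car", 3, "domain related"),
    ("noon - string", 0, "none"),
]


def summarize(responses):
    """Return {pair: (mean rating, most frequently chosen relation type)}."""
    ratings = defaultdict(list)
    relations = defaultdict(Counter)
    for pair, rating, relation in responses:
        # The survey scale runs from 0 (not related) to 4 (highly related).
        if not 0 <= rating <= 4:
            raise ValueError(f"rating out of range for {pair}: {rating}")
        ratings[pair].append(rating)
        relations[pair][relation] += 1
    return {
        pair: (mean(values), relations[pair].most_common(1)[0][0])
        for pair, values in ratings.items()
    }


summary = summarize(responses)
for pair, (score, relation) in summary.items():
    print(f"{pair}: mean rating {score:.2f}, most common relation: {relation}")
```

Averaging the ratings gives one score per pair that can be compared with the original Miller and Charles values, while the relation-type tally keeps the information about the kind of relatedness that the existing datasets lack.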
![Page 8: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/8.jpg)
Screen Shot
![Page 9: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/9.jpg)
Results

| Serial No. | Word Pair | MC Original (n = 38) | MC New (n = 55) |
|---|---|---|---|
| 1 | Car - Automobile | 3.92 | 3.65 |
| 2 | Gem - Jewel | 3.84 | 3.22 |
| 3 | Journey - Voyage | 3.84 | 3.25 |
| 4 | Boy - Lad | 3.76 | 3.27 |
| 5 | Coast - Shore | 3.70 | 3.27 |
| 6 | Asylum - Madhouse | 3.61 | 2.14 |
| 7 | Magician - Wizard | 3.50 | 2.85 |
| 8 | Midday - Noon | 3.42 | 3.25 |
| 9 | Furnace - Stove | 3.11 | 2.34 |
| 10 | Food - Fruit | 3.08 | 2.78 |
![Page 10: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/10.jpg)
Results

| Serial No. | Word Pair | MC Original (n = 38) | MC New (n = 55) |
|---|---|---|---|
| 11 | Bird - Cock | 3.05 | 2.74 |
| 12 | Bird - Crane | 2.97 | 2.47 |
| 13 | Tool - Implement | 2.95 | 1.93 |
| 14 | Brother - Monk | 2.82 | 1.02 |
| 15 | Lad - Brother | 1.66 | 0.82 |
| 16 | Crane - Implement | 1.68 | 1.05 |
| 17 | Journey - Car | 1.16 | 2.18 |
| 18 | Monk - Oracle | 1.10 | 1.22 |
| 19 | Cemetery - Woodland | 0.95 | 0.80 |
| 20 | Food - Rooster | 0.89 | 1.31 |
![Page 11: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/11.jpg)
Results

| Serial No. | Word Pair | MC Original (n = 38) | MC New (n = 55) |
|---|---|---|---|
| 21 | Coast - Hill | 0.87 | 1.20 |
| 22 | Forest - Graveyard | 0.84 | 0.74 |
| 23 | Shore - Woodland | 0.63 | 0.74 |
| 24 | Monk - Slave | 0.55 | 0.67 |
| 25 | Coast - Forest | 0.42 | 0.85 |
| 26 | Lad - Wizard | 0.42 | 0.49 |
| 27 | Chord - Smile | 0.13 | 0.58 |
| 28 | Glass - Magician | 0.11 | 0.82 |
| 29 | Rooster - Voyage | 0.08 | 0.24 |
| 30 | Noon - String | 0.08 | 0.31 |
![Page 12: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/12.jpg)
Graph
![Page 13: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/13.jpg)
Correlation Coefficient
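Correlation between the MC New and MC Original scores = 0.91 – quite strong

$$\mathrm{Correlation}(X, Y) = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$

where x and y are the MC Original and MC New scores of a word pair, x̄ and ȳ are their means, and the sums run over all 30 word pairs.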
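The correlation on this slide can be recomputed from the 30 score pairs in the Results tables. The sketch below applies the standard Pearson formula to those values; the `pearson` helper and variable names are illustrative and are not taken from the original analysis. The result should land close to the 0.91 reported above.

```python
import math

# (word pair, MC Original score, MC New score), copied from the Results tables.
scores = [
    ("Car - Automobile", 3.92, 3.65), ("Gem - Jewel", 3.84, 3.22),
    ("Journey - Voyage", 3.84, 3.25), ("Boy - Lad", 3.76, 3.27),
    ("Coast - Shore", 3.70, 3.27), ("Asylum - Madhouse", 3.61, 2.14),
    ("Magician - Wizard", 3.50, 2.85), ("Midday - Noon", 3.42, 3.25),
    ("Furnace - Stove", 3.11, 2.34), ("Food - Fruit", 3.08, 2.78),
    ("Bird - Cock", 3.05, 2.74), ("Bird - Crane", 2.97, 2.47),
    ("Tool - Implement", 2.95, 1.93), ("Brother - Monk", 2.82, 1.02),
    ("Lad - Brother", 1.66, 0.82), ("Crane - Implement", 1.68, 1.05),
    ("Journey - Car", 1.16, 2.18), ("Monk - Oracle", 1.10, 1.22),
    ("Cemetery - Woodland", 0.95, 0.80), ("Food - Rooster", 0.89, 1.31),
    ("Coast - Hill", 0.87, 1.20), ("Forest - Graveyard", 0.84, 0.74),
    ("Shore - Woodland", 0.63, 0.74), ("Monk - Slave", 0.55, 0.67),
    ("Coast - Forest", 0.42, 0.85), ("Lad - Wizard", 0.42, 0.49),
    ("Chord - Smile", 0.13, 0.58), ("Glass - Magician", 0.11, 0.82),
    ("Rooster - Voyage", 0.08, 0.24), ("Noon - String", 0.08, 0.31),
]


def pearson(xs, ys):
    """Pearson correlation: sum of co-deviations over the root of the
    product of the summed squared deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(sq_dev_x * sq_dev_y)


original = [orig for _, orig, _ in scores]
new = [new_score for _, _, new_score in scores]
r = pearson(original, new)
print(f"Pearson correlation over {len(scores)} pairs: {r:.3f}")
```

A high correlation only says that the two groups rank the pairs similarly; individual pairs such as Brother - Monk or Journey - Car can still differ by more than a full point on the 0 – 4 scale.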
![Page 14: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/14.jpg)
Graph
![Page 15: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/15.jpg)
Graph
![Page 16: Survey Analysis An attempt to develop an Intuition of Semantic Relatedness.](https://reader030.fdocuments.in/reader030/viewer/2022033105/56649ce25503460f949acbd3/html5/thumbnails/16.jpg)
Graph