A Knowledge-Based Information Extraction Prototype for Data-Rich Documents in the Information...
7/31/2019 A Knowledge-Based Information Extraction Prototype for Data-Rich Documents in the Information Technology Domain
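Ontology: $O = \{S, C, h, g\}$, defined over $S$, the concept sets $C_p$ and $C_t$, and the relations
$h : (C_p \cup S) \to (C_p \cup C_t)$
$g : C_p \to (C_p \cup C_t)$
Lexicalized ontology: $O_L = \{S, C, h, g, L, f\}$, which extends $O$ with $L$ and
$f : (c_1, c_2, \ldots, c_n) \to L$, with $n \ge 1$ and $c_i \in C$.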
[Figure: fragment of the laptop ontology graph. Nodes shown: Laptop, Processor, ProductID, Brand, Family, Model, Speed, FrequencyMeasurement, FrequencyMagnitude, FrequencyUnits, MHz, GHz, Cache, Size, Level, FSB, Memory, Installed, Max.Installable, MemorySize, MemoryMagnitude, Magnitude, DecimalReal, Integer, MemoryUnits, KB, MB, GB, HardDisk. Edge types: has-part, has-attribute, (is-a)^-1. Node types: initial node, internal node, terminal node.]
[Figure: two histograms of the ontology graph, node out-degree and node in-degree (x-axis: degree; y-axis: count of nodes). Nodes labeled in the out-degree histogram: Laptop, TimeUnits, DistanceUnits, Battery, HardDisk; the out-degree 0 bar corresponds to terminal nodes. Nodes labeled in the in-degree histogram: Boolean, Brand, Integer, Family, Magnitude, Technology, Version, Model, Digit, TimeMeasurement.]
[Figure: lexicon entry with three levels. CONCEPT: REF_HardDisk; TERMS: [HardDisk, HD, HDD, HardDiskDrive, Disk, HardDrive]; TOKENS: the tokens of each term.]
Strings $s_m$ and $s_n$ have lengths $m$ and $n$; complexity terms: $O(mn)$ and $O(\min(m, n))$.
Edit-distance matrix between SATURDAY (rows) and SUNDAY (columns):

        S  U  N  D  A  Y
     0  1  2  3  4  5  6
  S  1  0  1  2  3  4  5
  A  2  1  1  2  3  3  4
  T  3  2  2  2  3  4  4
  U  4  3  2  3  3  4  5
  R  5  4  3  3  4  4  5
  D  6  5  4  4  3  4  5
  A  7  6  5  5  4  3  4
  Y  8  7  6  6  5  4  3
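The matrix above can be reproduced with the standard dynamic-programming recurrence for the Levenshtein edit distance. The following is a minimal sketch, not the thesis implementation: the function name `edit_distance` and the unit cost for every operation are assumptions. It keeps only two rows of the matrix at a time.

```python
def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance with unit costs (assumed), two-row dynamic programming."""
    if len(s) < len(t):
        s, t = t, s  # make t the shorter string so each row is as short as possible
    previous = list(range(len(t) + 1))  # row for the empty prefix of s
    for i, cs in enumerate(s, start=1):
        current = [i]  # distance from s[:i] to the empty prefix of t
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(previous[j] + 1,          # delete cs
                               current[j - 1] + 1,       # insert ct
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


print(edit_distance("SATURDAY", "SUNDAY"))  # 3, the bottom-right cell of the matrix
```

Each row has min(m, n) + 1 cells, so only two short rows are held in memory while the running time stays proportional to m times n.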
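For strings $s_m$ and $s_n$ of lengths $m$ and $n$, treated as bags of characters:
$d_{bag}(s_m, s_n) = \max\{|s_m - s_n|, |s_n - s_m|\}$
For example, for the bags {a,a,a,b} and {a,a,b,c,c}, the difference {a,a,a,b} - {a,a,b,c,c} is {a}. Complexity terms: $O(mn)$ and $O(m + n)$.
For two objects $x, y$:
$NCD(x, y) = \frac{C(xy) - \min\{C(x), C(y)\}}{\max\{C(x), C(y)\}}$
where $C(x)$ is the compressed size of $x$, $C(xy)$ is the compressed size of the concatenation of $x$ and $y$, and $C$ is the compressor. Properties of the compressor $C$, with $\lambda$ the empty string:
$C(xx) = C(x)$, $C(\lambda) = 0$
$C(xy) \ge C(x)$
$C(xy) = C(yx)$
$C(xy) + C(z) \le C(xz) + C(yz)$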
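Both measures are short to compute. The sketch below is illustrative only: the function names are hypothetical, and `zlib` is an assumed stand-in for the compressor $C$, which only approximately satisfies the compressor properties.

```python
import zlib
from collections import Counter


def bag_distance(s: str, t: str) -> int:
    """max(|s - t|, |t - s|) over multisets (bags) of characters."""
    bag_s, bag_t = Counter(s), Counter(t)
    # Counter subtraction keeps only positive counts, i.e. the bag difference.
    return max(sum((bag_s - bag_t).values()), sum((bag_t - bag_s).values()))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, with zlib as the assumed compressor C."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


print(bag_distance("aaab", "aabcc"))  # 2: {a} has one element, {c,c} has two
```

Building the two bags takes time linear in m + n, which is why the bag distance is a cheap approximation when the quadratic edit distance is too costly.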
[Table: resemblance coefficients between sets $A$ and $B$, labeled R1 to R15, S17 and S18, with cardinality-based variants such as Rc3, Rc5, Rc8, Rc14, Sc17 and Sc18. The coefficients are ratios built from $|A \cap B|$, $|A \cup B|$, $|A \setminus B|$, $|B \setminus A|$, $|A|$, $|B|$, $\min(\cdot)$, $\max(\cdot)$ and $n$, for example $\frac{|A \cap B|}{|A \cup B|}$ and $\frac{2|A \cap B|}{|A| + |B|}$.]
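Given two texts $a$ and $b$, with tokens $a_i$ and $b_j$, and an internal similarity measure $sim$:
$sim_{MongeElkan}(a, b) = \frac{1}{|a|} \sum_{i=1}^{|a|} \max_{j=1}^{|b|} sim(a_i, b_j)$
Complexity terms: $O(|a| \cdot |b|)$ and $O(\min(|a|, |b|)^3)$.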
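Generalization with exponent $m$:
$sim_{MongeElkan_m}(a, b) = \left( \frac{1}{|a|} \sum_{i=1}^{|a|} \left( \max_{j=1}^{|b|} sim(a_i, b_j) \right)^m \right)^{\frac{1}{m}}$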
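A minimal sketch of the generalized Monge-Elkan measure follows. It assumes token lists as input and takes the internal similarity `sim` as a parameter; the exact-match similarity used in the example is a hypothetical placeholder, not the measure evaluated in the thesis.

```python
from typing import Callable, Sequence


def monge_elkan(a: Sequence[str], b: Sequence[str],
                sim: Callable[[str, str], float], m: float = 1.0) -> float:
    """Generalized mean (exponent m, m != 0) of the best match of each token of a in b."""
    if not a or not b:
        return 0.0  # assumed convention for empty inputs
    best = [max(sim(x, y) for y in b) for x in a]
    return (sum(v ** m for v in best) / len(a)) ** (1.0 / m)


def exact(x: str, y: str) -> float:
    """Placeholder internal similarity: 1 for identical tokens, 0 otherwise."""
    return 1.0 if x == y else 0.0


print(monge_elkan(["hard", "disk"], ["hard", "drive"], exact, m=1))  # 0.5
print(monge_elkan(["hard", "disk"], ["hard", "drive"], exact, m=2))  # sqrt(0.5)
```

With m = 1 this is the arithmetic mean of the original measure; larger values of m give more weight to the best-matching tokens. The measure is not symmetric, since only the tokens of `a` are averaged.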
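$Jaccard(A, B) = \frac{|A \cap B|}{|A \cup B|}$
where $|\cdot|$ is the set cardinality. Since $|A \cap B| = |A| + |B| - |A \cup B|$:
$Jaccard(A, B) = \frac{|A| + |B| - |A \cup B|}{|A \cup B|}$
Examples: $2/3 = 0.6666$ and $0/5 = 0$.
Let $A = \{a_1, a_2, \ldots, a_n\}$ and $B = \{b_1, b_2, \ldots, b_m\}$. With a cardinality function $card(\cdot)$:
$Jaccard(A, B) = \frac{card(A) + card(B) - card(A \cup B)}{card(A \cup B)}$
The function $card(\cdot)$ satisfies:
$card(a_1, a_2, \ldots, a_n) = 1$, if $(a_1 = a_2 = \ldots = a_n)$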
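The point of the rewritten form is that the Jaccard coefficient needs only a cardinality function, never an explicit intersection. The sketch below illustrates this with ordinary (crisp) set cardinality via `len`; the example sets are made up to reproduce the values 2/3 and 0/5, and the convention for two empty sets is an assumption.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient computed only from cardinalities of A, B and their union."""
    card_union = len(a | b)
    if card_union == 0:
        return 0.0  # assumed convention when both sets are empty
    # |A intersect B| = |A| + |B| - |A union B|
    card_intersection = len(a) + len(b) - card_union
    return card_intersection / card_union


print(jaccard({"hard", "disk", "drive"}, {"hard", "disk"}))  # 2/3
print(jaccard({"a", "b"}, {"c", "d", "e"}))                   # 0/5 = 0.0
```

Replacing `len` with a different cardinality function changes the behavior of the coefficient without changing this formula.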
[Table: internal similarity measures $sim(a, b)$: $1 - \frac{editDistance(A, B)}{\max(|A|, |B|)}$ and $\frac{\#commonBigrams(A, B)}{\max(|A|, |B|)}$; exponent values $m = 0.00001$, $m = 0.5$, $m = 1.5$, $m = 2$, $m = 5$ and $m = 10$; resemblance coefficients written with $card(\cdot)$ and $card_A(\cdot)$, including $\frac{card(A \cap B)}{card(A \cup B)}$ and $\frac{2\,card(A \cap B)}{card(A) + card(B)}$.]
$sim_{max}(a, b) = \max_{i=1}^{|a|} \max_{j=1}^{|b|} sim(a_i, b_j)$
[Figure: recall, precision and F-measure versus similarity threshold; both axes range from 0 to 1.]
[Figure: two precision versus recall plots; both axes range from 0 to 1.]
$IC(c)$: information content of concept $c$.
$SIM_{Leacock\&Chodorow}(a, b) = -\log \frac{pathLength(a, b)}{2D}$
$depth(x)$: depth of node $x$.
$SIM_{Wu\&Palmer}(a, b) = \frac{2\,depth(lcs(a, b))}{pathLength(a, b) + 2\,depth(lcs(a, b))}$
[Figure: fragment of the laptop ontology. Nodes shown: Laptop, Processor, ProductID, Brand, Family, Model, Speed, Frequency, FrequencyMagnitude, FrequencyUnits, MHz, GHz, DecimalReal, Cache, FSB, BusWidth, Memory, HardDisk.]
$SIM_{pathLength}(a, b) = depth(a) + depth(b) - 2\,depth(lcs(a, b))$
$LenFactor(a, b) = \frac{pathLength(a, b)}{2D}$
$Spec(x) = \frac{depth(x)}{clusterDepth(x)}$
$SpecFactor(a, b) = |Spec(a) - Spec(b)|$
$SIM_{Altintas}(a, b) = \frac{1}{1 + LenFactor(a, b) + SpecFactor(a, b)}$
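The path-based measures can be sketched on a small taxonomy stored as a child-to-parent map. Everything concrete here is an assumption made for illustration: the toy tree (loosely named after nodes in the figures), the convention that the root has depth 1, and the cluster depths, which the caller must supply because the definition of clusterDepth did not survive in this transcript.

```python
# Toy taxonomy (hypothetical): child -> parent, root has parent None.
PARENT = {
    "Laptop": None,
    "Processor": "Laptop", "Memory": "Laptop",
    "Brand": "Processor", "Speed": "Processor",
    "FrequencyUnits": "Speed",
    "GHz": "FrequencyUnits",
}


def ancestors(node):
    """Chain from node up to the root, node included."""
    chain = [node]
    while PARENT[node] is not None:
        node = PARENT[node]
        chain.append(node)
    return chain


def depth(node):
    return len(ancestors(node))  # assumed convention: depth(root) = 1


def lcs(a, b):
    """Lowest common subsumer: first ancestor of b that is also an ancestor of a."""
    seen = set(ancestors(a))
    return next(n for n in ancestors(b) if n in seen)


def path_length(a, b):
    return depth(a) + depth(b) - 2 * depth(lcs(a, b))


def wu_palmer(a, b):
    d = 2 * depth(lcs(a, b))
    return d / (path_length(a, b) + d)


def altintas(a, b, max_depth, cluster_depth):
    """cluster_depth: mapping node -> clusterDepth(node), supplied by the caller."""
    len_factor = path_length(a, b) / (2 * max_depth)
    spec_factor = abs(depth(a) / cluster_depth[a] - depth(b) / cluster_depth[b])
    return 1 / (1 + len_factor + spec_factor)


D = max(depth(n) for n in PARENT)                  # 5 in this toy tree
print(path_length("Brand", "GHz"))                 # 4 edges via Processor
print(wu_palmer("Brand", "GHz"))                   # 0.5
print(altintas("Brand", "GHz", D, {n: D for n in PARENT}))
```

In the last call every node is given the same cluster depth, which treats the whole tree as a single cluster.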
[Figure: the same ontology fragment with edge weights from $w = 1$ to $w = 4$. WeightedPathLength(Brand, GHz) = 3 + 4 + 4 + 3 + 2 + 1 = 17.]
[Figure: words $w_1$ to $w_5$ and their senses.]
[Figure: a term, its tokens $t_1$ to $t_5$ and their senses.]
$distance = 1 - normalizedSimilarity$
[Figure: candidate semantic paths for the tokens "512", "MB" and "Cache". The paths start at Laptop and end at the terminal concepts Integer, Megabyte or REF_Cache, passing through combinations of HardDisk, VideoAdapter, Memory, Processor, Cache, CacheSize, MemorySize, MemoryMagnitude, MemoryUnits, Installed and MaxInstallable.]
[Figure: graph linking the semantic paths (#1 to #n) of consecutive tokens, with edges weighted by a semantic relatedness metric.]
[Figure: example with tokens #1 to #8, comparing the target labeling (A) with the selected labeling (B) and marking the resulting true positives, false positives, false negatives and true negatives.]
The following measures take values in $[0, 1]$:
$precision = \frac{TP}{TP + FP}$
$recall = \frac{TP}{TP + FN}$
$F\text{-}measure = \frac{2 \cdot precision \cdot recall}{precision + recall}$
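As a small worked example of these three formulas, the sketch below computes them from a set of target items and a set of selected items. The function name, the use of plain Python sets and the zero-denominator convention are assumptions for illustration.

```python
def precision_recall_f(target: set, selected: set):
    """Return (precision, recall, F-measure) for a selected set against a target set."""
    tp = len(target & selected)   # true positives: selected and correct
    fp = len(selected - target)   # false positives: selected but not in the target
    fn = len(target - selected)   # false negatives: in the target but not selected
    precision = tp / (tp + fp) if tp + fp else 0.0  # assumed 0 when nothing is selected
    recall = tp / (tp + fn) if tp + fn else 0.0     # assumed 0 when the target is empty
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# TP = 2, FP = 1, FN = 2, so precision = 2/3, recall = 1/2, F-measure = 4/7
print(precision_recall_f({1, 2, 3, 4}, {3, 4, 5}))
```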
[Figure: two plots. First: Recall, Precision and F-measure versus threshold, together with Recall-Baseline, Precision-Baseline and F-measure-Baseline. Second: Precision versus Recall. All axes range from 0 to 1.]
[Figure: two plots of F1-score (0 to 1) versus noisy lexicon level (0, 25, 50, 75, 100) for ExactMatch-SimpleStr, ExactMatch-MongeElkan, EditDistance-SimpleStr and EditDistance-MongeElkan.]
[Figure: grid of F-measure plots (x-axis from 0.5 to 1; y-axis from 0 to 0.9) for EditDistance-MongeElkan, 2grams(Dice)-MongeElkan0 and EditDistance-cosine(A), over the noise-free lexicon and the noisy lexicons at levels 25, 50, 75 and 100.]
[Figure: number of graph nodes (0 to 250,000) versus threshold (0 to 1).]
$Score = \frac{\#\text{ of words in the match}}{\#\text{ of characters in the acronym}}$
$\frac{\text{bits (acronym model)}}{\text{bits (text compression model)}}$
= 0.2
[Figure: alignment of a pattern with a document fragment containing "Hard Disk: 120 GB", "Serial" and "5400 rpm".]
[Figure: F-Measure (0 to 1) versus threshold (0 to 1) for three configurations: Baseline (ExactMatch-SimpleStr.), Best Configuration, and Best Conf. + Acronym Matcher.]
[Figures: ontology graphs whose edges are labeled has-part and is-a.]