TEMPLATE-DRIVEN KNOWLEDGE MINING.
KNOWLEDGE PROSPECTOR.NET
Project team (Knowledge.Net): Anton V. Novikov, Maxim V. Sigalin, Alexey L. Smolyakov, Dmitry G. Cherepanov
Saint-Petersburg State University
Speaker: Alexey L. Smolyakov
Scientific adviser: Prof. Vladimir V. Safonov
Project goals
Flexible framework
Supporting different languages
Integration with Knowledge.Net
Algorithm
Getting documents and first-step text analysis
Morphological analysis of text blocks
Semantic analysis of entity sets using templates
Optimizing the resulting graph
Saving results
Getting documents and first-step text analysis
Getting documents from providers
Dividing the document into fragments (plain text, list, table, etc.)
Dividing text into blocks
…
Text: «Text format is a very flexible way to describe various types of information…»
List: «1) One 2) Two 3) Three»
Table:
Country. Capital.
England. London.
Ukraine. Kiev.
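The first-step analysis can be pictured as a small classifier over raw fragments followed by a block splitter. The sketch below is only an illustration: the names `classify_fragment` and `split_into_blocks` and the heuristics (numbered lines mean a list, rows with the same number of period-terminated cells mean a table, sentences serve as blocks) are assumptions, not the project's actual rules.

```python
import re

def classify_fragment(fragment: str) -> str:
    """Guess whether a fragment is a list, a table, or plain text (hypothetical heuristics)."""
    lines = [ln.strip() for ln in fragment.strip().splitlines() if ln.strip()]
    # A list: every line starts with a marker such as "1)" or "2.".
    if lines and all(re.match(r"^\d+[).]\s+\S", ln) for ln in lines):
        return "list"
    # A table: at least two rows, each made of the same number (>= 2) of "Cell." cells.
    cells = [re.findall(r"[^.]+\.", ln) for ln in lines]
    if (len(lines) >= 2
            and all(len(c) >= 2 and "".join(c) == ln for c, ln in zip(cells, lines))
            and len({len(c) for c in cells}) == 1):
        return "table"
    return "text"

def split_into_blocks(text: str) -> list[str]:
    """Split plain text into sentence-like blocks at '.', '!' or '?' followed by whitespace."""
    return [b.strip() for b in re.split(r"(?<=[.!?])\s+", text) if b.strip()]

print(classify_fragment("1) One\n2) Two\n3) Three"))                      # list
print(classify_fragment("Country. Capital.\nEngland. London.\nUkraine. Kiev."))  # table
```

These heuristics are deliberately crude; a multi-line paragraph whose lines happen to hold the same number of sentences would be misread as a table.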
Morphological analysis of text blocks
Language recognition
Morphological form recognition using dictionaries
Creating entities
Word(«Documents»)
«Documents»: current morphological form (noun, plural)
«Document»: base morphological form (noun, singular)
Languages: Russian, English, …
Dictionaries: MRD, XML, …
Entity Class(«Document»)
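The step from a word to an entity can be sketched as a dictionary lookup that maps a surface form to its base form and then names the entity after that base form. Everything here is hypothetical: `MORPH_DICT` is a two-entry stand-in for a real morphological dictionary, and the rule that nouns become classes is assumed from the example on the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Word:
    surface: str         # form as it appears in the text
    base: str            # base (dictionary) form
    part_of_speech: str
    number: str

# Toy stand-in for a morphological dictionary (hypothetical entries).
MORPH_DICT = {
    "documents": ("document", "noun", "plural"),
    "document": ("document", "noun", "singular"),
}

def analyze_word(surface: str):
    """Return a Word for a known surface form, or None if the dictionary lacks it."""
    entry = MORPH_DICT.get(surface.lower())
    if entry is None:
        return None
    base, pos, number = entry
    return Word(surface, base, pos, number)

def make_entity(word: Word) -> str:
    """Name the entity after the base form; nouns become classes in this sketch."""
    kind = "Class" if word.part_of_speech == "noun" else "Unknown"
    return f"{kind}(«{word.base.capitalize()}»)"

word = analyze_word("Documents")
print(make_entity(word))  # Class(«Document»)
```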
Morphological analysis > Entity types > “Simple” entities
Entity “separator”. Example: «.,;:!?()[]{}…»
Entity “unknown”
Entity “changeable”. Example: «good»
Entity “relationship”. Example: «Planet Earth is LESS than Sun»
Morphological analysis > Entity types > “True” entities
Entity “class”. Example: «document»
Entity “property”. Example: «useful»
Entity “datatype”. Examples: Datetime, Integer
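The “simple” and “true” entity types listed on these two slides can be collected into one enumeration. This is a minimal sketch; the names `EntityType` and `is_true_entity` are hypothetical and do not come from the project.

```python
from enum import Enum

class EntityType(Enum):
    # "Simple" entities
    SEPARATOR = "separator"
    UNKNOWN = "unknown"
    CHANGEABLE = "changeable"
    RELATIONSHIP = "relationship"
    # "True" entities
    CLASS = "class"
    PROPERTY = "property"
    DATATYPE = "datatype"

TRUE_ENTITY_TYPES = frozenset(
    {EntityType.CLASS, EntityType.PROPERTY, EntityType.DATATYPE}
)

def is_true_entity(entity_type: EntityType) -> bool:
    """Report whether a type belongs to the "true" group."""
    return entity_type in TRUE_ENTITY_TYPES

print(is_true_entity(EntityType.CLASS))      # True
print(is_true_entity(EntityType.SEPARATOR))  # False
```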
Semantic analysis > Goals
Creating relationships between entities
Creating new entities
Adding true entities into the resulting graph
Figure: example graph with nodes Property(«comfortable»), Class(«house»), Class(«building») and Property(«brick»), connected by one «subclass» edge and two «property-class» edges.
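A resulting graph of this kind can be held as a set of entities plus a set of typed edges. The sketch below is one possible representation, not the project's data model: `Entity` and `KnowledgeGraph` are hypothetical names, and the pairing of each property with a class is assumed, because the slide text does not preserve which edge connects which nodes.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entity:
    kind: str   # "Class" or "Property"
    name: str

@dataclass
class KnowledgeGraph:
    entities: set = field(default_factory=set)
    # Each relationship is a (type, source, target) triple.
    relationships: set = field(default_factory=set)

    def relate(self, rel_type: str, source: Entity, target: Entity) -> None:
        """Add both endpoints and the typed edge between them."""
        self.entities.update((source, target))
        self.relationships.add((rel_type, source, target))

    def related(self, rel_type: str, target: Entity) -> set:
        """Return the sources linked to `target` by `rel_type`."""
        return {s for t, s, tgt in self.relationships
                if t == rel_type and tgt == target}

house, building = Entity("Class", "house"), Entity("Class", "building")
graph = KnowledgeGraph()
graph.relate("subclass", house, building)  # house is a subclass of building
# Assumed pairing: the slide does not show which property belongs to which class.
graph.relate("property-class", Entity("Property", "comfortable"), house)
graph.relate("property-class", Entity("Property", "brick"), building)
```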
Semantic analysis > Relationship types
Relationship between property and class
Relationship “subclass”
Relationship “subproperty”
Relationship “equality”
Relationship between two classes
Relationship “conditional rule”
Semantic analysis > Template description
Priority
Pattern
Handlers
<Template Priority="10000" Pattern="#E.P #E.C ,? а? значить #E.P">
  <Handler Name="PropertyRelationship" Arguments="0, 1" />
  <Handler Name="PropertyRelationship" Arguments="5, 1" />
  <Handler Name="ConditionalRule" Arguments="1, 0, 5" />
</Template>
(In the pattern, «а» and «значить» are literal Russian words, roughly “and” and “to mean”.)
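A template of this shape can be read with Python's standard `xml.etree.ElementTree` module. The sketch below makes several assumptions: the `<Templates>` wrapper element is hypothetical, handler arguments are taken to be integer positions in the pattern, and a higher `Priority` is assumed to mean "try first" (the slides do not state the ordering).

```python
import xml.etree.ElementTree as ET

# Hypothetical <Templates> wrapper around two templates from the slides.
TEMPLATES_XML = """
<Templates>
  <Template Priority="4" Pattern="[#E.P #M.Adjective]+ [#E.C #M.Noun]">
    <Handler Name="PropertyRelationship" Arguments="0, 1" />
  </Template>
  <Template Priority="10000" Pattern="#E.P #E.C ,? а? значить #E.P">
    <Handler Name="PropertyRelationship" Arguments="0, 1" />
    <Handler Name="PropertyRelationship" Arguments="5, 1" />
    <Handler Name="ConditionalRule" Arguments="1, 0, 5" />
  </Template>
</Templates>
"""

def load_templates(xml_text: str) -> list:
    """Parse templates into dicts with priority, pattern and handler list."""
    root = ET.fromstring(xml_text)
    templates = []
    for node in root.iter("Template"):
        handlers = [
            (h.get("Name"), [int(a) for a in h.get("Arguments").split(",")])
            for h in node.findall("Handler")
        ]
        templates.append({
            "priority": int(node.get("Priority")),
            "pattern": node.get("Pattern"),
            "handlers": handlers,
        })
    # Assumption: templates with a higher priority are tried first.
    return sorted(templates, key=lambda t: t["priority"], reverse=True)

templates = load_templates(TEMPLATES_XML)
print([t["priority"] for t in templates])  # [10000, 4]
```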
Semantic analysis > Pattern description
Logical operators: «&» (and), «|» (or), «^» (not)
Occurrence: not set (once), «+», «*», «?»
#E.P, #E.C, #E.S, #E.U, #E.Int, #E.DateTime
#M.Noun, #M.Adjective, #M.Verb, …
#W.Month, #W.Number, …: word holders
#H.Class, …: clause holders
Example: [#E.P #M.Adjective]+ [#E.C #M.Noun]
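To make the occurrence markers concrete, here is a minimal backtracking matcher for a subset of this pattern language. It is a sketch under stated assumptions, not the project's matcher: a bracketed group is assumed to mean that one token carries all the listed tags, tokens are modelled as plain sets of tags, and literal words, word and clause holders, and the logical operators «&», «|», «^» are not handled. The names `parse_pattern` and `match_at` are hypothetical.

```python
import re

# Either "[#A #B]q" or "#A q", where q is an optional occurrence marker.
ELEMENT = re.compile(r"\[([^\]]+)\]([+*?]?)|(#[\w.]+)([+*?]?)")

def parse_pattern(pattern: str) -> list:
    """Turn a pattern string into a list of (required_tags, quantifier) pairs."""
    elements = []
    for group, q1, single, q2 in ELEMENT.findall(pattern):
        tags = group.split() if group else [single]
        elements.append((frozenset(t.lstrip("#") for t in tags), q1 or q2))
    return elements

def match_at(elements, tokens, pos=0):
    """Return the end index of a match starting at `pos`, or None (greedy, backtracking)."""
    if not elements:
        return pos
    (tags, quant), rest = elements[0], elements[1:]
    low = 0 if quant in ("*", "?") else 1
    high = 1 if quant in ("", "?") else len(tokens) - pos
    # Count how many consecutive tokens satisfy this element (up to `high`).
    run = 0
    while run < high and pos + run < len(tokens) and tags <= tokens[pos + run]:
        run += 1
    # Try the longest run first, then give tokens back one at a time.
    for count in range(run, low - 1, -1):
        end = match_at(rest, tokens, pos + count)
        if end is not None:
            return end
    return None

# Each token is the set of tags assigned to it by the earlier analysis stages.
tokens = [{"E.P", "M.Adjective"}, {"E.P", "M.Adjective"}, {"E.C", "M.Noun"}]
pattern = parse_pattern("[#E.P #M.Adjective]+ [#E.C #M.Noun]")
print(match_at(pattern, tokens))  # 3
```

Two adjective properties followed by a noun class consume all three tokens, so the match ends at index 3.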
Semantic analysis > Pattern description
Words holder
<WordHolder Name="Month">
  <Item Word="JANUARY" Value="1" />
  <Item Word="FEBRUARY" Value="2" />
  <Item Word="MARCH" Value="3" />
  ...
</WordHolder>
Clauses holder
<ClauseHolder Name="Class">
  <Item Pattern="[#E.P #M.Adjective]* #E.C" Index="1" />
  <Item Pattern="[#E.P #M.Adjective] , [#E.P #M.Adjective] #E.C" Index="2" />
</ClauseHolder>
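A word holder such as the Month one on this slide is effectively a lookup table from a word to a value. The sketch below loads one with the standard `xml.etree.ElementTree` module; the function names are hypothetical, case-insensitive matching is an assumption, and only the three months shown on the slide are included because the rest are elided there.

```python
import xml.etree.ElementTree as ET

WORD_HOLDER_XML = """
<WordHolder Name="Month">
  <Item Word="JANUARY" Value="1" />
  <Item Word="FEBRUARY" Value="2" />
  <Item Word="MARCH" Value="3" />
</WordHolder>
"""

def load_word_holder(xml_text: str):
    """Return (holder name, {WORD: value}) parsed from a WordHolder element."""
    root = ET.fromstring(xml_text)
    items = {item.get("Word").upper(): item.get("Value")
             for item in root.iter("Item")}
    return root.get("Name"), items

def lookup(items: dict, word: str):
    """Case-insensitive lookup; returns None when the word is not in the holder."""
    return items.get(word.upper())

name, months = load_word_holder(WORD_HOLDER_XML)
print(name, lookup(months, "February"))  # Month 2
```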
Semantic analysis > Handlers
Replace
Create datetime entity
Create «property-class» relationship
Create «subclass» relationship
Create «subproperty» relationship
Create «conditional rule» relationship
Create «class-class» relationship
Semantic analysis > Creating relationships
Property(«useful») Class(«document»)
+
<Template Priority="4" Pattern="[#E.P #M.Adjective]+ [#E.C #M.Noun]">
  <Handler Name="PropertyRelationship" Arguments="0, 1" />
</Template>
=
Property(«useful») and Class(«document») linked by a «property-class» relationship
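The effect of the `PropertyRelationship` handler can be sketched as a function dispatched by name. The reading of `Arguments="0, 1"` as positions of pattern elements (property at element 0, class at element 1) is an assumption, although it agrees with the `Arguments="5, 1"` handler in the earlier template, where element 5 is the second `#E.P`. The names `HANDLERS` and `apply_handler` are hypothetical.

```python
def property_relationship(groups, prop_index, class_index):
    """Link every property matched by one pattern element to the class matched by another."""
    (cls,) = groups[class_index]  # exactly one class is expected
    return [("property-class", prop, cls) for prop in groups[prop_index]]

HANDLERS = {"PropertyRelationship": property_relationship}

def apply_handler(name, arguments, groups):
    """Dispatch a handler by name; `arguments` is the comma-separated attribute string."""
    indices = [int(a) for a in arguments.split(",")]
    return HANDLERS[name](groups, *indices)

# Entities matched by "[#E.P #M.Adjective]+" and "[#E.C #M.Noun]" respectively.
groups = [["Property(«useful»)"], ["Class(«document»)"]]
print(apply_handler("PropertyRelationship", "0, 1", groups))
# [('property-class', 'Property(«useful»)', 'Class(«document»)')]
```

Because the first pattern element carries «+», several properties in a row would each receive their own relationship to the same class.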
Semantic analysis > Creating new entities
Integer(«7») Class(«December») Integer(«2006») Class(«Year»)
+
<Template Priority="11000" Pattern="#E.INT #W.Month #E.INT year">
  <Handler Name="Replace" From="0" Count="4">
    <CreateEntityHandler Name="CreateDateTime" Arguments="day=0, month=1, year=2" />
  </Handler>
</Template>
=
Datetime (7.12.2006)
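The `Replace` handler with a nested `CreateDateTime` can be pictured as building one new entity from several matched tokens and splicing it in place of them. The sketch below is an assumed reading of the `From`, `Count` and `day=0, month=1, year=2` attributes; the function names are hypothetical, and the `DECEMBER` entry is added for the example because the slide's month holder elides it.

```python
from datetime import date

# Partial month holder; DECEMBER is an assumed entry not shown on the slides.
MONTHS = {"JANUARY": 1, "FEBRUARY": 2, "MARCH": 3, "DECEMBER": 12}

def create_datetime(tokens, day, month, year):
    """Build a date from the tokens at the given positions (day=0, month=1, year=2)."""
    return date(int(tokens[year]), MONTHS[tokens[month].upper()], int(tokens[day]))

def replace(tokens, start, count, new_entity):
    """Return a copy of `tokens` with `count` items from `start` replaced by one entity."""
    return tokens[:start] + [new_entity] + tokens[start + count:]

tokens = ["7", "December", "2006", "year"]
entity = create_datetime(tokens, day=0, month=1, year=2)
print(replace(tokens, 0, 4, entity))  # [datetime.date(2006, 12, 7)]
```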
Optimizing the resulting graph
Removing redundant «subclass» relationships
Removing redundant relationships between properties and classes
Figure: example graph with nodes Class(«bus»), Class(«transport»), Class(«vehicle») and Property(«fast»), connected by three «subclass» edges and two «property-class» edges.
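Both optimizations can be sketched with plain sets of edges. This is an illustration under assumptions rather than the project's algorithm: a «subclass» edge is treated as redundant when its target is still reachable without it (a transitive reduction), and a «property-class» edge is treated as redundant when an ancestor class already carries the same property. The direction of the edges in the example (bus under vehicle under transport, «fast» on vehicle and bus) is also assumed, since the slide text does not preserve it.

```python
def ancestors(cls, subclass_edges):
    """All classes reachable from `cls` by following (child, parent) edges upward."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for child, parent in subclass_edges:
            if child == current and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def reduce_subclass(subclass_edges):
    """Drop an edge (a, c) when c is still reachable from a without it."""
    kept = set(subclass_edges)
    for edge in sorted(subclass_edges):
        child, parent = edge
        if parent in ancestors(child, kept - {edge}):
            kept.discard(edge)
    return kept

def reduce_properties(property_edges, subclass_edges):
    """Drop (property, class) when an ancestor of the class already has the property."""
    return {
        (prop, cls) for prop, cls in property_edges
        if not any((prop, anc) in property_edges
                   for anc in ancestors(cls, subclass_edges))
    }

# Assumed edge directions for the slide's example.
subclass = {("bus", "vehicle"), ("vehicle", "transport"), ("bus", "transport")}
properties = {("fast", "vehicle"), ("fast", "bus")}

reduced = reduce_subclass(subclass)
print(sorted(reduced))                         # [('bus', 'vehicle'), ('vehicle', 'transport')]
print(reduce_properties(properties, reduced))  # {('fast', 'vehicle')}
```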
Saving results
Saving acquired knowledge into Knowledge.Net format
Saving acquired knowledge into OWL
Saving (and loading) knowledge to and from the project's own binary format files
Current project status
Developed a working prototype
Created test templates
Attached the «Mrd» dictionary (Russian and English)
Plans
Support for creating «compound» entities (composed of several words, e.g. «creation of human hands»)
Functionality extension (adding new entities, relationships, templates, handlers, …)
Program for generating templates
Developing good examples
Questions?
Contact information:
[email protected]
http://www.knowledge-net.ru
http://polyhimnie.math.spbu.ru