FRAMEWORK FOR INTELLIGENT VIRTUAL ORGANIZATIONS (FIVO)
Natural Language Based Processing of Multilingual Contracts
for Virtual Organizations Constitution
Mikołaj Pastuszko, Bartosz Kryza, Renata Słota, Jacek Kitowski
Institute of Computer Science, AGH University of Science and Technology, Kraków, Poland
Agenda
Background of the problem
Goals and requirements of NLPN system
Architecture of NLPN system
Main processing flow in NLPN system
Technologies and tools used in NLPN system
Example of contract text analysis in NLPN system
Future development proposals for NLPN system
Problem introduction
Assumption
● Organizations own resources that are expected to be shared within a Virtual Organization
● Conditions of cooperation are written down in the form of a contract document
Problem
● Contracts are written in natural language (e.g. Polish)
● Automation of Virtual Organization management (FiVO) requires a formal, semantic form of the contract (ontology in OWL format)
Solution
● NLP-based Negotiations (NLPN) system: translating natural-language contracts to ontologies in OWL format
Concept of NLPN system
Goals and requirements
Support for multiple languages
● English and Polish as a starting point
● Easily extendable with support for other languages
Output ontology in OWL format (FiVO requirement)
● Ontology structure easily adjustable
Minimization of human (supervisor) assistance
Flexible mapping between text phrases and ontology entities
● Human-readable and easily editable Contract Dictionary
Modularity
● Easy orchestration for various applications
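The mapping between text phrases and ontology entities can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the dictionary layout, the entity names (`NumberOfSeats`, `AverageVelocity`) and the `find_entities` helper are hypothetical and are not taken from the NLPN Contract Dictionary itself.

```python
# Hypothetical Contract Dictionary: per-language phrases mapped to ontology
# entity names. All names below are illustrative, not taken from NLPN.
CONTRACT_DICTIONARY = {
    "en": {
        "number of seats": "NumberOfSeats",
        "average velocity": "AverageVelocity",
    },
    "pl": {
        "ilość miejsc siedzących": "NumberOfSeats",
        "prędkość średnią": "AverageVelocity",
    },
}


def find_entities(text, language):
    """Return (phrase, entity) pairs for dictionary phrases found in text."""
    lowered = text.lower()
    matches = []
    for phrase, entity in CONTRACT_DICTIONARY[language].items():
        position = lowered.find(phrase)
        if position != -1:
            matches.append((position, phrase, entity))
    # Report matches in the order in which they appear in the text.
    return [(phrase, entity) for _, phrase, entity in sorted(matches)]
```

Because both languages point at the same entity names, adding a language in this sketch only means adding one more phrase table; the ontology side stays untouched.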
Data flow in NLPN system
Modular architecture of NLPN system
Contract text analysis
1. Tokenization
2. Sentence Splitting
3. Morphological Analysis and POS Tagging
4. Named Entities Recognition
   ● Gazetteer
5. Contract Statements Recognition
   ● Transducer + grammars
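Steps 1, 2 and 4 of the list above can be approximated in a few lines of plain Python. This is only a sketch under simplifying assumptions: it uses regular expressions and a hand-written lookup table instead of GATE components, it skips morphological analysis and POS tagging, and the `GAZETTEER` entries and entity type names are hypothetical.

```python
import re

# Hypothetical gazetteer: known names mapped to entity types.
GAZETTEER = {
    "Costa Rica Airlines": "Organization",
    "Mercedes-Benz H6": "Resource",
    "John Smith": "Person",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Split into word tokens (keeping hyphens and slashes) and punctuation."""
    return re.findall(r"\w+(?:[-/]\w+)*|[^\w\s]", sentence)


def recognize_entities(sentence):
    """Gazetteer lookup: return (name, type) for every known name present."""
    return [(name, kind) for name, kind in GAZETTEER.items() if name in sentence]


text = (
    "Costa Rica Airlines should provide number of seats of "
    "Mercedes-Benz H6 equal to 54. "
    "A notification should be sent to John Smith."
)
for sentence in split_sentences(text):
    print(tokenize(sentence))
    print(recognize_entities(sentence))
```

A production splitter would also need to handle abbreviations and inflected forms (for Polish in particular), which is one reason to rely on dedicated tools rather than regular expressions.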
Technologies and tools
NLP tools
GATE (General Architecture for Text Engineering)
● Tokenizer
● Gazetteer
● OntoGazetteer
● JAPE Transducer
● JAPE (Java Annotation Patterns Engine) grammars
LanguageTool
● Sentence Splitter
● Part-of-Speech Tagger
● Disambiguator (tagger part)
● Supports 20 languages including Polish (Morfologik library)
ANNIE (A Nearly-New Information Extraction System)
Technologies and tools
Ontologies
● Jena Semantic Web Framework library
  ● Supports read and write in RDF/XML, N3 and N-Triples formats
  ● Provides API for OWL and RDF
Configuration files
● YAML format
● SnakeYAML library
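To give a feel for one of the listed serialization formats, the sketch below writes a pair of statements as N-Triples by hand. It does not use Jena; the `http://example.org/contract#` namespace, the individual `MercedesBenzH6` and the property `numberOfSeats` are hypothetical placeholders, since the actual Contract Ontology IRIs are not given here.

```python
# Hypothetical namespace; the real Contract Ontology IRIs are not given here.
NS = "http://example.org/contract#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def iri(term):
    """Wrap an IRI string in the angle brackets N-Triples requires."""
    return f"<{term}>"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines.

    Objects that are ints become xsd:integer literals; everything else
    is treated as an IRI.
    """
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, int):
            obj_text = f'"{obj}"^^{iri(XSD_INTEGER)}'
        else:
            obj_text = iri(obj)
        lines.append(f"{iri(subject)} {iri(predicate)} {obj_text} .")
    return "\n".join(lines)


triples = [
    (NS + "MercedesBenzH6", RDF_TYPE, NS + "Resource"),
    (NS + "MercedesBenzH6", NS + "numberOfSeats", 54),
]
print(to_ntriples(triples))
```

In practice a library such as Jena handles escaping, blank nodes and the other formats; the point here is only the one-triple-per-line shape of the output.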
Example: Contract text analysis
Stwierdzenia QoS
Costa Rica Airlines będzie świadczyć ilość miejsc siedzących dla Mercedes-Benz H6 wynoszącą dokładnie 54 i przewidywaną prędkość średnią ponad 60 km/h.
Stwierdzenia bezpieczeństwa
Tour Manager i Klient powinni być uprawnieni do rezerwowania miejsc poprzez Usługę Costa Rica.
Klauzule kar umownych
W przypadku niedotrzymania warunków świadczenia Acela D45 trainset powinno zostać wysłane powiadomienie do Johna Smitha.
QoS Statements
Costa Rica Airlines should provide number of seats of Mercedes-Benz H6 equal to 54 and expected average velocity greater than 60 km/h.
Security Statements
Tour Manager and Client should be able to book seats on Costa Rica Service.
Penalty Clauses
In case of violation of Acela D45 trainset sharing conditions a notification should be sent to John Smith.
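The English QoS statement above has a regular shape that a pattern-based recognizer can exploit. The following sketch stands in for what a transducer grammar might express; it is not JAPE, and the assumed sentence form ("<provider> should provide <constraint> and <constraint>"), the operator list and the `parse_qos_statement` helper are all illustrative assumptions.

```python
import re

# One constraint: "<subject> <operator> <number> [<unit>]".
CONSTRAINT = re.compile(
    r"^(?P<subject>.+?) (?P<operator>equal to|greater than|less than) "
    r"(?P<value>\d+)(?: (?P<unit>\S+))?$"
)


def parse_qos_statement(sentence):
    """Extract the provider and its constraints from a QoS sentence."""
    provider, separator, rest = sentence.rstrip(".").partition(" should provide ")
    if not separator:
        return None  # Not a QoS statement of the assumed shape.
    constraints = []
    for part in rest.split(" and "):
        match = CONSTRAINT.match(part)
        if match:
            constraints.append(match.groupdict())
    return {"provider": provider, "constraints": constraints}


statement = (
    "Costa Rica Airlines should provide number of seats of Mercedes-Benz H6 "
    "equal to 54 and expected average velocity greater than 60 km/h."
)
result = parse_qos_statement(statement)
print(result["provider"])
for constraint in result["constraints"]:
    print(constraint)
```

Splitting on " and " is deliberately crude: it would break on names that contain the word, which is where gazetteer annotations from the earlier steps would help.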
Tokenization
Sentence Splitting
Morphological Analysis and POS Tagging
Named Entities Recognition
Contract Statements Recognition
Summary
NLPN system:
● Translates natural-language contracts to the formal, semantic form of ontologies
● Supports English and Polish
  ● Easily extendable with other languages
● Is modular
  ● Easy to use in various applications
● Is highly configurable
  ● Contract Dictionary (including its structure)
  ● Contract Ontology structure
  ● Contract Statements forms
  ● Configuration files for all components
● Has broad prospects for future development
Future development
Distributed Negotiations Environment
Negotiations Console
More statement forms
Statistical approach algorithms
Noise correction (typos etc.)