26/10/2008 SWESE'08
Enhanced Semantic Access to Software Artefacts
Danica Damljanović and Kalina Bontcheva
University of Sheffield NLP
Outline
Motivation
The GATE case study
Semantic-based prototype
Data collection
Automatic content augmentation
Storing implicit annotations
Querying using text-based queries
Example
Conclusion and Future work
Motivation
Large software frameworks:
hard to maintain: never enough documentation
hard to find specific information
significant learning curve for new developers working on software extensions and for software engineers who integrate relevant parts into their applications
Can semantic technologies help?
[Figure: software documentation scattered across forum posts, websites, source code and papers]
The GATE case study
GATE (gate.ac.uk):
open-source, General Architecture for Text Engineering
development team of over 15 people at present, and over 30 over the years
documentation about GATE software:
dispersed on the Web: not easy to find by new/existing developers/users
no unified interface: Google, gate.ac.uk, gmane mailing list search, etc.
The GATE case study: requirements
Automatic generation of reference pages from the ontology:
provide users with a single point of access to all knowledge, continuously kept up to date.
automatically generate a web page: shown on its own or alongside the ontology tree, where the searched concept is selected
Semantic-based prototype
[Diagram: software documentation >> learn domain ontology >> annotate content >> store in semantic repository >> text-based query]
Data collection
Downloaded around 10000 software artefacts about GATE:
source code, source documentation, GATE manual, forum posts, publications.
Annotate content
Export annotations
Merge document metadata and annotations into the OWL file using an information-extraction ontology:
PROTON KM (http://proton.semanticweb.org/2005/04/protonkm)
Information-extraction ontology
Document class:
resourceType property: refers to the type of the document
informationResourceIdentifier property: refers to the URL of the annotated document.
Mention class:
occursIn Document
hasStartOffset and hasEndOffset: storing position of the annotation
(new) refersAnything: preserves the URI of the resource to which the mention refers
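The Document and Mention classes above can be pictured as a handful of triples per annotation. The following Python sketch is only an illustration under stated assumptions: the namespace is derived from the PROTON KM URL quoted earlier with an assumed '#' separator, the property local names are copied from the slide, and the example.org identifiers are hypothetical. It does not reproduce the actual GATE export code.

```python
from dataclasses import dataclass

# Assumption: namespace built from the PROTON KM URL on the slide plus '#'.
PKM = "http://proton.semanticweb.org/2005/04/protonkm#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Document:
    uri: str            # hypothetical identifier of the document instance
    resource_type: str  # type of the document, e.g. "forum post"
    url: str            # URL of the annotated document


@dataclass(frozen=True)
class Mention:
    uri: str            # hypothetical identifier of the mention instance
    document: Document  # document in which the mention occurs
    start: int          # start offset of the annotation
    end: int            # end offset of the annotation
    refers_to: str      # URI of the ontology resource being mentioned


def document_triples(doc):
    """Return (subject, predicate, object) triples describing a document."""
    return [
        (doc.uri, RDF_TYPE, PKM + "Document"),
        (doc.uri, PKM + "resourceType", doc.resource_type),
        (doc.uri, PKM + "informationResourceIdentifier", doc.url),
    ]


def mention_triples(m):
    """Return triples describing one mention and its position."""
    if not 0 <= m.start < m.end:
        raise ValueError("offsets must satisfy 0 <= start < end")
    return [
        (m.uri, RDF_TYPE, PKM + "Mention"),
        (m.uri, PKM + "occursIn", m.document.uri),
        (m.uri, PKM + "hasStartOffset", m.start),
        (m.uri, PKM + "hasEndOffset", m.end),
        (m.uri, PKM + "refersAnything", m.refers_to),
    ]


if __name__ == "__main__":
    doc = Document("http://example.org/doc/1", "forum post",
                   "http://example.org/posts/1")
    mention = Mention("http://example.org/mention/1", doc, 10, 15,
                      "http://gate.ac.uk/ns/gate-ontology#annic")
    for triple in document_triples(doc) + mention_triples(mention):
        print(triple)
```

Keeping the referenced resource URI on the mention itself is what would let a later query move from an ontology concept to every document position where it is mentioned.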
Export annotations
Access knowledge using text-based queries
QuestIO (Question-based interface to ontologies):
keyword-based queries
full-blown questions
QuestIO: Text-based query >> SeRQL
select c0, "[inverseProperty]", p1, c2, "[inverseProperty]", p3, c4, "[inverseProperty]", p5, i6
from {c0} rdf:type {<http://gate.ac.uk/ns/gate-ontology#JavaClass>},
     {c2} p1 {c0},
     {c2} rdf:type {<http://gate.ac.uk/ns/gate-ontology#ResourceParameter>},
     {c4} p3 {c2},
     {c4} rdf:type {<http://gate.ac.uk/ns/gate-ontology#ProcessingResource>},
     {i6} p5 {c4},
     {i6} rdf:type {<http://gate.ac.uk/ns/gate-ontology#GATEPlugin>}
where p1 = <http://gate.ac.uk/ns/gate-ontology#parameterHasType> and
      p3 = <http://gate.ac.uk/ns/gate-ontology#hasRunTimeParameter> and
      p5 = <http://gate.ac.uk/ns/gate-ontology#containsResource> and
      i6 = <http://gate.ac.uk/ns/gate-ontology#annic>
“Java Class for parameters for processing resources in ANNIC?”
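To make the shape of the generated query easier to follow, here is a minimal Python sketch that assembles a SeRQL string of the same form from a chain of classes, the properties linking them, and a final instance. The function name build_serql and its parameters are hypothetical, and this is not how QuestIO itself is implemented; it only mirrors the variable naming (c0, p1, c2, ...) and the path pattern visible in the query above, writing full URIs in angle brackets as SeRQL expects.

```python
GATE = "http://gate.ac.uk/ns/gate-ontology#"
INVERSE = '"[inverseProperty]"'


def build_serql(classes, properties, instance):
    """Build a SeRQL query string for a chain of classes.

    classes    -- class local names, first the class being asked for
    properties -- property local names; properties[i] links classes[i + 1]
                  (as subject) to classes[i] (as object)
    instance   -- local name of the instance bound to the last node
    """
    if not classes or len(properties) != len(classes) - 1:
        raise ValueError("need one property between each pair of classes")
    n = len(classes)
    # Node variables take even indices, property variables odd indices;
    # the last node is named i<k> because it is bound to an instance.
    nodes = [f"c{2 * i}" for i in range(n - 1)] + [f"i{2 * (n - 1)}"]
    props = [f"p{2 * i + 1}" for i in range(n - 1)]

    select = [nodes[0]]
    paths = [f"{{{nodes[0]}}} rdf:type {{<{GATE}{classes[0]}>}}"]
    conditions = []
    for i in range(1, n):
        select += [INVERSE, props[i - 1], nodes[i]]
        paths.append(f"{{{nodes[i]}}} {props[i - 1]} {{{nodes[i - 1]}}}")
        paths.append(f"{{{nodes[i]}}} rdf:type {{<{GATE}{classes[i]}>}}")
        conditions.append(f"{props[i - 1]}=<{GATE}{properties[i - 1]}>")
    conditions.append(f"{nodes[-1]}=<{GATE}{instance}>")

    return (
        "select " + ", ".join(select)
        + "\nfrom " + ",\n     ".join(paths)
        + "\nwhere " + " and\n      ".join(conditions)
    )


if __name__ == "__main__":
    print(build_serql(
        ["JavaClass", "ResourceParameter", "ProcessingResource", "GATEPlugin"],
        ["parameterHasType", "hasRunTimeParameter", "containsResource"],
        "annic",
    ))
```

Called with the classes, properties and instance from the ANNIC question, the sketch yields the same select list, path expressions and constraints as the query shown above.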
An example
Demo
http://gate.ac.uk/document-search
Future Work
optimise query execution time: migrate from SeRQL to SPARQL
include simple ontology-driven data in the interface
evaluation to follow: user-centric evaluation with GATE users
Thank you!
Questions?