Language Resources and Linked Data
Integrating NLP with Linked Data: the NIF Format

@EKAW 2014, November 24-28, 2014, Linkoping, Sweden

Milan Dojchinovski
Web Intelligence Research Group, Faculty of Information Technology, Czech Technical University in Prague
@m1ci - http://dojchinovski.mk

Outline

1.  Introduction
    –  NIF Basics
    –  NIF Corpora
    –  NIF Tools and Services
2.  Hands-on: NIF in action
    –  How to annotate strings
    –  How to query generated NIF and existing corpora

Introduction – Bird’s View

LOD-aware NLP Services

•  Not only data, but also LOD-aware services using:
   –  Lexica and dictionaries (lemon model)
   –  Training data for NLP in RDF (NIF model)
   –  Service metadata descriptions in RDF
   –  Combination with real world facts (e.g. DBpedia or GeoNames)
•  Long term goal(s):
   –  Index of tools and data
   –  Easily produce ready-made, preconfigured NLP services and pipelines
   –  Freemium/pay-per-use business models

NLP2RDF Project

•  Maintained under http://nlp2rdf.org
•  Realize the long term goal(s)
•  Maintain and consolidate results from short-term projects
•  Bootstrap the eco-system

NLP Interchange Format

•  The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to achieve interoperability between NLP tools, language resources and annotations.

NIF in a Nutshell

•  Way to mint URIs for arbitrary strings and content documents on the Web
•  Logical formalisation of strings and annotations via an ontology
•  Quick and easy format
•  Builds on existing standards (RDF, LAF/GrAF, RFC 5147)
•  Reuse of RDF tools and implementations
•  Decrease development costs for integration

Motivation

•  Developer’s nightmare
   –  Many NLP tools fulfill similar functions but are not interoperable
   –  Heterogeneous output formats (JSON, XML)
   –  NLP Web services with heterogeneous API parameters
   –  Heterogeneous ways of annotating text
      •  HTML markup removed – offsets not usable
      •  Use of byte offset instead of char offset
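The byte-versus-character offset problem is easy to reproduce. The following is a minimal sketch, assuming UTF-8 encoded input and a made-up sample sentence (neither comes from the slides): as soon as the text contains a non-ASCII character, the two kinds of offsets drift apart.

```python
# Character offsets vs. byte offsets for the same substring.
# The sample text is hypothetical; \u00e9 is "e acute", \u00f6 is "o umlaut".
text = "Caf\u00e9 in Link\u00f6ping"
target = "Link\u00f6ping"

# Character offsets: positions counted in characters of the decoded string.
char_begin = text.index(target)
char_end = char_begin + len(target)

# Byte offsets: positions counted in bytes of the UTF-8 encoding.
data = text.encode("utf-8")
target_bytes = target.encode("utf-8")
byte_begin = data.index(target_bytes)
byte_end = byte_begin + len(target_bytes)

print("char offsets:", char_begin, char_end)
print("byte offsets:", byte_begin, byte_end)

# Applying byte offsets to the character string no longer recovers the annotation.
print(text[char_begin:char_end] == target)
print(text[byte_begin:byte_end] == target)
```

A tool that reports byte offsets and a tool that reports character offsets will therefore disagree on where the same annotation sits, even though both are "correct" in their own terms.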

Pre-NIF Spaghetti Architecture

[Figure: six NLP tools wired directly to one another, captioned “WTF! Spaghetti ?!!”]

•  Need for integration
   –  One-to-one integration
   –  Hard to maintain

NIF Architecture

[Figure: six NLP tools, each behind a NIF wrapper and accessed over HTTP/REST, connected through NIF. Other labels in the figure: Cross-Linking Background Knowledge, Query Federation.]

Interoperability layers:

•  Structural
•  Conceptual
•  Access

NIF Annotations

Example: Tripadvisor Corpus

•  Contains hotel reviews and review metadata
•  1760 semi-structured files
•  Every file’s content becomes a nif:Context resource
•  Strings addressed with unique URIs

Context

•  Address the content of the document
•  nif:isString contains document content
•  In NIF the document != content of the document
•  Two documents can have the same content, BUT must not have the same URI

Other Strings

•  Address arbitrary strings in the document
•  Use string offsets in relation to context to address
•  nif:anchorOf contains the string
•  Additional properties can be added
   –  label from the slide’s example: a tripadvisor:Rivew
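To make the context and string resources concrete, here is a minimal sketch that mints "#char=begin,end" URIs and prints Turtle for one context and one substring. The document URI and the helper names are hypothetical, chosen only for illustration; the nif: terms are the ones named on the slides plus nif:RFC5147String, nif:referenceContext, nif:beginIndex and nif:endIndex as used in the Brown Corpus example of this tutorial.

```python
NIF_PREFIXES = (
    "@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)


def char_uri(base, begin, end):
    """Mint a URI for the character span [begin, end) of a document."""
    return f"{base}#char={begin},{end}"


def literal(text):
    """Escape a Python string as a Turtle string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def index_lines(begin, end):
    """Turtle lines for the begin and end offsets of a span."""
    return (
        f'    nif:beginIndex "{begin}"^^xsd:nonNegativeInteger ;\n'
        f'    nif:endIndex "{end}"^^xsd:nonNegativeInteger'
    )


def context_turtle(base, text):
    """Describe the whole document content as a nif:Context resource."""
    return (
        f"<{char_uri(base, 0, len(text))}>\n"
        "    a nif:Context , nif:RFC5147String ;\n"
        f"    nif:isString {literal(text)} ;\n"
        f"{index_lines(0, len(text))} .\n"
    )


def string_turtle(base, text, begin, end):
    """Describe the substring text[begin:end] relative to its context."""
    return (
        f"<{char_uri(base, begin, end)}>\n"
        "    a nif:String , nif:RFC5147String ;\n"
        f"    nif:anchorOf {literal(text[begin:end])} ;\n"
        f"    nif:referenceContext <{char_uri(base, 0, len(text))}> ;\n"
        f"{index_lines(begin, end)} .\n"
    )


base = "http://example.org/review-1.txt"  # hypothetical document URI
text = "My dog has fleas."
print(NIF_PREFIXES)
print(context_turtle(base, text))
print(string_turtle(base, text, 3, 6))
```

Because the offsets are part of the URI, two different documents with identical content still get different URIs as long as their base URIs differ.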

Words and Phrases

•  Sentiment values, POS tags and other annotations can be added to the words and phrases

Offsets Counting

begin: 0   end: 2   anchor: “My”
begin: 3   end: 6   anchor: “dog”
begin: 7   end: 10  anchor: “has”
begin: 11  end: 16  anchor: “fleas”

                    1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7
|M|y| |d|o|g| |h|a|s| |f|l|e|a|s|.|
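The counting rule can be checked mechanically. This is a small sketch, assuming a simple regular-expression tokenizer (not one of the tutorial's wrappers): offsets count the gaps between characters, so begin is inclusive and end is exclusive, and "has" spans 7 to 10.

```python
import re


def token_offsets(text):
    """Return (begin, end, anchor) for every run of word characters.

    Offsets count the gaps between characters: begin is inclusive,
    end is exclusive, so text[begin:end] is exactly the anchor.
    """
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\w+", text)]


text = "My dog has fleas."
for begin, end, anchor in token_offsets(text):
    # The slice defined by the offsets must reproduce the anchor string.
    assert text[begin:end] == anchor
    print(f"begin: {begin}  end: {end}  anchor: {anchor!r}")
```

With this convention the length of an anchor is always end minus begin, and the whole 17-character sentence is the span 0 to 17.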

NIF Ontology

NIF Combinator Scheme

Demo: http://nlp2rdf.aksw.org/

NIF Corpora Overview

Name          Size (in triples)
Wikilinks     500M
News-100      13K
RSS-500       10K
Reuters-128   7K
Spotlight     3K
KORE50        2K
Brown         500K

•  Wikipedia abstracts corpus in progress
•  Corpora available at http://datahub.io/dataset?tags=nif&q=nif
   –  search for tag “nif” on datahub

Wikilinks Corpus

•  Large scale coreference resolution corpus by UMass/Google
•  Over 10M crawled websites that contain text (Named Entities) linked to Wikipedia
•  Converted to the NIF format and published as LOD
   –  more info here: http://wiki-link.nlp2rdf.org/
•  Additional processing done to extract relevant text snippets, add DBpedia ontology classes, and coarse-grained classes (entity types)
•  Over 500 million triples, 79GB LOD, 12GB gzipped dumps
•  Over 30 million links to over 3 million entities

Brown Corpus

•  Converted to the NIF format and published as Linked Data
   –  more info here: http://brown.nlp2rdf.org/
•  Corpus showcases handling of POS tags in NIF
•  POS tags mapped via OLiA to predefined categories

   <#char=643,647>
       a nif:String , nif:Word , nif:RFC5147String ;
       nif:anchorOf "Jury"^^xsd:string ;
       nif:referenceContext <#char=0,> ;
       nif:oliaLink brown:NN ;
       nif:sentence <#char=619,777> ;
       nif:beginIndex "643"^^xsd:nonNegativeInteger ;
       nif:endIndex "647"^^xsd:nonNegativeInteger .

•  Categories can be used to query all resources of a certain POS regardless of the tagset used in the corpus
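Why a shared category helps can be shown without any RDF tooling. The sketch below uses a hand-made list of annotations: only the "Jury" / brown:NN pair comes from the slide, while the other words, the penn: prefix and every olia: category value are assumptions made for illustration. Filtering on the category finds nouns no matter which tagset produced the original tag.

```python
# Hypothetical annotations: each word keeps its tagset-specific tag (oliaLink)
# plus a tagset-independent category (oliaCategory).
WORDS = [
    {"anchor": "Jury", "oliaLink": "brown:NN", "oliaCategory": "olia:Noun"},
    {"anchor": "said", "oliaLink": "brown:VBD", "oliaCategory": "olia:Verb"},
    {"anchor": "actress", "oliaLink": "penn:NN", "oliaCategory": "olia:Noun"},
    {"anchor": "is", "oliaLink": "penn:VBZ", "oliaCategory": "olia:Verb"},
]


def anchors_with_category(words, category):
    """Select anchors by shared category, ignoring the tagset-specific tag."""
    return [w["anchor"] for w in words if w["oliaCategory"] == category]


def anchors_with_tag(words, tag):
    """Select anchors by tagset-specific tag; this only matches one tagset."""
    return [w["anchor"] for w in words if w["oliaLink"] == tag]


print(anchors_with_category(WORDS, "olia:Noun"))
print(anchors_with_tag(WORDS, "brown:NN"))
```

Querying by tag misses the noun annotated with the other tagset, which is the gap the category mapping is meant to close.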

Brown Corpus – POS tags

•  Querying all nouns using the OLiA mapping

NIF Tools

•  Available NIF tools:
   –  Stanford CoreNLP
   –  OpenNLP
   –  RDFace
   –  Validator
   –  CoNLL converter
   –  …

NIF Dashboard

NIF Tools: DBpedia Spotlight

•  https://github.com/dbpedia-spotlight/dbpedia-spotlight/

NIF Tools: Stanford Core

Overview

•  GitHub NLP2RDF web page and NIF online demos
   –  Dashboard
   –  Combinator
•  Examples
   –  How to annotate strings
      •  Snowball Stemmer, OpenNLP
   –  How to query generated NIF and existing corpora

NLP2RDF GitHub Website

•  https://github.com/NLP2RDF/

dashboard.nlp2rdf.aksw.org

NIF Combinator

Try at http://nlp2rdf.aksw.org

Example 1: Snowball Stemmer Wrapper

Snowball Stemmer Wrapper

•  Stemming – process for removing suffixes from words
   –  CONNECT as common stem for:
      •  CONNECTED
      •  CONNECTION
      •  CONNECTING
      •  CONNECTIONS
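The idea of suffix stripping can be sketched in a few lines. This is a toy illustration only, not the Snowball algorithm: the suffix list and the minimum stem length are assumptions picked so that the CONNECT word family collapses to one stem, and a real stemmer applies far more careful rules.

```python
# Toy suffix stripper (NOT Snowball): longest suffixes are tried first.
SUFFIXES = ("ions", "ion", "ing", "ed", "s")
MIN_STEM_LENGTH = 3  # assumed guard so very short words are left alone


def toy_stem(word):
    """Strip the first matching suffix, keeping at least MIN_STEM_LENGTH characters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= MIN_STEM_LENGTH:
            return w[: -len(suffix)]
    return w


for word in ["CONNECT", "CONNECTED", "CONNECTION", "CONNECTING", "CONNECTIONS"]:
    print(word, "->", toy_stem(word))
```

All five forms map to the same stem, which is what lets a search or an annotation treat them as one term.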

Snowball Stemmer: How-To

1.  Open the USB stick folder
2.  Go to “NIF_tutorial_hands_on” folder
3.  Open the “instructions.txt” file in a text editor
4.  Open a terminal
5.  Go to the “jar” folder

6.  Copy the first command from instructions.txt

    java -jar snowball.jar -f text -i 'My favorite actress is Natalie Portman.'

    •  -f parameter to specify the format
    •  -i parameter to specify the input text
7.  Paste the command in the terminal

Snowball Stemmer Wrapper

[Figure: wrapper output with three highlighted parts: NIF standard annotations, Snowball stem annotation, annotation offsets]

OpenNLP Wrapper

•  Go back to the terminal and use the second command from the instructions

   java -jar opennlp.jar -f text -i 'My favorite actress is Natalie Portman.' --modelFolder ../model/

•  The --modelFolder parameter sets the folder that contains the trained models for POS tagging and tokenization
•  You can add the parameter --outfile output.ttl to store the NIF triples in a file

Example 2: Query Brown Corpus

•  Open the “/twinkle/example” folder
•  Open the NIF_query_example file in a text editor and copy the query
•  Open the “/twinkle” folder and run the command

   java -jar twinkle.jar

Twinkle GUI

Loading query in Twinkle

Loading NIF Corpus

Hooray! We have all the words in the corpus! ☺

Example 3: Querying your own NIF annotated string

Annotate using NIF Wrapper

•  Querying your own NIF annotated string
   1.  Annotate your string using one of the wrappers
   2.  Save your annotated sentence to a file
       •  set the --outfile parameter
   3.  Open Twinkle
   4.  Query your string using Twinkle

Query your string

•  Querying your annotated string:
   –  nif:Context
   –  nif:Sentence
   –  nif:anchorOf
   –  nif:oliaCategory
   –  nif:oliaLink

… or practice with the Brown Corpus!
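As a starting point for such queries, the sketch below builds a SPARQL SELECT string that can be pasted into a query tool. It is an illustration under assumptions: the helper name, the default category olia:Noun, the olia: namespace and the LIMIT are choices made here, and whether a given NIF file actually carries nif:oliaCategory depends on the wrapper that produced it.

```python
PREFIXES = (
    "PREFIX nif:  <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>\n"
    "PREFIX olia: <http://purl.org/olia/olia.owl#>\n"
)


def words_by_category_query(category="olia:Noun", limit=25):
    """Build a SELECT query for words whose nif:oliaCategory is `category`.

    The tagset-specific tag (nif:oliaLink) is optional, so words without
    one are still returned.
    """
    return (
        PREFIXES
        + "SELECT ?word ?anchor ?tag WHERE {\n"
        + "  ?word a nif:Word ;\n"
        + "        nif:anchorOf ?anchor ;\n"
        + f"        nif:oliaCategory {category} .\n"
        + "  OPTIONAL { ?word nif:oliaLink ?tag }\n"
        + "}\n"
        + f"LIMIT {int(limit)}\n"
    )


print(words_by_category_query())
```

Swapping the category argument changes which part of speech is selected; removing the nif:oliaCategory line turns it into a query for all words and their anchors.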

Thank you!

http://nlp2rdf.org
http://github.com/NLP2RDF