GATE and UIMA in Language Technology Teaching Graham Wilcock University of Helsinki [email protected]

Graham Wilcock UIMA workshop, GLDV-2007 2

Outline

• Current course materials
  • Shakespeare’s Sonnets
  • GATE & ANNIE
• New course materials
  • UIMA & OpenNLP
  • UIMA & Stanford NLP
• Appendix
  • Eclipse & Stanford Eclipse

IBM: Tidwell XSLT tutorials

Gutenberg: Sonnets corpus

GATE & ANNIE

• Start with ANNIE
  • Ready-to-run NLP tools
  • ANNIE NE Recognizer
  • ANNIE POS Tagger
• Add JAPE annotations
  • Students write NP, PP rules
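The NP and PP rules that students write match patterns over annotations ANNIE has already produced, such as POS tags on tokens. As a rough analogue only, the sketch below expresses a noun-phrase pattern as a regular expression over tags. It is plain Java, not JAPE syntax and not GATE code; the class name `NpChunker`, the Penn Treebank style tags, and the pattern itself (optional determiner, adjectives, nouns) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy noun-phrase chunker over POS tags; an illustration, not JAPE or GATE code. */
final class NpChunker {

    // Optional determiner, any adjectives, then one or more nouns.
    // Every tag in the joined tag line is followed by a single space.
    // Assumes Penn Treebank style tags (hypothetical choice for this sketch).
    private static final Pattern NP =
            Pattern.compile("\\b(?:DT )?(?:JJ[RS]? )*(?:NNP?S? )+");

    /** Returns the text of each noun phrase found in the parallel arrays. */
    static List<String> chunk(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must align");
        }
        // Join the tags into one line, recording the offset where each tag starts.
        StringBuilder tagLine = new StringBuilder();
        int[] tagStart = new int[tags.length + 1];
        for (int i = 0; i < tags.length; i++) {
            tagStart[i] = tagLine.length();
            tagLine.append(tags[i]).append(' ');
        }
        tagStart[tags.length] = tagLine.length();

        List<String> phrases = new ArrayList<>();
        Matcher m = NP.matcher(tagLine);
        while (m.find()) {
            // Map character offsets in the tag line back to token indices.
            int first = Arrays.binarySearch(tagStart, m.start());
            int end = Arrays.binarySearch(tagStart, m.end()); // exclusive
            phrases.add(String.join(" ", Arrays.copyOfRange(tokens, first, end)));
        }
        return phrases;
    }

    public static void main(String[] args) {
        String[] tokens = {"the", "old", "man", "saw", "a", "summer", "day", "in", "Verona"};
        String[] tags = {"DT", "JJ", "NN", "VBD", "DT", "NN", "NN", "IN", "NNP"};
        System.out.println(chunk(tokens, tags));
    }
}
```

For the hand-tagged sample in `main`, the sketch finds three phrases: "the old man", "a summer day", and "Verona". A real JAPE rule would additionally name the matched span and create an annotation on it rather than return strings.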

UIMA & OpenNLP

• OpenNLP tools
  • First, run from command line
  • Install in UIMA (assignment)
• UIMA & OpenNLP
  • OpenNLP POS Tagger
  • OpenNLP NE Recognizer
• Add Java annotators
  • Students write NP, PP annotators?
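The open question on this slide is whether students can write NP and PP annotators in Java, where in GATE they write JAPE rules. The sketch below shows roughly what the core of such an annotator might look like. It makes several assumptions: in UIMA this logic would live inside an annotator's `process` method and add annotations to the CAS, whereas here it is a standalone class with no UIMA dependency; `PpAnnotator`, `Span`, and the deliberately loose phrase pattern are hypothetical and not taken from the course materials.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-off annotator: a PP is a preposition plus the noun phrase after it. */
final class PpAnnotator {

    /** A stand-off annotation: a type plus token offsets [begin, end). */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + ")";
        }
    }

    // Loose approximation: determiners, adjectives, and nouns may appear inside an NP.
    private static boolean canBeInsideNp(String tag) {
        return tag.equals("DT") || tag.startsWith("JJ") || tag.startsWith("NN");
    }

    /** Scans Penn Treebank style tags and returns one PP span per match. */
    static List<Span> annotate(String[] tags) {
        List<Span> spans = new ArrayList<>();
        for (int i = 0; i < tags.length; i++) {
            if (!tags[i].equals("IN") && !tags[i].equals("TO")) {
                continue; // not a preposition tag
            }
            int lastNoun = -1;
            for (int j = i + 1; j < tags.length && canBeInsideNp(tags[j]); j++) {
                if (tags[j].startsWith("NN")) {
                    lastNoun = j; // the phrase must end on a noun
                }
            }
            if (lastNoun >= 0) {
                spans.add(new Span("PP", i, lastNoun + 1));
                i = lastNoun; // resume scanning after this phrase
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // Hand-written tags for: "He walked in the garden to sing of Verona"
        String[] tags = {"PRP", "VBD", "IN", "DT", "NN", "TO", "VB", "IN", "NNP"};
        System.out.println(annotate(tags));
    }
}
```

On the sample tags the sketch reports `PP[2,5)` and `PP[7,9)`; the `TO VB` sequence is skipped because no noun follows it. Keeping spans as offsets rather than copied text mirrors the stand-off annotation style that both GATE and UIMA use.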

Stanford NLP Group

• Stanford NLP tools
  • Stanford POS Tagger
  • Stanford NE Recognizer
  • Stanford Parser

UIMA & Stanford NLP

• Stanford NE Recognizer
  • Use Stanford NER-GUI
• UIMA & Stanford NER
  • Install in UIMA (assignment)
  • UIMA wrapper by F. Laws
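Whichever way the recognizer is run, its token-level labels have to be grouped into entity spans before they become annotations. The sketch below illustrates that grouping step only. It assumes slash-tagged text of the form `word/LABEL`, with `O` marking tokens outside any entity; the class name `SlashTagEntities` is hypothetical, and the sample input is hand-written rather than real tagger output. It does not call the Stanford NER API or the UIMA wrapper.

```java
import java.util.ArrayList;
import java.util.List;

/** Groups consecutive word/LABEL tokens into "LABEL: text" entity strings. */
final class SlashTagEntities {

    static List<String> extract(String tagged) {
        List<String> entities = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentLabel = "O";
        for (String item : tagged.trim().split("\\s+")) {
            // Split on the last slash so words containing '/' stay intact.
            int slash = item.lastIndexOf('/');
            String word = slash < 0 ? item : item.substring(0, slash);
            String label = slash < 0 ? "O" : item.substring(slash + 1);
            if (!label.equals(currentLabel)) {
                flush(entities, current, currentLabel); // label changed: close the span
                currentLabel = label;
            }
            if (!label.equals("O")) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(word);
            }
        }
        flush(entities, current, currentLabel);
        return entities;
    }

    // Emits the pending entity, if any, and clears the buffer.
    private static void flush(List<String> out, StringBuilder text, String label) {
        if (text.length() > 0) {
            out.add(label + ": " + text);
            text.setLength(0);
        }
    }

    public static void main(String[] args) {
        String sample = "Graham/PERSON Wilcock/PERSON teaches/O at/O the/O "
                + "University/ORGANIZATION of/ORGANIZATION Helsinki/ORGANIZATION ./O";
        System.out.println(extract(sample));
    }
}
```

For the sample string this yields "PERSON: Graham Wilcock" and "ORGANIZATION: University of Helsinki". One limitation of the plain labelling scheme assumed here is that two adjacent entities with the same label are merged into one span.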

Eclipse & Stanford Eclipse

• jEdit vs. Eclipse
  • Students currently use jEdit
  • Eclipse has a steeper learning curve
• Stanford Eclipse
  • Stanford CS: simplified Eclipse
  • Karel the Robot Learns Java
  • Starter projects for assignments

Starter projects

Summary

• Current course materials
  • Shakespeare’s Sonnets
  • GATE & ANNIE
• New course materials
  • UIMA & OpenNLP
  • UIMA & Stanford NLP
• Appendix
  • Eclipse & Stanford Eclipse