UIMA
SHARP 4 - NLP
May 25, 2010
Outline
• UIMA Terminology (not just TLAs)
• Parts of a UIMA pipeline
• Running a pipeline
• Viewing annotations
• Creating a new annotator
UIMA terminology
• CAS, XCAS, JCAS, View
• Analysis Engine (AE) / Annotator
– Aggregate Analysis Engine
• XML output: XCAS, XMI
• Type System, JCasGen
• CAS Visual Debugger (CVD)
• CPE (Collection Processing Engine)
UIMA and Eclipse
• UIMA plugin for Eclipse requires EMF
• UIMA plugin provides visual editors for descriptors
• An “Update site” exists for installing plugin
UIMA Pipeline Flow
• Collection Reader
• (CAS Initializer - deprecated)
• Analysis Engine (AE) / Annotator
• CAS Consumer
Pipeline Example
Example                    UIMA term
Read files from a dir      Collection Reader
Sentence annotator         Analysis Engine
Tokenizer annotator        Analysis Engine
Output tokens to a DB      CAS Consumer
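The four stages of the pipeline example can be mimicked in plain Java to show how data moves from one stage to the next. This is only a sketch using the standard library: none of the class or method names below are UIMA API, a fixed in-memory list stands in for the directory of files, a returned list stands in for the database, and the second sample sentence is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToyPipeline {
    // In this sketch a token is a run of word characters or a single punctuation mark.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    // Stand-in for a Collection Reader: supplies the documents to process.
    // (Hypothetical: a fixed list instead of files read from a directory.)
    public static List<String> readCollection() {
        return Arrays.asList("Family history of hyperlipidemia. No known allergies.");
    }

    // Stand-in for the sentence annotator: split after '.', '!' or '?' followed by whitespace.
    public static List<String> splitSentences(String document) {
        return Arrays.asList(document.split("(?<=[.!?])\\s+"));
    }

    // Stand-in for the tokenizer annotator: works on the sentences found upstream.
    public static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(sentence);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Runs the stages in order. The returned list plays the role of the
    // CAS Consumer's output (a database on the slide).
    public static List<String> run() {
        List<String> sink = new ArrayList<>();
        for (String document : readCollection()) {
            for (String sentence : splitSentences(document)) {
                sink.addAll(tokenize(sentence));
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In a real UIMA pipeline the stages do not call each other directly as they do here; each one reads from and writes to the shared CAS, and the CPE descriptor decides which components run.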
Options for running UIMA tools
• Tools:
– CPE Configurator
– CVD
• Options:
– Command line scripts/.bat files
– Run within Eclipse
Tying together a UIMA pipeline
• Type System
– Defines the data types passed along
• CAS (Common Analysis Structure)
– Container for the data
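To make the "container for the data" idea concrete, here is a toy model in plain Java. It assumes only that an annotation is a type name plus begin and end character offsets into the document text, which is how UIMA text annotations are addressed. The classes ToyCas and Span and the type name example.DiseaseMention are hypothetical and are not part of UIMA.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyCas {
    // One annotation: a type name plus character offsets into the document text.
    public static final class Span {
        public final String type;
        public final int begin; // inclusive
        public final int end;   // exclusive

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    private final String documentText;
    private final List<Span> spans = new ArrayList<>();

    public ToyCas(String documentText) {
        this.documentText = documentText;
    }

    // Adds an annotation of the given type over the first occurrence of the phrase.
    public Span annotate(String type, String phrase) {
        int begin = documentText.indexOf(phrase);
        if (begin < 0) {
            throw new IllegalArgumentException("phrase not found: " + phrase);
        }
        Span span = new Span(type, begin, begin + phrase.length());
        spans.add(span);
        return span;
    }

    // The text an annotation points at is recovered from its offsets.
    public String coveredText(Span span) {
        return documentText.substring(span.begin, span.end);
    }

    // Later stages look up the annotations they need by type name.
    public List<Span> spansOfType(String type) {
        List<Span> matches = new ArrayList<>();
        for (Span span : spans) {
            if (span.type.equals(type)) {
                matches.add(span);
            }
        }
        return matches;
    }
}
```

A usage sketch: new ToyCas("Family history of hyperlipidemia.").annotate("example.DiseaseMention", "hyperlipidemia") stores a span over the last word, and coveredText returns that word. In UIMA itself the allowed type names and their features come from the Type System descriptor, and JCasGen generates Java classes for them.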
Tying together a UIMA pipeline
• CPE descriptor: select the parts
– Collection Reader
– Analysis Engine(s)
– CAS Consumer
• Aggregate analysis engine
– Multiple Analysis Engines and their order
Options for running a pipeline
• CVD GUI
– Single Aggregate Analysis Engine
– No Collection Reader
• CPE GUI
• From Java: parse a CpeDescription, produce a CollectionProcessingEngine from it, and invoke its process() method (see “2.3. Running a CPE from Your Own Java Application” in the UIMA documentation)
Example: Running a pipeline
Running cTAKES within Eclipse using a CPE
• Use run configuration UIMA_CPE_GUI--clinical_documents_pipeline
• CPE: test1.xml from clinical documents pipeline\desc\collection_processing_engine
Options for viewing annotations
• CVD
• Annotation viewer
• XML viewer
• Text editor
Example: Viewing annotations
Viewing annotations using the CVD
• Load the Type System
• Load the XCAS or XMI
Example: Running an AE in CVD
Using CVD to run an Analysis Engine
– No Collection Reader
– Single Analysis Engine (can be an aggregate)
– No CAS Consumer
– Just paste/type in text to process, e.g. “Family history of hyperlipidemia.”
Creating a New Annotator
• Create Java project
• Right click -> Add UIMA Nature
• Add UIMA jars to .classpath (Build Path)
• Create Analysis Engine (AE) descriptor
• Add types to AE descriptor, or optionally create separate Type System descriptor
• Write code!
Questions?
Supplemental slides follow
Example: Creating a PEAR file
• Right click -> Add UIMA Nature
• Right click -> Generate Pear
• Select Analysis Engine descriptor
• Select OS and JDK
• Modify Properties if needed
• Select what to include
Example: Modifying a parameter
UIMA’s descriptor editors allow you to modify most parameters without looking at the XML itself.
Links
• Getting started with UIMA: http://uima.apache.org/doc-uima-annotator.html
• UIMA Update site for use in Eclipse: http://www.apache.org/dist/incubator/uima/eclipse-update-site/