Building Apps With Rni Rnt

GOVERNMENT USERS Conference “Navigating the Human Terrain”, College Park, MD, May 20-21, 2008. Building Applications with Rosette Name Indexer & Rosette Name Translator. Benson Margulies, CTO, Basis Technology.

Transcript of Building Apps With Rni Rnt

Page 1: Building Apps With Rni Rnt

8/2/2019 Building Apps With Rni Rnt

http://slidepdf.com/reader/full/building-apps-with-rni-rnt 1/17

GOVERNMENT USERS Conference

“Navigating the Human Terrain”

College Park, MD, May 20-21, 2008

Building Applications with Rosette Name Indexer & Rosette Name Translator

Benson Margulies, CTO

Basis Technology

Page 2: Building Apps With Rni Rnt

Introducing RNI and RNT

Rosette Name Indexer – Resolving Names

Stores names 'on disk'

Queries on data or meta-data

Rosette Name Translator – Translating Names

Translation vs. Transliteration

— Cologne, Köln

—George Bush, Jeorge Buzh

There isn't always only one right answer

Page 3: Building Apps With Rni Rnt

Coding for RNI and RNT

Common concepts and data structures

RNI Application Programming

RNT Application Programming

Page 4: Building Apps With Rni Rnt

A Note On Programming Languages

RNI and RNT are aimed at Java applications

RNT has a subset API in C++

Both have web services

This talk looks at Java

Page 5: Building Apps With Rni Rnt

What's in a Name?

Data – the name itself

Language

Script

Entity Type

Unique ID

Entity ID

Arbitrary String

Transliterations

Class: com.basistech.rnm.Name

Properties listed here.

Page 6: Building Apps With Rni Rnt

The minimum ...

What do you need to store, translate, or resolve names?

Data – the text of the name (e.g. Albert Schweitzer)

Language – ISO 639 code from com.basistech.util.LanguageCode enum.

Script – ISO 15924 code from com.basistech.util.ISO15924 util.

This is all you must have.
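
As a rough illustration of these three required fields, the Java sketch below uses a hypothetical stand-in class, SketchName, instead of the real com.basistech.rnm.Name, whose constructors and accessors are not shown in this talk. It assumes plain strings for the codes where the real API uses com.basistech.util.LanguageCode and com.basistech.util.ISO15924, and the "de" language code for the example name is only an illustrative choice.

```java
import java.util.Objects;

// Hypothetical stand-in for com.basistech.rnm.Name (not the real class).
// It only models the three fields the slide calls mandatory.
final class SketchName {
    private final String data;      // the text of the name
    private final String language;  // ISO 639 code as a plain string (assumption)
    private final String script;    // ISO 15924 code as a plain string (assumption)

    SketchName(String data, String language, String script) {
        this.data = Objects.requireNonNull(data, "data");
        this.language = Objects.requireNonNull(language, "language");
        this.script = Objects.requireNonNull(script, "script");
        if (data.trim().isEmpty()) {
            throw new IllegalArgumentException("name text must not be empty");
        }
    }

    String getData() { return data; }
    String getLanguage() { return language; }
    String getScript() { return script; }
}

class SketchNameDemo {
    public static void main(String[] args) {
        // "de" is an illustrative language choice for the slide's example name.
        SketchName name = new SketchName("Albert Schweitzer", "de", "Latn");
        System.out.println(name.getData() + " [" + name.getLanguage() + "/" + name.getScript() + "]");
    }
}
```

Rejecting missing or blank values in the constructor is a design choice of the sketch, so that every object that exists already carries the minimum.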

Page 7: Building Apps With Rni Rnt

The optional fields

Connections to other things

Unique ID – for your own cross-references

Entity ID – if you want to group multiple names of the same entity.

Conventional Translation: 'transliterations'

Entity Type (person, place, etc.)

Plus, whatever you want – extra

Consider 'serializable'

Page 8: Building Apps With Rni Rnt

Text Domains

Text Domain describes a name

Three fields:

Language

Script

Scheme ...

Example:

ar/Arab/Native

ar/Latn/Folk

For translation, 'pair' specifies input domain and output domain.
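
The three-field domain and the input/output pair can be modeled as small value types. The sketch below is an illustration only: SketchTextDomain, SketchDomainPair, and the parse method are hypothetical names, since the talk does not show the real classes, and the slash-separated notation is simply the one used in the examples above.

```java
import java.util.Objects;

// Hypothetical value type for a text domain: language/script/scheme.
final class SketchTextDomain {
    final String language; // e.g. "ar"
    final String script;   // e.g. "Arab"
    final String scheme;   // e.g. "Native" or "Folk"

    SketchTextDomain(String language, String script, String scheme) {
        this.language = Objects.requireNonNull(language, "language");
        this.script = Objects.requireNonNull(script, "script");
        this.scheme = Objects.requireNonNull(scheme, "scheme");
    }

    // Parses the slide notation, e.g. "ar/Arab/Native".
    static SketchTextDomain parse(String text) {
        String[] parts = text.split("/", -1);
        if (parts.length != 3 || parts[0].isEmpty() || parts[1].isEmpty() || parts[2].isEmpty()) {
            throw new IllegalArgumentException("expected language/script/scheme, got: " + text);
        }
        return new SketchTextDomain(parts[0], parts[1], parts[2]);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof SketchTextDomain)) {
            return false;
        }
        SketchTextDomain d = (SketchTextDomain) other;
        return language.equals(d.language) && script.equals(d.script) && scheme.equals(d.scheme);
    }

    @Override
    public int hashCode() { return Objects.hash(language, script, scheme); }

    @Override
    public String toString() { return language + "/" + script + "/" + scheme; }
}

// Hypothetical pair of input and output domains for a translation.
final class SketchDomainPair {
    final SketchTextDomain input;
    final SketchTextDomain output;

    SketchDomainPair(SketchTextDomain input, SketchTextDomain output) {
        this.input = Objects.requireNonNull(input, "input");
        this.output = Objects.requireNonNull(output, "output");
    }

    @Override
    public String toString() { return input + " -> " + output; }
}

class SketchTextDomainDemo {
    public static void main(String[] args) {
        SketchDomainPair pair = new SketchDomainPair(
                SketchTextDomain.parse("ar/Arab/Native"),
                SketchTextDomain.parse("ar/Latn/Folk"));
        System.out.println(pair);
    }
}
```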

Page 9: Building Apps With Rni Rnt

What's a Scheme?

Schemes identify standards for translating or transliterating names: e.g. IC, BGN

Schemes name other representations of names:

FOLK – an informal transliteration

NATIVE – the original orthography

Page 10: Building Apps With Rni Rnt

Using the Rosette Name Index

Creating a New Index

com.basistech.rnm.index.StandardNameIndex.create

Give it a pathname and options ...

It gives you back an INameIndex

Support for in-memory indices.

Opening an Existing Index

.open instead of create

Note that an index is, physically, a directory.

Page 11: Building Apps With Rni Rnt

Storing Names

Create a Name object

Add it to the INameIndex

Batching and concurrency

By default, additions are only seen by 'adder'
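
One possible reading of "additions are only seen by 'adder'" is that each writer keeps a private batch until it publishes it. The toy Java sketch below illustrates that idea only; it is not how RNI is implemented. SketchIndexStore, SketchIndexHandle, and especially commit() are hypothetical names invented for this example, and names are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Toy shared store. A hypothetical stand-in, not the RNI index.
final class SketchIndexStore {
    private final List<String> committed = new ArrayList<>();

    synchronized List<String> snapshot() { return new ArrayList<>(committed); }

    synchronized void publish(List<String> batch) { committed.addAll(batch); }

    SketchIndexHandle openHandle() { return new SketchIndexHandle(this); }
}

// One handle per 'adder'; assumed to be used by a single thread.
final class SketchIndexHandle {
    private final SketchIndexStore store;
    private final List<String> pending = new ArrayList<>();

    SketchIndexHandle(SketchIndexStore store) { this.store = store; }

    // Adds to this handle's private batch only.
    void add(String name) { pending.add(name); }

    // What this handle can see: published names plus its own pending batch.
    List<String> visibleNames() {
        List<String> all = store.snapshot();
        all.addAll(pending);
        return all;
    }

    // Hypothetical operation: makes the batch visible to every handle.
    void commit() {
        store.publish(pending);
        pending.clear();
    }
}

class SketchIndexDemo {
    public static void main(String[] args) {
        SketchIndexStore store = new SketchIndexStore();
        SketchIndexHandle adder = store.openHandle();
        SketchIndexHandle reader = store.openHandle();
        adder.add("Albert Schweitzer");
        System.out.println("adder sees " + adder.visibleNames().size()
                + ", reader sees " + reader.visibleNames().size());
        adder.commit();
        System.out.println("after commit, reader sees " + reader.visibleNames().size());
    }
}
```

Batching this way means one publish step per batch instead of one per name; how the real index exposes batching and concurrency is covered by its documentation.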

Page 12: Building Apps With Rni Rnt

Querying for Names

Filling up a query object

com.basistech.rnm.index.NameIndexQuery

Fields for the data and various metadata

Flags to enable the fields

Very simple query model ... if you need a SQL database, you should use one.

Retrieving results

INameIndex.lookup

Then obtain the iterator

Data versus metadata queries
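
The "fields plus enabling flags" query model can be sketched in a few lines of Java. Everything here is hypothetical: SketchQuery, SketchField, SketchRecord, and SketchQueryIndex stand in for NameIndexQuery and INameIndex, whose real members are not shown in the talk, and the exact string comparison is only a placeholder for whatever matching the real index performs on name data.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stored record: name text plus two metadata values.
final class SketchRecord {
    final String data;
    final String language;
    final String entityType;

    SketchRecord(String data, String language, String entityType) {
        this.data = data;
        this.language = language;
        this.entityType = entityType;
    }
}

enum SketchField { DATA, LANGUAGE, ENTITY_TYPE }

// Hypothetical query object: a field only counts when its flag is enabled.
final class SketchQuery {
    String data;
    String language;
    String entityType;
    final Set<SketchField> enabled = EnumSet.noneOf(SketchField.class);

    boolean matches(SketchRecord r) {
        if (enabled.contains(SketchField.DATA) && !r.data.equalsIgnoreCase(data)) return false;
        if (enabled.contains(SketchField.LANGUAGE) && !r.language.equals(language)) return false;
        if (enabled.contains(SketchField.ENTITY_TYPE) && !r.entityType.equals(entityType)) return false;
        return true;
    }
}

// Toy index: lookup filters linearly and hands back an iterator.
final class SketchQueryIndex {
    private final List<SketchRecord> records = new ArrayList<>();

    void add(SketchRecord record) { records.add(record); }

    Iterator<SketchRecord> lookup(SketchQuery query) {
        List<SketchRecord> hits = new ArrayList<>();
        for (SketchRecord record : records) {
            if (query.matches(record)) {
                hits.add(record);
            }
        }
        return hits.iterator();
    }
}

class SketchQueryDemo {
    public static void main(String[] args) {
        SketchQueryIndex index = new SketchQueryIndex();
        index.add(new SketchRecord("Albert Schweitzer", "de", "PERSON"));
        index.add(new SketchRecord("Cologne", "en", "PLACE"));

        SketchQuery query = new SketchQuery();   // a metadata-only query
        query.entityType = "PLACE";
        query.enabled.add(SketchField.ENTITY_TYPE);

        Iterator<SketchRecord> it = index.lookup(query);
        while (it.hasNext()) {
            System.out.println(it.next().data);
        }
    }
}
```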

Page 13: Building Apps With Rni Rnt

Using the Rosette Name Translator

A Translator object Translates: com.basistech.rnt.ITranslator

'Basic' Translators implement one domain pair

e.g. ar/Arab/Native -> ar/Latn/IC

com.basistech.rnt.BasicTranslator

Basic Translators come from a Factory

com.basistech.rnt.BasicTranslatorFactory
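
The relationship between the translator interface, a translator bound to one domain pair, and the factory that hands it out can be sketched as follows. All names here (SketchTranslator, SketchBasicTranslator, SketchTranslatorFactory, register, create) are hypothetical stand-ins, not the signatures of ITranslator, BasicTranslator, or BasicTranslatorFactory. The domain strings and the one-entry lookup table are toy assumptions, and the return type is simplified to a single string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the translator interface.
interface SketchTranslator {
    String translate(String name);
}

// A 'basic' translator handles exactly one input/output domain pair.
final class SketchBasicTranslator implements SketchTranslator {
    final String inputDomain;
    final String outputDomain;
    private final UnaryOperator<String> mapping;

    SketchBasicTranslator(String inputDomain, String outputDomain, UnaryOperator<String> mapping) {
        this.inputDomain = inputDomain;
        this.outputDomain = outputDomain;
        this.mapping = mapping;
    }

    @Override
    public String translate(String name) { return mapping.apply(name); }
}

// Hypothetical factory: looks up the mapping registered for a domain pair.
final class SketchTranslatorFactory {
    private final Map<String, UnaryOperator<String>> mappings = new HashMap<>();

    void register(String inputDomain, String outputDomain, UnaryOperator<String> mapping) {
        mappings.put(inputDomain + " -> " + outputDomain, mapping);
    }

    SketchBasicTranslator create(String inputDomain, String outputDomain) {
        UnaryOperator<String> mapping = mappings.get(inputDomain + " -> " + outputDomain);
        if (mapping == null) {
            throw new IllegalArgumentException("no translator for " + inputDomain + " -> " + outputDomain);
        }
        return new SketchBasicTranslator(inputDomain, outputDomain, mapping);
    }
}

class SketchTranslatorDemo {
    public static void main(String[] args) {
        // Toy table standing in for real translation logic.
        Map<String, String> table = new HashMap<>();
        table.put("K\u00f6ln", "Cologne");

        SketchTranslatorFactory factory = new SketchTranslatorFactory();
        factory.register("de/Latn/Native", "en/Latn/Native", name -> table.getOrDefault(name, name));

        SketchTranslator translator = factory.create("de/Latn/Native", "en/Latn/Native");
        System.out.println(translator.translate("K\u00f6ln"));
    }
}
```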

Page 14: Building Apps With Rni Rnt

RNT – Rule-based Translator

com.basistech.rnt.RuleSetTranslator

Chooses a translator based on input domain and the entity type.

Example: People with IC, places with BGN.

Spring is convenient for configuration.
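
The rule-based dispatch described here, with people going to one scheme and places to another, might look like the sketch below. SketchRuleSetTranslator, SketchRule, and addRule are hypothetical names rather than the RuleSetTranslator API. The "first matching rule wins" policy and the "PERSON" and "PLACE" labels are assumptions, and the two translators merely tag their input so that the chosen rule is visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical rule: an input domain and entity type mapped to a translator.
final class SketchRule {
    final String inputDomain;
    final String entityType;
    final UnaryOperator<String> translator;

    SketchRule(String inputDomain, String entityType, UnaryOperator<String> translator) {
        this.inputDomain = inputDomain;
        this.entityType = entityType;
        this.translator = translator;
    }

    boolean appliesTo(String domain, String type) {
        return inputDomain.equals(domain) && entityType.equals(type);
    }
}

// Hypothetical rule set: the first rule that applies does the translation.
final class SketchRuleSetTranslator {
    private final List<SketchRule> rules = new ArrayList<>();

    SketchRuleSetTranslator addRule(String inputDomain, String entityType,
                                    UnaryOperator<String> translator) {
        rules.add(new SketchRule(inputDomain, entityType, translator));
        return this;
    }

    String translate(String text, String inputDomain, String entityType) {
        for (SketchRule rule : rules) {
            if (rule.appliesTo(inputDomain, entityType)) {
                return rule.translator.apply(text);
            }
        }
        throw new IllegalArgumentException("no rule for " + inputDomain + " / " + entityType);
    }
}

class SketchRuleSetDemo {
    public static void main(String[] args) {
        // Placeholder translators that only label which scheme was chosen.
        SketchRuleSetTranslator translator = new SketchRuleSetTranslator()
                .addRule("ar/Arab/Native", "PERSON", s -> "IC:" + s)
                .addRule("ar/Arab/Native", "PLACE", s -> "BGN:" + s);
        System.out.println(translator.translate("name", "ar/Arab/Native", "PERSON"));
        System.out.println(translator.translate("name", "ar/Arab/Native", "PLACE"));
    }
}
```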

Page 15: Building Apps With Rni Rnt

Translators take Options

Different translators accept different options

Options are objects from:

com.basistech.rnt.options

Example: option controls whether to deliver enhanced version of original input, e.g. adding Harakat to Arabic.

Page 16: Building Apps With Rni Rnt

Translation Process

'translate' takes an ITranslatable.

com.basistech.rnm.Name implements it.

'translate' returns a List<com.basistech.rnt.TranslationResult>

Each result has string, confidence, and additional information

e.g. improved spelling of input
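
Because a translation yields several candidates with confidences, calling code usually has to rank them. The sketch below shows one way to do that with a hypothetical SketchTranslationResult type; the real TranslationResult accessors are not shown in the talk, the confidence values are made up for illustration, and whether the real list already arrives ordered is not stated, so the sketch sorts explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a translation result: text plus confidence.
final class SketchTranslationResult {
    final String text;
    final double confidence;

    SketchTranslationResult(String text, double confidence) {
        this.text = text;
        this.confidence = confidence;
    }
}

final class SketchResults {
    // Returns a copy ordered from most to least confident; input is untouched.
    static List<SketchTranslationResult> byConfidence(List<SketchTranslationResult> results) {
        List<SketchTranslationResult> sorted = new ArrayList<>(results);
        sorted.sort(Comparator.comparingDouble((SketchTranslationResult r) -> r.confidence).reversed());
        return sorted;
    }

    // Picks the single most confident candidate.
    static SketchTranslationResult best(List<SketchTranslationResult> results) {
        if (results.isEmpty()) {
            throw new IllegalArgumentException("no results");
        }
        return byConfidence(results).get(0);
    }
}

class SketchResultsDemo {
    public static void main(String[] args) {
        // Confidence values are invented for the example.
        List<SketchTranslationResult> results = Arrays.asList(
                new SketchTranslationResult("Jeorge Buzh", 0.4),
                new SketchTranslationResult("George Bush", 0.9));
        for (SketchTranslationResult r : SketchResults.byConfidence(results)) {
            System.out.println(r.text + " (" + r.confidence + ")");
        }
    }
}
```

Keeping the whole ranked list, not just the top candidate, fits the earlier point that there isn't always only one right answer.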

Page 17: Building Apps With Rni Rnt

Conclusion

Bad news: you will still have to read the documentation and look at the examples.

Good news: you should have an overall picture of the main classes and interfaces that you will use to integrate RNI and RNT into applications.

And don't forget:

[email protected]