Folt - Open TMS - A presentation for universities

FOLT overview, as of 03.07.2009; Dr. Klemens Waldhör, [email protected]

description

The slides give an overview of openTMS and its architecture.

Transcript of Folt - Open TMS - A presentation for universities

Page 1: Folt - Open TMS - A presentation for universities


Dr. Klemens Waldhör [email protected]

Page 2: Folt - Open TMS - A presentation for universities


Overview

- Open TMS Overview
- Architecture
- Implementation
- Current Status

Page 3: Folt - Open TMS - A presentation for universities


Overview

Page 4: Folt - Open TMS - A presentation for universities


Goals

- Development of the open source translation memory system OpenTMS

Three translation memory systems for one and the same process? Software investments that make translation costs shoot through the roof? Exchange formats that put the brakes on productivity? FOLT (Forum Open Language Tools) is concerned with the entire process of producing multilingual documentation. From the creation of the source text to production in foreign languages, we analyze our processes for weaknesses and a lack of standardisation.

Primary objectives:
- Sharing experiences of processes using standard industry software
- Sharing experiences of the use of Open Source software
- Standardisation of interchange formats
- Testing new Open Source technologies and improving existing technologies in the translation market
- Public support for non-proprietary software and software development
- Publication of aims and results

www.folt.de

Page 5: Folt - Open TMS - A presentation for universities


OpenTMS Requirements

- Software
  - Web based application
  - Server / client architecture
  - Thin client
  - No installation
  - No proprietary run time components
  - Preferred open source software
  - Modular software approach
- OS independent
  - Windows, Linux, Mac …
- Standard hardware
- Interfaces
  - Integration into CMS
  - Workflow management should be supported
- Open source database
  - Basically all SQL databases should be supported
- Scalability
  - Single and multi user requirement

Page 6: Folt - Open TMS - A presentation for universities


Architecture

Page 7: Folt - Open TMS - A presentation for universities


Example Work Flow

- Seamless integration of different tools in the translation / localisation workflow

[Figure: workflow diagram with numbered steps 1 to 3. Labelled components: CMS, Converter, Segmenter, Translation Memory, Terminology Translation, Machine Translation, OpenTMS Editor, Back Converter, XLIFF]

Page 8: Folt - Open TMS - A presentation for universities


Architecture based on Standards

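- XLIFF (XML Localisation Interchange File Format)
- TMX (Translation Memory eXchange)
- TBX (TermBase eXchange)
- SRX (Segmentation Rules eXchange)
- …

In general: XML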

Page 9: Folt - Open TMS - A presentation for universities


OpenTMS System Architecture

[Figure: architecture diagram. Labelled components: Application Model, User Model, Data Model, Document Model, Process Model, Security Model, GUI Model, Interface Model, OpenTMS Core Library]

For details see Waldhör, K. (2008). OPENTMS SOFTWARE ARCHITECTURE.

Page 10: Folt - Open TMS - A presentation for universities


Software Structure

[Figure: layer diagram. Labels: OpenTMS primitive procedure, OpenTMS Process, OpenTMS Network Process, OpenTMS core library]

- Hierarchy of functions and processes
- Common functions / methods stored in a core library
- Method calls should be transparent
  - Running on server or user machine
- Scripting language

Page 11: Folt - Open TMS - A presentation for universities


Modelling Language

[Figure: class diagram. Labels: Monolingual Object, Multilingual Object, General Linguistic Object, Linguistic Property, Data Source, Terminology, Translation Memory. Relations: inherits, mapping, N:1 (twice)]
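The diagram labels suggest a small class hierarchy. The following is a minimal sketch of one possible reading, assuming that monolingual and multilingual objects both inherit from a general linguistic object that carries properties, and that several monolingual objects belong to one multilingual object. All class and method names are hypothetical and are not taken from the OpenTMS source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical base class: anything that can carry linguistic properties (N:1).
abstract class LinguisticObjectSketch {
    private final Map<String, String> properties = new LinkedHashMap<>();

    void setProperty(String name, String value) { properties.put(name, value); }

    String getProperty(String name) { return properties.get(name); }
}

// A segment or term in exactly one language.
class MonoLingualObjectSketch extends LinguisticObjectSketch {
    final String language;
    final String text;

    MonoLingualObjectSketch(String language, String text) {
        this.language = language;
        this.text = text;
    }
}

// Groups the monolingual variants of one entry (N monolingual objects : 1 multilingual object).
class MultiLingualObjectSketch extends LinguisticObjectSketch {
    private final List<MonoLingualObjectSketch> variants = new ArrayList<>();

    void add(MonoLingualObjectSketch mol) { variants.add(mol); }

    // Returns the text stored for a language, or null if there is none.
    String textFor(String language) {
        for (MonoLingualObjectSketch mol : variants) {
            if (mol.language.equals(language)) return mol.text;
        }
        return null;
    }
}
```

Under this reading, a translation unit would be one multilingual object holding, for example, a German and an English monolingual object.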

Page 12: Folt - Open TMS - A presentation for universities


OpenTMS Processes

[Figure: process diagram. Labels: Pre-Translation, Memory, Converter, Back Converter, OpenTMS Translation Editor, Segmenter, Interactive Terminology Translation, Interactive Translation Memory, Data Source. Legend: Human Initiated Interactions, OpenTMS Initiated Interactions]

Page 13: Folt - Open TMS - A presentation for universities


Implementation

Page 14: Folt - Open TMS - A presentation for universities


Programming Language et al

- Java
  - Java Coding Standards
  - Java Documentation Standard
  - Delivered as jar files
  - Eclipse
- Data Sources
  - SQL DB: Hibernate based
- Documentation: UML
  - Generated ESS Model

Page 15: Folt - Open TMS - A presentation for universities


Data Sources

- Language related data are represented as "data sources"
- Idea
  - Make the data access interface independent from the data itself
  - Not being restricted to SQL databases only
    - Also flat data or XML files (TMX, XLIFF files as a data source, …)
    - Machine translation (MT) as data source
    - Spread sheets (e.g. Excel as terminology lists)
    - Object oriented databases
    - DMS systems
    - "Web Sites" (http based interfaces)
- Define a common interface for all access functions
  - Allows adaption to individual data source properties
    - e.g. read only data sources like MT
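The idea of one common access interface with per-source properties can be sketched as follows. This is an illustration only: the interface name, its methods and both implementations are assumptions made for this example and do not reproduce the actual OpenTMS data source interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical common interface shared by all data sources.
interface DataSourceSketch {
    boolean isReadOnly();

    // Stores a source/target pair; read-only sources may reject this.
    void add(String source, String target);

    // Returns the stored or generated target text for a source segment, if any.
    Optional<String> lookup(String source);
}

// A writable source kept in memory; it stands in for SQL, TMX or XLIFF backends.
class InMemoryDataSource implements DataSourceSketch {
    private final List<String[]> entries = new ArrayList<>();

    public boolean isReadOnly() { return false; }

    public void add(String source, String target) {
        entries.add(new String[] {source, target});
    }

    public Optional<String> lookup(String source) {
        for (String[] entry : entries) {
            if (entry[0].equals(source)) return Optional.of(entry[1]);
        }
        return Optional.empty();
    }
}

// A read-only source such as an MT engine; this stub only marks its input.
class StubMtDataSource implements DataSourceSketch {
    public boolean isReadOnly() { return true; }

    public void add(String source, String target) {
        throw new UnsupportedOperationException("read-only data source");
    }

    public Optional<String> lookup(String source) {
        return Optional.of("[MT] " + source);
    }
}
```

Calling code can work against `DataSourceSketch` alone and check `isReadOnly()` before attempting to write, which is one way to adapt to individual data source properties.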

Page 16: Folt - Open TMS - A presentation for universities


Data Sources

[Figure: layer diagram, top to bottom.
OPENTMS SOFTWARE: access to data sources through a standardised interface.
OpenTMS Data Source Layer: maps the OpenTMS access functions to the specific data component.
Data type specific access functions.
Various data components like files etc.]

Page 17: Folt - Open TMS - A presentation for universities


Core Data Model Status

- Data Source methods defined
  - Are extended depending on needs and requirements
- SQL
  - Access optimisation
  - Hibernate based
  - First version finished
- Other OpenSource databases…
- OODBS
  - DB4O partially implemented for testing purposes
- Other data sources
  - TMX files
  - XLIFF files
  - MT
    - Google & Microsoft Translator

Page 18: Folt - Open TMS - A presentation for universities


Data Source Core Functions

- Data Sources
  - Create
  - Delete
  - Import TMX, XLIFF File
  - Export TMX, XLIFF File
  - Copy between data sources
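As an illustration of the TMX import function, the sketch below reads source/target segment pairs from a TMX document using only the XML parser shipped with Java. It relies on the standard TMX element names (`tu`, `tuv`, `seg`) and the `xml:lang` attribute; the class and method names are hypothetical, and the slides state that OpenTMS itself currently uses JDOM, so this is not its implementation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class TmxImportSketch {
    // Returns source -> target segments for the two given language codes.
    static Map<String, String> readPairs(String tmx, String srcLang, String tgtLang) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(tmx.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        Map<String, String> pairs = new LinkedHashMap<>();
        NodeList units = doc.getElementsByTagName("tu");
        for (int i = 0; i < units.getLength(); i++) {
            String src = null;
            String tgt = null;
            NodeList variants = ((Element) units.item(i)).getElementsByTagName("tuv");
            for (int j = 0; j < variants.getLength(); j++) {
                Element tuv = (Element) variants.item(j);
                NodeList segs = tuv.getElementsByTagName("seg");
                if (segs.getLength() == 0) continue; // skip variants without a segment
                String lang = tuv.getAttribute("xml:lang");
                String text = segs.item(0).getTextContent();
                if (lang.equals(srcLang)) src = text;
                if (lang.equals(tgtLang)) tgt = text;
            }
            // Keep only units that contain both requested languages.
            if (src != null && tgt != null) pairs.put(src, tgt);
        }
        return pairs;
    }
}
```

The resulting map could then be written into any writable data source, which is also roughly what copying between data sources amounts to.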

Page 19: Folt - Open TMS - A presentation for universities


Fuzzy Search – Core Function of TM

- Step 1: Search in KD-TREE
  - Restricts the number of strings to search
  - Finds possible matching strings
- Step 2: Levenshtein Similarity
  - Compare the matches from step 1 to determine their real similarity
- Step 3: Get source and target MOLs / MUL
  - Create translation (alt-trans)
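The three steps can be sketched in a few lines. Two assumptions are made here and should be read as such: the KD-tree of step 1 is replaced by a much simpler length-based prefilter, and similarity is defined as the percentage of characters left unchanged by the Levenshtein edit distance. The slides do not state the exact formula OpenTMS uses, and all names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class FuzzySearchSketch {
    // Levenshtein edit distance, computed with two rolling rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in percent, rounded down; 100 means identical (assumed formula).
    static int similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 100 : 100 * (max - distance(a, b)) / max;
    }

    // memory maps source segments to target segments; returns matching targets.
    static List<String> search(Map<String, String> memory, String query, int minQuality) {
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, String> entry : memory.entrySet()) {
            String source = entry.getKey();
            int max = Math.max(source.length(), query.length());
            int lengthDiff = Math.abs(source.length() - query.length());
            // Step 1 (stand-in for the KD-tree): the edit distance is at least the
            // length difference, so this bound can discard candidates cheaply.
            if (max > 0 && 100 * (max - lengthDiff) / max < minQuality) continue;
            // Step 2: determine the real similarity of the remaining candidates.
            if (similarity(source, query) >= minQuality) {
                // Step 3: return the target side of the matching unit.
                targets.add(entry.getValue());
            }
        }
        return targets;
    }
}
```

A threshold of 80 for `minQuality` would mirror the `dataSourceMatchQuality=80` value used in the XML-RPC call later in the slides, although whether both use the same scale is an assumption.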

Page 20: Folt - Open TMS - A presentation for universities


Data Source Configuration

- SQL Data Source contained in hibernate directory
- Existing data sources contained in database directory

Page 21: Folt - Open TMS - A presentation for universities


Translation Core Functions

- Convert (to and from XLIFF)
  - Currently done externally by Araya
  - Complex document formats like WinWord etc. through OpenOffice converters
- Segment
  - Currently done externally by Araya
- Translate

Page 22: Folt - Open TMS - A presentation for universities


Current Data Source Interface

Page 23: Folt - Open TMS - A presentation for universities


Security

Managing Security

Page 24: Folt - Open TMS - A presentation for universities


Security Levels

- Level 0
  - No security procedures are applied; data are transferred as they are.
- Level 1
  - The communication channel is secured using standard secure protocols.
- Level 2
  - Security is applied at the data level. Basically this means that strings are encrypted when they are sent through a communication channel or written to or retrieved from a database. This also covers encrypted XLIFF files (or parts of them).
- Level 4
  - GUI level related security
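Level 2, encryption of individual strings, can be illustrated with the standard Java cryptography API. The slides do not say which cipher or key management OpenTMS uses, so AES in GCM mode with a freshly generated key is purely an assumption for this sketch, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class SegmentCryptoSketch {
    private static final int IV_BYTES = 12;   // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts one string; the random nonce is stored in front of the ciphertext.
    static String encrypt(SecretKey key, String plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] encrypted = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + encrypted.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(encrypted, 0, out, iv.length, encrypted.length);
            // Base64 keeps the result safe to store in XML or a text column.
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses encrypt(); fails if the data was modified or the key is wrong.
    static String decrypt(SecretKey key, String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With such helpers, only the text of a segment would be replaced by its encrypted form before it is sent or stored, while the surrounding XML structure stays readable.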

Page 25: Folt - Open TMS - A presentation for universities


Security and Files

- Protection of parts of the document
  - Encrypt specific parts of the XML documents
- Additional security when transferring files
  - Even if a file gets into the wrong hands, the file cannot be read.
- Secure XLIFF
  - Source
  - Target
- Secure TBX
- Secure TMX
  - TU…

Page 26: Folt - Open TMS - A presentation for universities


Security

Eclipse

Page 27: Folt - Open TMS - A presentation for universities


Eclipse Core Methods

Page 28: Folt - Open TMS - A presentation for universities


Eclipse RPC Server & Utility Methods

Page 29: Folt - Open TMS - A presentation for universities


Eclipse GUI Methods

Page 30: Folt - Open TMS - A presentation for universities


XML-RPC Interface

- openTMX.xml contains access functions

rem %1 = data source name, %2 = source document
call env.bat
call java -Xmx1024m %OPENTMSJAVABASE% de.folt.rpc.client.OpenTMSClient ^
  message=TranslateDocument ^
  sourceDocument=%2 ^
  "sklDocument=%2.skl" ^
  "xliffDocument=%2.xlf" ^
  "segDocument=%2.seg.xlf" ^
  "translatedDocument=%2.trans.xlf" ^
  "paragraphBasesSegmentation=yes" ^
  segmentBreakOnCrLf=1 ^
  dataSourceName=%1 ^
  dataSourceMatchQuality=80 ^
  sourceDocumentLanguage=de ^
  targetDocumentLanguage=en ^
  sourceDocumentEncoding=UTF-8 ^
  targetDocumentEncoding=UTF-8 ^
  inputDocumentType=FILE ^
  dataSourceType=sql
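The client is called with a list of `key=value` arguments. How `OpenTMSClient` actually processes them is not shown on the slide; the sketch below only illustrates, under that caveat, how such arguments could be collected into a parameter map before being sent as an XML-RPC request. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ClientArgsSketch {
    // Splits each argument at its first '=' so that values may contain '=' themselves.
    static Map<String, String> parse(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + arg);
            }
            params.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return params;
    }
}
```

Values stay plain strings here; a numeric parameter such as `dataSourceMatchQuality` would be converted by whichever function consumes it.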

Page 31: Folt - Open TMS - A presentation for universities


Current Implementation

- openTMS.jar
  - Contains compiled classes and source code
- arayaserver-opentms.jar
  - Conversion functions
  - Compiled classes
- External.jar
  - External classes for Araya (parser etc.)
- Hibernate directory
  - Hibernate jar files
- Database jdbc driver
  - Database driver jar files

Page 32: Folt - Open TMS - A presentation for universities


Integration Araya XLIFF Editor

Page 33: Folt - Open TMS - A presentation for universities


Ubuntu VM Distribution

Page 34: Folt - Open TMS - A presentation for universities


Data Source Editor

Edit MOL/MOL Properties

Search Functions

Delete & Save Functions

Language Specific Segments

Page 35: Folt - Open TMS - A presentation for universities


Downloads

- http://sourceforge.net/projects/open-tms
- Ubuntu Version
- Windows Version: www.heartsome.de/arayatest/opentmsserver.exe
- In the XLIFF Editor: www.heartsome.de/arayatest/araya-freeversion.exe
- YourKit Java Profiler for performance measurements

Page 36: Folt - Open TMS - A presentation for universities


Possible Contributions

- XML Parser!
  - Generalise OpenTMS XML interfaces to support any kind of XML parser (currently JDOM)
  - Faster XML parser?!
- Logging Package
  - Optimised, line numbers, class names
- Exception Handling
  - Improvement
- Localisation approach / String handling
- Test Environment
- XLIFF / TMX package improvements
  - TBX reader
  - SRX segmentation

Page 37: Folt - Open TMS - A presentation for universities


Possible Contributions Converters

- Document Converters
  - XML
  - OpenOffice as central converter for txt, rtf, doc, xls, ppt…
  - MIF
  - …
- Data Model Converter
  - Trados
  - Star
  - Across
  - …

Page 38: Folt - Open TMS - A presentation for universities


Contact

Heartsome Europe GmbH
Friedrichstr. 17
D-90574 Roßtal

www.heartsome.de

Dr. Klemens Waldhör

T: +49 9127 579001
F: +49 9127 951178
[email protected]