Introduction to ELAN Mary Chambers ELAP, Department of Linguistics, SOAS.


Introduction to ELAN

Mary Chambers, ELAP, Department of Linguistics, SOAS

 


What is ELAN?

EUDICO Linguistic Annotator

- Annotation tool developed by MPI: create, edit, view and search annotations for video and audio data
- Links text annotations with audio and/or video data
- One audio stream, up to four video streams
- Annotations are on tiers; these can be independent or linked to other tiers
- No limit to the number of tiers
- Tiers can be hidden or rearranged for ease of use
- ELAN files can be exported in a variety of formats (including to Shoebox/Toolbox for interlinearisation, then reimported)


Demonstration...


Tiers, types, and stereotypes...

- Imagine an annotated text with two speakers, with a transcription and free translation
- There are 4 tiers: tx@speakerA, tx@speakerB, ft@speakerA, ft@speakerB
- There are 2 types of tier: tx (text), and ft (free translation)
- Each 'type' is further categorised according to its stereotype: the way tiers of this type combine with other tiers...
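The naming pattern in this example (type, then `@`, then speaker) can be illustrated with a short Python sketch. Only the four tier names come from the slide; the helper names `split_tier_name` and `group_by_type` are hypothetical and are not part of ELAN.

```python
from collections import defaultdict

# Tier names from the example above, in the form <type>@<speaker>
TIER_NAMES = ["tx@speakerA", "tx@speakerB", "ft@speakerA", "ft@speakerB"]


def split_tier_name(name):
    """Split a 'type@speaker' tier name into (type, speaker)."""
    tier_type, _, speaker = name.partition("@")
    return tier_type, speaker


def group_by_type(names):
    """Map each tier type to the speakers that have a tier of that type."""
    groups = defaultdict(list)
    for name in names:
        tier_type, speaker = split_tier_name(name)
        groups[tier_type].append(speaker)
    return dict(groups)


if __name__ == "__main__":
    for tier_type, speakers in group_by_type(TIER_NAMES).items():
        print(tier_type, speakers)
```

Grouping the four tiers this way gives the two types, tx and ft, each with one tier per speaker.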


Tiers

Each speaker can have their own set of tiers, so overlapping speech is not a problem.

Tiers can contain many kinds of annotations. Some of the most obvious are:

- IPA transcription
- practical orthographic transcription
- free translations into languages of wider communication
- morphemes and gloss
- gesture annotation
- grammar notes
- any other information which seems relevant


Linguistic types

Every annotation tier must be assigned a linguistic type, which tells ELAN what type of information the tier contains.

Stereotypes:

- None: The annotation on the tier is linked directly to the time axis (e.g. intonation units/sentences: a transcription or a reference number).
- Time Subdivision: The annotation on the parent tier can be subdivided into smaller units, which, in turn, can be linked to time intervals (e.g. words). There cannot be gaps between units.
- Symbolic Subdivision: Similar to Time Subdivision, except that the smaller units cannot be linked to a time interval (e.g. morphemes within words).
- Included In: Like Time Subdivision, but there can be gaps (e.g. words, with silence between them).
- Symbolic Association: One-to-one association with a parent tier, e.g. transcription with ref field, gloss with morpheme, free translation with sentence.
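The difference between Time Subdivision and Included In comes down to whether the child units may leave part of the parent interval uncovered. The following is a minimal sketch of that one rule, not ELAN's own validation logic. The function names, the millisecond intervals and the example values are all assumptions made for illustration, and the sketch assumes the child intervals already lie inside the parent and do not overlap.

```python
def has_gaps(parent, children):
    """Return True if the child intervals leave part of the parent uncovered.

    parent is a (start_ms, end_ms) pair; children is a list of such pairs,
    assumed to lie inside the parent and not to overlap each other.
    """
    cursor = parent[0]
    for start, end in sorted(children):
        if start != cursor:
            return True
        cursor = end
    return cursor != parent[1]


def allowed(stereotype, parent, children):
    """Apply the gap rule for the two time-aligned child stereotypes."""
    if stereotype == "Time Subdivision":
        return not has_gaps(parent, children)
    if stereotype == "Included In":
        return True  # gaps between units are permitted
    raise ValueError("no gap rule modelled for " + stereotype)


# Hypothetical utterance of two seconds, split into three words
utterance = (0, 2000)
words_contiguous = [(0, 700), (700, 1300), (1300, 2000)]
words_with_silence = [(0, 600), (800, 1300), (1500, 2000)]

if __name__ == "__main__":
    print(allowed("Time Subdivision", utterance, words_contiguous))
    print(allowed("Time Subdivision", utterance, words_with_silence))
    print(allowed("Included In", utterance, words_with_silence))
```

Words separated by silence fail the Time Subdivision check but are acceptable on an Included In tier.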


Tier dependencies: parents and children

Document X (types in brackets)

Text/utterances (speaker A) (none)
    Words (time subdivision)
        Morphemes (symbolic subdivision)
            Parts of speech (symbolic association)
            Glosses (symbolic association)
    Free translations (symbolic association)

Text/utterances (speaker B)
    Words
        Morphemes
            Parts of speech
            Glosses
    Free translations
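A parent/child hierarchy like this can be written down as a small nested structure and repeated for each speaker. In the sketch below the nesting is an assumption inferred from the stereotypes (morphemes under words, parts of speech and glosses under morphemes, free translations directly under the utterance). The `Tier@speaker` labels and the `outline` function are hypothetical.

```python
# (tier, stereotype, children) for one speaker; the nesting is an assumed reading
HIERARCHY = ("Text/utterances", "None", [
    ("Words", "Time Subdivision", [
        ("Morphemes", "Symbolic Subdivision", [
            ("Parts of speech", "Symbolic Association", []),
            ("Glosses", "Symbolic Association", []),
        ]),
    ]),
    ("Free translations", "Symbolic Association", []),
])


def outline(node, speaker, depth=0):
    """Return indented lines of the form 'Tier@speaker (stereotype)'."""
    name, stereotype, children = node
    lines = ["  " * depth + f"{name}@{speaker} ({stereotype})"]
    for child in children:
        lines.extend(outline(child, speaker, depth + 1))
    return lines


if __name__ == "__main__":
    for speaker in ("A", "B"):
        print("\n".join(outline(HIERARCHY, speaker)))
```

Because the same structure is reused for each speaker, the types and stereotypes are defined once and only the tiers are duplicated.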


Is it worth it?

- Time-alignment is time-consuming!
- Tiers, types, and stereotypes only have to be set up once
- Output is time-aligned transcription in XML which can be used for many purposes:
    - Archival
    - Import to Toolbox for interlinearisation
    - Import to DVD-authoring software
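Because the output is XML, other tools can read the time-aligned transcription with a standard parser. The sketch below uses Python's standard library on a hand-written, heavily simplified sample. The element and attribute names follow the ELAN file format as far as I know it and should be checked against a real file. The function name `aligned_annotations` and the sample content are assumptions.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample in the style of an ELAN file (not a complete file)
SAMPLE_EAF = """\
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="tx@speakerA" LINGUISTIC_TYPE_REF="tx">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>example utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def aligned_annotations(eaf_text):
    """Return (tier_id, start_ms, end_ms, text) for each time-aligned annotation."""
    root = ET.fromstring(eaf_text)
    # Map each time slot ID to its value in milliseconds
    times = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = times.get(ann.get("TIME_SLOT_REF1"))
            end = times.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((tier.get("TIER_ID"), start, end, text))
    return rows


if __name__ == "__main__":
    for row in aligned_annotations(SAMPLE_EAF):
        print(row)
```

Only annotations linked directly to the time axis are collected here; annotations on symbolically associated tiers refer to a parent annotation instead of time slots and would need a second pass.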


Different workflows are possible

ELAN files can be imported/exported in a variety of formats, including Shoebox/Toolbox:

- Toolbox → ELAN
- ELAN → Toolbox
- Transcriber → ELAN → Toolbox
- Back and forth?


Working with Toolbox

This is not entirely straightforward, but is not too difficult if you are already quite familiar with the workings of Toolbox and the structure of its files.

If you know you want to export to Toolbox, it’s better to start from the beginning with a ref type and tier (stereotype: None). For now this tier will contain only time information (i.e. its annotations will be empty), but later it will contain a Toolbox ref number. The transcription tier will be a symbolic association depending on the ref tier.

The Toolbox export process puts the time and speaker information in separate fields. After working in Toolbox, ELAN can import the file, and the time and speaker information will be preserved.
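To show how time and speaker information can sit in separate fields of a Toolbox record, here is a rough sketch. It is not ELAN's exporter. The marker names `\ELANBegin`, `\ELANEnd` and `\ELANParticipant` are what I understand the export to use and should be treated as assumptions. The seconds format, the reference number `Text.001` and both function names are hypothetical.

```python
def to_toolbox_record(ref, text, start_s, end_s, speaker):
    """Format one annotation as a Toolbox-style record, one marker per line."""
    fields = [
        ("ref", ref),
        ("ELANBegin", f"{start_s:.3f}"),   # assumed marker name and time format
        ("ELANEnd", f"{end_s:.3f}"),       # assumed marker name and time format
        ("ELANParticipant", speaker),      # assumed marker name
        ("tx", text),
    ]
    return "\n".join(f"\\{marker} {value}" for marker, value in fields)


def parse_toolbox_record(record):
    """Read a record back into a {marker: value} dictionary."""
    parsed = {}
    for line in record.splitlines():
        if line.startswith("\\"):
            marker, _, value = line[1:].partition(" ")
            parsed[marker] = value
    return parsed


if __name__ == "__main__":
    record = to_toolbox_record("Text.001", "example utterance", 0.0, 1.5, "speakerA")
    print(record)
    print(parse_toolbox_record(record))
```

As long as the time and speaker fields are left untouched while working in Toolbox, they can be read back unchanged, which is what allows the re-import to restore the alignment.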


Any questions?