20 ways to mark up a sentence

Stuart Yeates, New Zealand Electronic Text Centre

Who am I?

Computer Science PhD (Waikato) by training

4 years at Oxford, working mainly in TEI

Several years at Victoria, working partly on TEI

Immersed in a library tradition

Actively building New Zealand / Māori content

Current member of the TEI Council

Who are you?

What is the TEI?

Consortium of people and institutions

Heavy on:

Linguists (British National Corpus, dictionaries)

Medievalists (EEBO)

Dramatists / playwrights

Diarists

Produces a TEI schema and a range of tools

20 ways to mark up a sentence 1

John has a horse.

New tag: ... indicates that the text within the tag is a paragraph

This syntax is identical to that of HTML

Unlike HTML, the TEI version has specified semantics
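
The tag itself did not survive in this transcript. As a minimal sketch, assuming the missing tag is TEI's paragraph element `<p>` and leaving out the TEI namespace for brevity, the fragment can be parsed with Python's standard library to show that the markup wraps the sentence without changing its text:

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the slide's example: the tag name is an
# assumption (TEI's paragraph element), and the TEI namespace is omitted.
fragment = "<p>John has a horse.</p>"

para = ET.fromstring(fragment)
print(para.tag)   # name of the element wrapping the sentence
print(para.text)  # the sentence itself, unchanged by the markup
```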

20 ways to mark up a sentence 2

John has a horse.

New tag: ... indicates that the text within the tag is a sentence

20 ways to mark up a sentence 3

John has a horse.

New tag: ... indicates that the text within the tag is a phrase

The type and function attributes provide additional information
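
The sentence and phrase tags are also missing from this transcript. The sketch below assumes they are TEI's `<s>` and `<phr>` elements; the attribute values `verb` and `predicate` are hypothetical and only illustrate where `type` and `function` would sit:

```python
import xml.etree.ElementTree as ET

# Assumed markup: <s> for the sentence, <phr> for the phrase.
# The attribute values are invented for illustration.
fragment = (
    "<p><s>John "
    '<phr type="verb" function="predicate">has a horse</phr>'
    ".</s></p>"
)

para = ET.fromstring(fragment)
phrase = para.find("s/phr")

# The attributes carry the extra information about the phrase.
print(phrase.get("type"), phrase.get("function"))

# Stripping the tags gives back the plain sentence.
plain = "".join(para.itertext())
print(plain)
```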

20 ways to mark up a sentence 4

John has a horse.

The xml:id attribute provides a unique reference for this paragraph

Combined with the URL at which the document is found, it forms a globally unique identifier

Used to build replacements for references such as "Chapter 3, page 45, line 5"

http://www.nzetc.org/tm/scholarly/tei-GorLaws-t1-g1-t1-body1-d1-d34.html#tei-GorLaws-t1-g1-t1-body1-d1-d34
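
A rough sketch of how such a reference can be assembled: read the `xml:id` from the element and append it to the document URL as a fragment. The identifier `p1` and the URL `http://example.org/text.html` are hypothetical placeholders, not values from the collection.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical identifier; the TEI namespace is omitted for brevity.
fragment = '<p xml:id="p1">John has a horse.</p>'
para = ET.fromstring(fragment)
para_id = para.get(XML_ID)

# Hypothetical document URL; URL plus '#' plus xml:id names this paragraph.
document_url = "http://example.org/text.html"
reference = f"{document_url}#{para_id}"
print(reference)
```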

20 ways to mark up a sentence 5

John has a horse.

New tag: ... indicates that the text within the tag is a word

The type attribute tells us what kind of word

20 ways to mark up a sentence 6

John has a horse.

New tag: ... indicates that the text within the tag is a name

The key attribute gives us the key into an external database that identifies the named entity

Links to an authority control system
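
As an illustration only, assuming the missing tag is TEI's `<name>` element, the sketch below looks the `key` value up in a stand-in authority file. The key `P0001` and the dictionary `AUTHORITY` are invented for the example; a real system would query its own authority control database.

```python
import xml.etree.ElementTree as ET

# Hypothetical authority file: key -> authoritative form of the name.
AUTHORITY = {"P0001": "Smith, John (hypothetical record)"}

# Assumed markup; the key value is invented for illustration.
fragment = '<p><name type="person" key="P0001">John</name> has a horse.</p>'
para = ET.fromstring(fragment)

name = para.find("name")
record = AUTHORITY[name.get("key")]

print(name.text, "->", record)
```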

20 ways to mark up a sentence 7

John has a horse.

The xml:lang attribute indicates the language of a passage

en and mi are very common in our collections; la, sm and rap also occur

Dialect, regional and temporal variations can also be encoded: en_NZ, mi_Trad, mi_Modr

Used for searching and analysis
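
One way language-aware searching could work is sketched below: `xml:lang` is inherited from the nearest ancestor that sets it, so each paragraph is grouped under its effective language. The Latin sentence and the helper `passages_by_language` are hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical two-paragraph text; the second paragraph overrides
# the language inherited from <text>.
document = (
    '<text xml:lang="en">'
    "<p>John has a horse.</p>"
    '<p xml:lang="la">Ioannes equum habet.</p>'
    "</text>"
)

def passages_by_language(elem, inherited=None, found=None):
    """Group paragraph text by effective xml:lang (own value or inherited)."""
    found = {} if found is None else found
    lang = elem.get(XML_LANG, inherited)
    if elem.tag == "p":
        found.setdefault(lang, []).append("".join(elem.itertext()))
    for child in elem:
        passages_by_language(child, lang, found)
    return found

index = passages_by_language(ET.fromstring(document))
print(index)
```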

20 ways to mark up a sentence 8

John has a horse called r.

New tag: ... indicates that the text within the tag is in a foreign language

"Foreign" is defined relative to the surrounding text, not the broader context.

The spectrum between foreign words and loan words is complex

20 ways to mark up a sentence 9

John has a horse.

Nō Hone tētahi hōiho.

The corresp attribute links together two similar passages

Used when there are multiple representations below the work level

20 ways to mark up a sentence 10

John has a horse dog.

New tag: ... indicates that the text within the tag is an addition

New tag: ... indicates that the text within the tag is a deletion
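
The tags were stripped from the example above, which is why it reads "horse dog". A sketch under the assumption that they are TEI's `<del>` and `<add>` elements: one encoding holds both states of the text, and each reading falls out by skipping one kind of element. The helper `reading` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction: "horse" was deleted and "dog" was added.
fragment = "<p>John has a <del>horse</del><add>dog</add>.</p>"

def reading(xml_text, drop):
    """Return the text with every <drop> element and its content left out."""
    parts = []

    def walk(elem):
        if elem.tag != drop:
            if elem.text:
                parts.append(elem.text)
            for child in elem:
                walk(child)
        # Text after an element belongs to the parent, so it is always kept.
        if elem.tail:
            parts.append(elem.tail)

    walk(ET.fromstring(xml_text))
    return "".join(parts)

before = reading(fragment, drop="add")  # text before the revision
after = reading(fragment, drop="del")   # text after the revision
print(before)
print(after)
```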

20 ways to mark up a sentence 11

Mary Smith Jane Smith ... John has a horsedog.

New tag: ... a creator in a fine-grained sense

The hand attribute assigns responsibility for a tag

The '#' character indicates a pointer to an xml:id

#, xml:id and xml:lang work in all modern XML
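
A sketch of how a `#` pointer is resolved: index every element by its `xml:id`, then strip the `#` and look the target up. The element name `<handNote>` and the identifiers `ms` and `js` are assumptions, since the original tag did not survive in this transcript.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Assumed markup: <handNote> and the ids "ms"/"js" are hypothetical.
document = """
<text>
  <handNote xml:id="ms">Mary Smith</handNote>
  <handNote xml:id="js">Jane Smith</handNote>
  <p>John has a <del hand="#ms">horse</del><add hand="#js">dog</add>.</p>
</text>
"""

root = ET.fromstring(document)

# Index every element that carries an xml:id.
by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}

def resolve(pointer):
    """Follow a '#id' pointer to the element with that xml:id."""
    if not pointer.startswith("#"):
        raise ValueError(f"not a local pointer: {pointer!r}")
    return by_id[pointer[1:]]

# Who is responsible for each change?
responsible = {
    el.tag: resolve(el.get("hand")).text
    for el in root.iter()
    if el.get("hand")
}
print(responsible)
```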

20 ways to mark up a sentence 12

John has horse.

New tag: indicates that the text within the tag is unclear or uncertain

Responsibility can be assigned using the hand attribute

20 ways to mark up a sentence 13

John has a horse.

New tag: indicates a text quoted or embedded within another

Treaty of Waitangi

20 ways to mark up a sentence 14

John has a horse,
of course, of course

New tag: indicates a line group

New tag: indicates a line

Poetry, drama, song, liturgy, etc

20 ways to mark up a sentence 15

John has a horse,
of course, of course

New tag: indicates words which are important in the rhyme scheme

The label attribute is used to mark which sets of words rhyme with which others

Can use IPA labels
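
A sketch of the verse markup, assuming the missing tags are TEI's `<lg>` (line group), `<l>` (line) and `<rhyme>` elements. The label `a` is a hypothetical value; words that share a label are treated as rhyming with each other.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed reconstruction of the slide's couplet; label "a" is invented.
fragment = (
    "<lg>"
    '<l>John has a <rhyme label="a">horse</rhyme>,</l>'
    '<l>of course, of <rhyme label="a">course</rhyme></l>'
    "</lg>"
)

line_group = ET.fromstring(fragment)

# Each <l> is one verse line.
lines = ["".join(line.itertext()) for line in line_group.findall("l")]

# Group the rhyming words by their label.
rhymes = defaultdict(list)
for word in line_group.iter("rhyme"):
    rhymes[word.get("label")].append(word.text)

print(lines)
print(dict(rhymes))
```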

20 ways to mark up a sentence 16

John has a horse,
of course, of course

The attribute met indicates the metrical information for the line group

20 ways to mark up a sentence 17

John has a horse.

New tag: is used to synchronise media, either parallel texts or text with audio, video, choreography, etc.

Can use symbolic identifiers or relative times or absolute times

20 ways to mark up a sentence 18

John has a horse.

New tag: indicates a glyph which is special in some way.

Illuminated manuscripts, misprints, custom characters, etc., etc.

20 ways to mark up a sentence 19

Mary John has a horse.

New tag: indicates a passage of speech

New tag: indicates who is speaking
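
A sketch of the drama markup, assuming the two missing tags are TEI's `<sp>` (speech) and `<speaker>` elements, with the spoken words in a paragraph:

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction: Mary is the speaker, the sentence is her speech.
fragment = "<sp><speaker>Mary</speaker><p>John has a horse.</p></sp>"

speech = ET.fromstring(fragment)
who = speech.findtext("speaker")
what = speech.findtext("p")

print(f"{who}: {what}")
```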

20 ways to mark up a sentence 21

Mary Smith

John has a horse

New tag: a bibliographic reference

and hopefully self explanatory

used when complete bibliographic information is available

Can be stored in the header and referenced in the document body

20 ways to mark up a sentence 22

John has a horse.

New tag: is a tag like a paragraph but without the semantic baggage of a paragraph

New tag: spans zero or more characters but makes no semantic assumptions about those characters

New tag: is a point in the text whose marking makes no semantic assumptions

20 ways to mark up a sentence 23

New tag: includes content from a different place

Local files, web downloads, RSS feeds, lifting quotes directly from the original, etc.

Reuse, unreasonably large files, separation of fixed and dynamic content, etc.
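
The inclusion tag is missing from this transcript. A sketch assuming it is the W3C XInclude element `xi:include`, processed with Python's `xml.etree.ElementInclude`. The file name `horse.xml` and the in-memory `FILES` table are hypothetical stand-ins for local files or downloads.

```python
from xml.etree import ElementInclude
import xml.etree.ElementTree as ET

# Hypothetical "file system": href -> content.
FILES = {"horse.xml": "<p>John has a horse.</p>"}

def loader(href, parse, encoding=None):
    """Return the included resource as an element (xml) or a string (text)."""
    if parse == "xml":
        return ET.fromstring(FILES[href])
    return FILES[href]

host = ET.fromstring(
    '<div xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="horse.xml"/>'
    "</div>"
)

# Replace each xi:include element in place with the content it points at.
ElementInclude.include(host, loader=loader)

included = host[0]
print(included.tag, included.text)
```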

20 ways to mark up a sentence 24

John has a horse

New tags: compilation entries

Dictionaries, encyclopedias, lexicons, word lists, bilingual dictionaries, etc.

20 ways to mark up a sentence 25

John has a horse

New tag: used for notes

Footnotes, sidenotes, endnotes, etc.

Can be nested to arbitrary depth

Can be painful to render
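
One reason rendering is painful is that a note can contain further notes. A sketch assuming the missing tag is TEI's `<note>` element: the running text gets numbered markers and the notes are collected separately, recursing into nested notes. The note contents and the helper `split_notes` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical note text, with one note nested inside another.
fragment = (
    "<p>John has a horse"
    "<note>A mare<note>According to Mary</note></note>"
    ".</p>"
)

def split_notes(elem, notes):
    """Return elem's running text; collect note texts, in order, in `notes`."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "note":
            number = len(notes) + 1
            notes.append(None)  # reserve this number before recursing
            notes[number - 1] = split_notes(child, notes)
            parts.append(f"[{number}]")
        else:
            parts.append(split_notes(child, notes))
        parts.append(child.tail or "")
    return "".join(parts)

notes = []
body = split_notes(ET.fromstring(fragment), notes)

print(body)
for number, text in enumerate(notes, start=1):
    print(f"[{number}] {text}")
```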

20 ways to mark up a sentence 26

John has a horse

New tag: introduces a new character

New tag: names the new character

Invented languages, minority languages, bizarre print practices, etc.

Can be painful to render

20 ways to mark up a sentence 27

John has a horse.

New tag: indicates the placement of a page break, often links to a page image

A milestone tag, so it doesn't interrupt the logical structure
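
A sketch of the milestone idea, assuming the missing tag is TEI's empty `<pb/>` element. The page number `46` and the position of the break are hypothetical. Because the break is an empty element inside the paragraph, the paragraph stays a single unit and its text reads straight through.

```python
import xml.etree.ElementTree as ET

# Assumed markup: an empty <pb/> in mid-sentence; n="46" is invented.
fragment = '<p>John has <pb n="46"/>a horse.</p>'

para = ET.fromstring(fragment)
page_break = para.find("pb")

# The logical structure is untouched: still one paragraph, one sentence.
sentence = "".join(para.itertext())

# The physical structure is still recoverable from the milestone.
before_break = para.text
after_break = page_break.tail

print(sentence)
print(repr(before_break), "| page", page_break.get("n"), "|", repr(after_break))
```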

Questions?

Which pub are we heading to?