Leiden University. The university to discover.
DMT Week 3
Adriaan van der Weel and Peter Verhaar
Where do we stand?
Principles of markup
- HTML:
  - Document instance (your CV)
  - Stylesheet (CSS)
- Application
- Document instance (your CV)
- Stylesheet (CSS)
- DTD/Schema
- Add: Prologue (XML declaration; DTD)
Text and markup
Knowledge representation
- Structure and content
- Ontology
  - What knowable things exist
  - What are the relationships that hold between them
- Tree diagram
  - The book has structure and content: chapters, paragraphs, footnotes, etc.
- XML represents structure and content
- Various ontologies, various DTDs
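The tree-diagram idea can be made concrete with a minimal Python sketch. The element names (book, chapter, paragraph, footnote) are illustrative assumptions taken from the bullet above, not a prescribed vocabulary; the sketch only shows that nested markup can be read back as an indented tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature "book"; the element names are assumptions for illustration.
BOOK = (
    "<book><chapter>"
    "<paragraph>Text<footnote>Note</footnote></paragraph>"
    "<paragraph>More text</paragraph>"
    "</chapter></book>"
)


def tree_lines(element, depth=0):
    """Return one indented line per element, depth-first (a textual tree diagram)."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(tree_lines(child, depth + 1))
    return lines


outline = tree_lines(ET.fromstring(BOOK))
print("\n".join(outline))
```

A different ontology (for example, a play with acts and scenes) would yield a different tree, which is why different DTDs exist.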
XML Basics 1
- Elements: <p>...</p>
- Attributes: <title type="play">...</title>
- Entities
  - Character: &egrave; = è
  - General entities, referencing:
    • Chunks of text defined elsewhere
    • Text or image files, etc.
    • E.g., <p>The &BTCP; aims to ... </p>
- Well-formedness, validation
- Prologue (XML decl.; DTD)
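As a hedged illustration of the points above, the following Python sketch parses a tiny document with an element, an attribute, and a general entity declared in the prologue. The document content and the expansion text for &BTCP; are placeholders invented for the example. Python's standard library parser checks well-formedness only; it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the entity's replacement text is a placeholder,
# not the real meaning of BTCP.
DOC = """<?xml version="1.0"?>
<!DOCTYPE text [
  <!ENTITY BTCP "placeholder expansion of BTCP">
]>
<text>
  <title type="play">An example title</title>
  <p>The &BTCP; aims to ... </p>
</text>"""


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML (no DTD validation)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


root = ET.fromstring(DOC)
title_type = root.find("title").get("type")  # attribute value
paragraph = root.find("p").text              # entity reference already expanded

print(title_type)
print(paragraph)
print(is_well_formed("<title type=play>x</title>"))  # unquoted attribute value
```

The last call shows why attribute values must be quoted: without the quotes the parser rejects the document as not well-formed.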
XML Basics 2
- Open standard (cf. de facto standard):
  - Publicly available
  - Royalty-free
  - Fully and publicly documented
- NB: ‘Who owns your data?’
- (Lower) ASCII and Unicode:
  - Platform independent
  - Software independent
  - Device independent
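A small Python sketch may help show why (lower) ASCII travels so well. It assumes nothing beyond the standard library: a non-ASCII character is written as a numeric character reference, and pure ASCII text turns out to be byte-for-byte identical in ASCII and UTF-8.

```python
# "\u00e8" is the character è, written as an escape so this file stays pure ASCII.
text = "\u00e8"

# Encode to ASCII, replacing what ASCII cannot hold with an XML character reference.
ascii_bytes = text.encode("ascii", errors="xmlcharrefreplace")

# The same character in UTF-8 takes two bytes.
utf8_bytes = text.encode("utf-8")

# Plain ASCII text has exactly the same bytes in both encodings.
plain = "TEI"
same_bytes = plain.encode("ascii") == plain.encode("utf-8")

print(ascii_bytes, utf8_bytes, same_bytes)
```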
Open standards 1
- Open standards in a networking world
- Why?
- Which? E.g., Internet Protocol Suite:
  - Link layer (physical/data, e.g., Ethernet)
  - Internet layer, facilitating transport, e.g., IP
  - Transport layer, e.g., TCP
  - Application layer, e.g., HTTP, SMTP, FTP
Open standards 2
- E.g.:
  - File format: PDF, txt
  - Programming language: PHP
  - Operating system: Linux
  - Style language: CSS, XSLT
  - Markup metalanguage: SGML, XML
  - Markup language: DocBook, HTML, EAD, TEI
TEI basics
- Text Encoding Initiative, 1987
- Text exchange in the humanities
- TEI is a DTD
  - TEI is a collection of DTD fragments or modules
- Platform and software independent (ASCII); open standard; open source
- Used in an XML application (diagram)
- Document ‘instances’ should be validated against the TEI DTD
TEI DTD
- The TEI DTD is modular. We use:

  <!DOCTYPE TEI PUBLIC "-//TEI P5//DTD Main Document Type//EN"
    "http://www.tei-c.org/release/xml/tei/schema/dtd/tei.dtd" [
    <!ENTITY % TEI.header "INCLUDE">
    <!ENTITY % TEI.core "INCLUDE">
    <!ENTITY % TEI.textstructure "INCLUDE">
    <!ENTITY % TEI.transcr "INCLUDE">
    <!ENTITY % TEI.linking "INCLUDE">
    <!ENTITY % TEI.namesdates "INCLUDE">
  ]>

- DTD files: http://www.tei-c.org/release/xml/tei/schema/dtd/
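To see what the prologue actually declares, here is a minimal Python sketch that reads the DOCTYPE with the standard library's expat bindings and lists the modules switched on with "INCLUDE". The function name inspect_doctype and the empty <TEI/> root element are assumptions made for the example. Note that this parser does not fetch the external DTD and does not validate; validation requires a validating parser, such as the one built into an XML editor.

```python
import xml.parsers.expat

# The DOCTYPE from the slide, followed by an empty root element so that
# the text is a complete (if trivial) document.
DOCUMENT = """<!DOCTYPE TEI PUBLIC "-//TEI P5//DTD Main Document Type//EN"
  "http://www.tei-c.org/release/xml/tei/schema/dtd/tei.dtd" [
  <!ENTITY % TEI.header "INCLUDE">
  <!ENTITY % TEI.core "INCLUDE">
  <!ENTITY % TEI.textstructure "INCLUDE">
  <!ENTITY % TEI.transcr "INCLUDE">
  <!ENTITY % TEI.linking "INCLUDE">
  <!ENTITY % TEI.namesdates "INCLUDE">
]>
<TEI/>"""


def inspect_doctype(xml_text):
    """Collect the root name, identifiers, and included modules from the prologue."""
    info = {"root": None, "public_id": None, "system_id": None, "modules": []}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["root"] = name
        info["system_id"] = system_id
        info["public_id"] = public_id

    def on_entity(name, is_parameter, value, base, system_id, public_id, notation):
        # Parameter entities set to "INCLUDE" switch a TEI module on.
        if is_parameter and value == "INCLUDE":
            info["modules"].append(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.EntityDeclHandler = on_entity
    parser.Parse(xml_text, True)
    return info


doctype_info = inspect_doctype(DOCUMENT)
print(doctype_info["root"], doctype_info["modules"])
```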
Why this rigmarole?
- Print (‘Order of the Book’):
  - Author’s brain > Book > reader’s brain
  - Instrument: typography
- Digital (‘Digital Order’?):
  - Author’s brain > Computer > reader’s brain
  - Instrument: markup
  - For both typography (= form) and content
- So: Need to make text intelligent
Using the computer / UM
- Author’s brain > Computer > reader’s brain
  - Vary output format (paper, PDF, HTML, mobile phone, etc.)
  - Exchange
  - Reuse
  - Search and select
  - Count
  - Change content (order) and form
  - Etcetera
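Several of the operations listed above can be sketched in a few lines of Python once a text carries markup. The sample document and its element names (book, chapter, title, p) are invented for illustration, as are the function names; this is one possible sketch, not the course's own tooling.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical marked-up source text.
SOURCE = """<book>
  <chapter n="1"><title>Print</title>
    <p>Typography carries form.</p><p>The book is fixed.</p></chapter>
  <chapter n="2"><title>Digital</title>
    <p>Markup carries form and content.</p></chapter>
</book>"""

book = ET.fromstring(SOURCE)

# Count: how many paragraphs are there?
paragraph_count = len(book.findall(".//p"))

# Search and select: paragraphs that mention "form".
hits = [p.text for p in book.iter("p") if "form" in p.text]

# Change order: chapter titles in reverse chapter order.
chapters = sorted(book.findall("chapter"), key=lambda c: int(c.get("n")), reverse=True)
reordered_titles = [c.findtext("title") for c in chapters]


def to_html(root):
    """Vary output format: render the same source as simple HTML."""
    parts = []
    for chapter in root.findall("chapter"):
        parts.append("<h1>%s</h1>" % escape(chapter.findtext("title")))
        parts.extend("<p>%s</p>" % escape(p.text) for p in chapter.findall("p"))
    return "\n".join(parts)


def to_plain_text(root):
    """Vary output format: render the same source as plain text."""
    parts = []
    for chapter in root.findall("chapter"):
        parts.append(chapter.findtext("title").upper())
        parts.extend(p.text for p in chapter.findall("p"))
    return "\n".join(parts)


print(paragraph_count, hits, reordered_titles)
print(to_html(book))
print(to_plain_text(book))
```

The source text is never changed; every output is derived from the same marked-up document, which is the point of separating content from form.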
New research questions?
- Chris Anderson (The Long Tail), in Wired: ‘The end of theory’
- But: need for hypothesis remains
- But: humanities data:
  - Quantity: not such a wealth of data. Bitty. Discontinuous.
  - Quality: narrative, evaluative, ambiguous, subjective, conceptual
- Who decides the agenda? Need to lead, rather than follow.