DH101 2013/2014 course 5 - Project on Venice / Datafication / Regulated representations / XML

Posted on 19-Aug-2014


Digital Humanities 101 - 2013/2014 - Course 5

Digital Humanities Laboratory

Frederic Kaplan

frederic.kaplan@epfl.ch

Peer reviewing of blog posts has started this week. Your reviews are expected by the time of next week’s course. You can change your grades until the last moment.


Semester 1 : Content of each course

• (1) 19.09 Introduction to the course / Live Tweeting and Collective note taking

• (2) 25.09 Introduction to Digital Humanities / WordPress / First assignment

• (3) 2.10 Introduction to the Venice Time Machine project / Zotero

• 9.10 No course

• (4) 16.10 Digitization techniques / Deadline first assignment

• (5) 23.10 Datafication / Presentation of projects

• (6) 30.10 Pattern recognition / OCR / Deadline peer-reviewing of first assignment


• (7) 6.11 Semantic modelling / RDF

• (8) 13.11 Historical Geographical Information Systems, Procedural modelling / City Engine / Deadline Project selection

• (9) 20.11 Crowdsourcing / Wikipedia / OpenStreetMap

• (10) 27.11 Cultural heritage interfaces and visualisation / Museographic experiences

• 4.12 Group work on the projects

• 11.12 Oral exam / Presentation of projects / Deadline Project blog

• 18.12 Oral exam / Presentation of projects


Structure of today’s course

• Presentation of the projects for semester 2

• Introduction to datafication and regulated representations (maps + textual documents)

• A short introduction to a possible content encoding tool : XML


Presentation of projects


Context : The Venice Atlas. A book and an interactive site.


Each project could be a section of this atlas (population, politics, timelines, etc.)


What all the projects have in common

• Each project will start with a set of sources. We will suggest some possible sources on dh101.ch, but you can add others.

• These sources should be digitised (course 4).

• The content of these sources will be transformed into a data model (courses 5, 7 and 8), possibly with automatic processes (course 6).

• This data model will be stored in a database (courses 7 and 8), possibly permitting other contributors to extend or improve the data (course 9).

• This data model will be the basis of both a static visual representation (a set of images) and an interactive one (an HTML5, Unity or other site) (courses 8 and 10).


• The project will be conducted by groups of 2-3 students.

• We will present a list of possible projects (but you can also invent your own, provided that it respects the common features and objectives described on the previous slide).


What you have to do

• Form groups and choose or invent a project (deadline 13.11).

  • Use Framapad for this: put your name under the project you are interested in.

• Create an independent blog (NOT dh101.ch) for your project (deadline 11.12, 30 % of your final grade), including:

  • The definition of the project objectives and deliverables (100 words)

  • A methodology section (how you will approach the digitization, modelling and presentation of your data) (750 words)

  • A project plan with milestones

• Present the project orally as a group on 11.12 or 18.12 (7-minute presentation + 3 minutes of questions, 20 % of your final grade).


Timelines (T)


Dominio da Mar (T1)

Timeline of the Dominio da Mar (cities, fortresses, colonies). The objective is to synthesize chronologically the Venetian settlements overseas. You will have to distinguish the places under direct administration from those indirectly supervised by Venice. Territories will appear and disappear over the centuries.


Dominio da Terra Ferma (T2)

Timeline of the Dominio da Terra Ferma. The goal is to show that Venice was also powerful on land and held the key sites for exchanges and money : rivers, cities, roads.


Political structure (T3)

An evolution of the political and administrative structure. The political and administrative structure of Venice is special : it is a complex game of control and counter-control. The objective here is to visualize and understand how this system was built over the years and which events led to the creation of its parts.


Venetian cartography (T4)

History of Venetian mapping from the Middle Ages to the late Republic : from Fra Mauro to Albrizzi. Understand the complex issues involved in mapping and geographical representation at different times. Following the work of prominent Venetian cartographers through notable examples available online, visually highlight the evolution of this craft.


3D and procedural modeling (MP)


A 3D model of the Venetian ships (MP1)

A 3D model of the Venetian ships (galleys, cogs, Bucintoro...). The goal of this project is to reconstruct in 3D some types of ships (including their interiors !), based on the documentation gathered by the DHLAB.


Architectural grammars (MP2)

Automatic extraction of building facades from a picture. The objective of this project is to build a system that extracts the architectural grammar of a building from a single picture and uses the resulting models to recreate unknown buildings with procedural approaches.


The battle of Lepanto (MP3)

A simulation of the battle of Lepanto. The battle of Lepanto is still (with Trafalgar) one of the greatest naval battles in history. It is well documented in texts and paintings. The objective of this project is to enter the core of the battle and to go beyond narration or simple 2D visualizations.


Galley rowing (MP4)

How to row a galley. There were different ways to row. The objective here is to show, in an interactive and didactic manner, the techniques for moving those giants of the seas.


Facades of Venice (MP5)

A complete model of all the facades of Venice. The goal of this project is to create a database of the facades of all the buildings of Venice. The starting point will be existing 3D models, such as the one in Google Earth, from which low-quality pictures could be extracted. The challenge will be to improve these pictures to create higher-resolution models.


Data mining and pattern recognition (D)


Tourist pictures (D1)

A “Google Street View” of Venice. Based on a large number of photos taken by tourists, is it possible to build a kind of “Google Street View” of Venice ? What else can we extract from these pictures ?


Ornaments in print (D2)

Matching techniques. Ornaments in print offer a unique signature for identifying the origin of a printed document. The goal of the project is to extract, from a database of documents, the ornaments present on each page and to design a classifier that attributes a given set of ornaments to a given Venetian printer. The tool could be used to map the diffusion of Venetian prints.


Citations of the archive (D3)

Text mining. The goal of this project is to identify which sections or documents of the Archivio di Stato are most often used by scholars. The project could use text mining techniques on articles or scanned books to create representations of the parts of the archive that are most used.


Maritime Networks (S)


Piracy and corsairs (S1)

A representation of the piracy/corsair areas in the Mediterranean Sea. Pirates and corsairs operate where high-value cargoes are in transit. The project can model one type or the other, or follow some famous characters. The objective is to locate the dangerous areas and the conflicts with the Venetian maritime routes.


High-value cargo networks (S2)

A representation of the networks of high-value cargoes (silk, pepper, spices, sugar, wood, metal, cotton, slaves...). The objective is to model the network for trading pepper, cotton, salt, slaves... from their countries of origin. This project can easily be divided into several subprojects, each focusing on one good.


Pilgrimage (S3)

Pilgrimage from Venice to Jerusalem. Testimonies are a rich and important source of information. The idea here is to extract from a pilgrim’s account the information about the trip on board a Venetian galley and to model the trip.


Competing networks (S4)

A representation of the competing trades at sea (Genoese, Pisans, Catalans, Spaniards...). Everyone has an archenemy. Venice had several for quite some time, and the major one was Genoa. The objective here is to locate the main ports and stopovers and to model their shipping lanes.


Algorithmic models for maritime routes (S5)

Algorithmic models for maritime routes. The objective is to model itineraries automatically when the stopovers are known, and to add collateral data such as winds, currents, known speeds of the ships used, etc.


Route planner (S6)

A Mediterranean route planner. Based on the data available about the Venetian ships, can we build a Mediterranean route planner ? If I am in Corfu in June 1342 and want to get to Constantinople, when can I take a boat ?


Financial networks (F)


Financial networks (F1)

The objective of this project is to model the complexity of the market and the incoming and outgoing flows of money in the Venetian empire.


Printing industry (P)


Venetian prints (P1)

Mapping the Venetian prints in Europe. A quantitative outlook through mining of online catalogues. What was printed and when ? Where is it now ? Query online catalogues for old books printed in Venice (i.e. before 1797) and build a database from the results. Make the database accessible via a geomap, and add a time slider. What can you conclude about the Venetian printing industry in the long run ?


Mapping the printing industry inside Venice (P2)

Mapping the printing industry inside Venice. Take de’ Barbari’s map and make it interactive, with information about the position of the different printing shops, academies and other places of culture.


Coevolution of the city with its environment (E)


Acque Alte (E1)

A representation of the Acque Alte. How can we model the rising water level and the floods in Venice ?


The Plague (E2)

Venice and the plague. Plague epidemics were severe during the Middle Ages, and Venice, as a big city, was hit badly. The idea is to visualize the propagation of the disease through the town as well as the major changes made by the Venetian administration in order to handle the epidemics (quarantine, doctors, lazaretto...).


Life in Venice (L)


Demography (L1)

Representation of the demographic evolution. Venice was one of the most populated cities during the Middle Ages. Some information is available. How did Venice grow ? Where are the major incidents ?


Famous characters (L2)

Following a famous character in Venice. What are the differences between the Venice of Goldoni and the Venice of Byron ? Which buildings could they have visited, where did they meet friends and spend their time ? Can we follow them through the town ?


Venetian cryptographies (L3)

Spies, code-crackers and ciphering. Some of the first code-crackers worked in Venice, such as Giovanni Soro at the beginning of the 16th century, known as the father of modern cryptography. What did ciphers look like at the time in Venice ? How and when were they used ?


Visual representations of power (L4)

Visual representation of power : public ceremony and the enforcement of social hierarchy. Get a scholarly understanding of the socio-political implications of public ceremonies via the literature. Select meaningful paintings (or other sources), and build a visual explanation of (some of) these events. The project could compare events or highlight relations and differences.


A Facebook of the Venetian elite (L5)

A Facebook of the Venetian elite. Based on pictorial and textual sources, recreate a database of the Venetian elite, with images of all the most important characters.


You can invent your own projects


In the last course we learned how to digitize documents. Today we are going to learn how to encode the content of a document in a structured format. Next week we will see how we can automate this kind of encoding through pattern recognition.


The topic of today is the transformation of an image into information : a datafication process.


A special form of encoding is done by palaeographers when they transcribe documents and produce critical editions.


There are many kinds of editions, with different focuses. Some focus only on the textual content (the immaterial part), others also describe aspects of the document itself (the material part). It all depends on the goal and expected uses of the transcription.


In this course, we will take a more general view of this problem by introducing the concept of regulated representation.


Most documents are regulated representations


A regulated representation is a representation governed by a set of production and usage rules.


Examples of regulated representations

• A list of names

• An accounting table

• A family tree

• A map of a region

• A census


Maps as regulated representations

• There are conventional presentation rules to follow when creating a map, such as indicating the scale or the direction of north, and conventional methods to follow in order to create the map contents.

• In terms of usage, one must learn how to read a map. This map-reading skill also involves many related skills for handling the map, orienting oneself in front of it, etc. These skills are either taught or learnt by imitation.


Information is encoded using a given method and decoded using another method.


Using a regulated representation is an act of communication. Regulated representations impose a structure to create a channel. On this channel some information can be transmitted.


Regulated representations change over time.


• They usually tend to become more regular.

• The general process of this regulating tendency is the transformation of conventions into mechanisms.

• The regulation usually proceeds in two consecutive steps :

  • mechanizing the representation production rules

  • mechanizing its conventional usages.

• Ultimately, through this process, regulated representations tend to become machines.

Dated examples (images not included in this transcript) : 1360, 1520, 1550, 1572, 1576, 1765, 1829, 1859, 1910, 2006, 2006.


How maps have become machines

• From tool to machine : by becoming machines, maps have internalized their own usage rules. As machines, they offer many more possibilities than traditional maps. However, these new modes of usage are explicitly programmed.

• As maps became machines, they progressively merged into a global mechanical system in which a multitude of maps is aggregated into a single one. As regulated representations get more regular, they tend to aggregate into unified systems.


• The relation of maps to time changes during the mechanization process. When the map is fully mechanized, the image of a map becomes just a transitory state that can be automatically updated at any moment to reflect the state of the earth more accurately.

• Mechanization changes where the value lies. What is of value in the new associated economy is not the map contents but the traces of usage left by the map readers.


In the remaining part of this course, we are going to consider only textual documents (we will talk about maps and other kinds of objects in another course).


Regulated textual documents are characterized by a specific layout or internal structuring.


Doris Raines’ transcriptions of Venetian testaments


Each family of documents is characterised by a common structure. Each document is characterised by specific textual content.


The printing press revolution is an important step in the history of regulated textual representations.

Principle of the printing press (illustrated steps, images not included in this transcript) :

• Letter punches

• Set of matrices (type foundry)

• Hand mould

• Single Garamond type

• Case of type

• Composing stick

• Form ready for printing


Printing press chronology (beginnings)

• Woodblock printing (China, 7th century CE)

• Cast metal movable type (Korea, 13th century CE)

• Paper (rag paper) starts to be used in Europe, coming from Asia. First paper mills (14th century)

• Block printing in Europe, especially cheap devotional publications (beginning of the 15th century) / block books


Biblia Pauperum, wood blocks, no movable type

Dated examples (images not included in this transcript) : 1350, 1456, 1486, 1493, 1499, 1521, 1564, 1786.


Printing press chronology (end)

• Iron press printing (Stanhope press, 1800)

• Rotary printing press (1818, Napier)

• Electrotyping (1838)

• Type-composing machine (1841)

• Industrial paper made from wood pulp (1870)

• Linotype machine (1886)

• First Xerox inkjet printer (1955)

• First 3D printer (1984)


How to encode the content of a document ?


Textual documents are multidimensional

• the linguistic dimension (text, grammatical rules)

• the semantic dimension (what the words mean)

• the literary dimension (style, rhetorical features)

• the graphemic dimension (the kind of letter forms used to represent sounds)

• the iconic dimension (the ornaments in the document)

• the codicological dimension (the study of the manuscript itself)

• All these dimensions can be studied separately (cf. Elena Pierazzo on Digital Scholarly Editing, http://www.elenapierazzo.org/)


Coding the content of a document depends on the purpose of the study.


A short introduction to XML


What is XML and what is it good for ?



What is XML

• XML stands for eXtensible Markup Language.

• To write in XML you write text with tags : <atag> my text </atag>

• This can be done in any text editor.

• XML is a W3C recommendation.
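
The tag syntax above can be tried out with any XML parser. The following is a minimal sketch in Python using the standard library module xml.etree.ElementTree; the tag name atag comes from the slide, and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# The smallest possible XML document: one element named "atag"
# (the name used on the slide) wrapping some text.
snippet = "<atag> my text </atag>"

element = ET.fromstring(snippet)

tag_name = element.tag           # name of the element
content = element.text.strip()   # text between the opening and closing tags

print(tag_name, "->", content)
```

The parser only recovers the structure (a tag name and its text). What the tag means is left to whoever defines it.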


4 characteristics of XML

• XML is used to describe data, not to display them. XML does nothing. It describes.

• XML tags are not predefined. You can define your own tags. This gives you a lot of freedom to describe the structure you want to describe.

• When you are satisfied with your structure, you can fix your XML language by writing a DTD (Document Type Definition). Thus, XML permits first fluidity and then rigor.

• XML is designed to be self-descriptive and easily readable. It is used to write pivotal descriptions in production chains.


Genealogy of XML

• In the 50s, the first computers could not communicate with one another if they were from different brands.

• In the 60s, IBM creates GML (Generalized Markup Language) to enable data exchanges and make the data structure explicit. This is a great success. It becomes a standard : SGML (Standard Generalized Markup Language). The US federal government adopts it.

• In the 90s, Tim Berners-Lee at CERN creates the HTML language using a subset of SGML. HTML specializes in displaying data but does not impose a standard way of describing data. A group of researchers imagines another language to do this. The first version of XML is ready in 1998.


HTML vs XML

• XML is a markup language like HTML.

• XML is not a replacement for HTML. The two languages have different goals.

• XML is for the transport and the description of structured data.

• XML does nothing. It just describes.

• XML is like a database in plain text.


Structure of an XML file


XML element


XML example


With XML, you can create your own tags.
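
As an illustration of defining one's own tags, here is a small sketch that builds a record for a testament. The tag names (testament, testator, year, place) and all values are invented for this example and are not taken from the course material.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names, invented for this example; XML itself does not
# predefine any of them.
testament = ET.Element("testament", attrib={"id": "t001"})
ET.SubElement(testament, "testator").text = "Marco Example"
ET.SubElement(testament, "year").text = "1450"
ET.SubElement(testament, "place").text = "Venice"

# Serialize the element tree back to XML text.
xml_text = ET.tostring(testament, encoding="unicode")
print(xml_text)
```

Another project could choose entirely different tag names for the same information, which is why agreeing on a shared vocabulary becomes useful later on.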


The header specifies the XML version and the encoding
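
A possible illustration of this header, assuming a UTF-8 encoded file: the first line of the document below is the XML declaration, and the parser uses it to decode the bytes that follow. The element names and values are invented for the example.

```python
import xml.etree.ElementTree as ET

# The first line is the XML declaration: it states the XML version and
# the character encoding used for the rest of the file.
document = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<place><name>Venezia</name><kind>città</kind></place>"
).encode("utf-8")  # bytes, as they would be stored in a file

root = ET.fromstring(document)
name = root.find("name").text
kind = root.find("kind").text   # the accented character is decoded correctly

print(name, kind)
```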


An XML file is like a tree
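
The tree structure can be made visible by walking the document from the root element down to its children. This is a minimal sketch; the document, its element names and the helper function outline are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names and values are invented.
document = """
<atlas>
  <section name="population">
    <map year="1500"/>
    <map year="1600"/>
  </section>
  <section name="politics">
    <timeline/>
  </section>
</atlas>
"""

def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(document)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each element has exactly one parent (except the root), which is what makes the document a tree rather than a general graph.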


Is this a problem ?


DTD (Document Type Definition)

• A well-formed XML document follows the general rules of XML syntax.

• A valid XML document follows the specific rules written in a DTD (Document Type Definition).
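
The difference between well-formed and valid can be sketched as follows. Python's standard parser only checks well-formedness, so the structural rule below is a hand-written stand-in for a DTD, not real DTD validation; a validating parser would be needed for that. The element names and the rule are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """True if the text obeys general XML syntax (one root, tags closed and nested)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def follows_person_rule(text):
    """Hypothetical rule standing in for a DTD: a <person> must contain
    exactly one <name> element. A real DTD would state this as
    <!ELEMENT person (name)> and a validating parser would enforce it."""
    root = ET.fromstring(text)
    return root.tag == "person" and [child.tag for child in root] == ["name"]

good = "<person><name>Marco</name></person>"
well_formed_only = "<person><age>30</age></person>"  # correct syntax, wrong structure
broken = "<person><name>Marco</person>"              # <name> is never closed

for label, text in [("good", good), ("well_formed_only", well_formed_only), ("broken", broken)]:
    print(label, is_well_formed(text))
```

The second document passes the syntax check but breaks the structural rule: it is well-formed without being valid with respect to that rule.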


When to use a DTD

• Using a DTD is not mandatory.

• A DTD makes it possible to agree on a common XML dialect.

• Some software can check whether an XML file is valid with respect to a given DTD.


TEI (Text Encoding Initiative) is a family of special XML dialects for describing the content of documents.
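
To give an idea of what TEI markup looks like, here is a simplified fragment read with Python. persName and placeName are TEI element names and the namespace is the TEI one, but the fragment is not a complete TEI document (a full file would also need a teiHeader) and the sentence is invented.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI documents, bound here to the prefix "tei".
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Simplified fragment for illustration only; the sentence is invented.
fragment = """
<text xmlns="http://www.tei-c.org/ns/1.0">
  <body>
    <p><persName>Marco</persName> sailed from <placeName>Venice</placeName>
       to <placeName>Corfu</placeName>.</p>
  </body>
</text>
"""

root = ET.fromstring(fragment)

# Collect every person and place name marked up in the text.
people = [e.text for e in root.findall(".//tei:persName", TEI_NS)]
places = [e.text for e in root.findall(".//tei:placeName", TEI_NS)]

print(people, places)
```

Because names and places are tagged explicitly, they can be extracted without any guessing about the surrounding prose.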


CSS and XSLT script

• The way an XML file is displayed can be specified in a CSS stylesheet.

• A document can also be transformed using an XSLT script. This is now the recommended method.


From XML


To XML


XML is a pivotal format
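
One way to picture the pivot role: a single XML source is converted into several target formats. The sketch below uses plain Python rather than XSLT, and the source data, tag names and values are invented for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source data; tag names and values are invented.
source = """
<ships>
  <ship><name>San Marco</name><type>galley</type></ship>
  <ship><name>Santa Lucia</name><type>cog</type></ship>
</ships>
"""

root = ET.fromstring(source)
rows = [(s.findtext("name"), s.findtext("type")) for s in root.iter("ship")]

# Target 1: CSV, e.g. for a spreadsheet.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["name", "type"])
writer.writerows(rows)
csv_text = buffer.getvalue()

# Target 2: an HTML list, e.g. for a web page.
items = "".join(f"<li>{escape(n)} ({escape(t)})</li>" for n, t in rows)
html_text = f"<ul>{items}</ul>"

print(csv_text)
print(html_text)
```

The XML source stays unchanged. Each new output format only requires a new conversion step.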


Debate : Is XML the right way to represent the content of documents ? What are its strengths and weaknesses ?


In two weeks we will learn about a complementary technique for encoding information : semantic graphs.
