Using Schematron for appropriate layer validation: A case study



Using Schematron for Appropriate Layer Validation: a Case Study

Alexander (‘Sasha’) Schwarzman, AGU ([email protected])

Balisage 2011: The Markup Conference, Montréal, Canada August 2 – 5, 2011

Appropriate layer validation: advantages

Even the most “Prussian” DTD cannot enforce all business rules, data types, and house style

Rules-based checking needed anyway

May use a “Californian” DTD, such as JATS: de facto industry standard adopted by publishers, conversion and composition vendors, archives, etc.

Can use tools developed for JATS: Preview XSLT stylesheets, EPUB conversion processes, etc.
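As an illustration of the kind of rule a DTD cannot express, here is a minimal Schematron sketch against JATS-style markup. The two rules (a required DOI and a four-digit year) are hypothetical examples chosen for this sketch and are not taken from the AGU Schematrons.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern id="article-meta-checks">
    <!-- Business rule (assumed for illustration): every article carries a DOI. -->
    <sch:rule context="article-meta">
      <sch:assert test="article-id[@pub-id-type='doi']">
        article-meta must contain an article-id with pub-id-type="doi".
      </sch:assert>
    </sch:rule>
    <!-- Data type (assumed for illustration): year is exactly four digits.
         translate() strips the digits; anything left over is a non-digit. -->
    <sch:rule context="pub-date/year">
      <sch:assert test="string-length(.) = 4 and translate(., '0123456789', '') = ''">
        year must consist of exactly four digits.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

A document can be valid against a permissive DTD and still fail both assertions, which is the gap that rules-based checking fills.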

Why Schematron?

Multiple genres (document types)

Journal article

Book chapter

Book

Newspaper article

Different lifecycle phases

Papers in press (journal article)

Initial validation (journal article, book chapter)

Final validation (all genres)
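Schematron has a built-in phase mechanism that can switch groups of patterns on and off per lifecycle stage. The sketch below assumes two hypothetical phases and pattern names; the source does not say whether the AGU schemas use phases or separate top-level files for this purpose.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" defaultPhase="final">
  <!-- Initial validation: structural checks only (hypothetical grouping). -->
  <sch:phase id="initial">
    <sch:active pattern="structure"/>
  </sch:phase>
  <!-- Final validation: structural checks plus final-metadata checks. -->
  <sch:phase id="final">
    <sch:active pattern="structure"/>
    <sch:active pattern="final-metadata"/>
  </sch:phase>

  <sch:pattern id="structure">
    <sch:rule context="article">
      <sch:assert test="front/article-meta">
        article must contain front/article-meta.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

  <sch:pattern id="final-metadata">
    <sch:rule context="article-meta">
      <sch:assert test="volume">
        A volume is required at final validation.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

The validator is told which phase to run, so the same file can be strict at final validation and lenient earlier in the lifecycle.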

Journal article


Book chapter and book

Newspaper article


AGU Schematrons

[Table: 27 modules (rows) against 9 top-level Schematrons (columns); a check mark shows that a top-level Schematron uses a module.]

Top-level Schematrons (columns): FBA, IJA, FJA, SWN, FBK, EOM, EOS, PIP, IBA

Modules (rows): AGUcontribs.sch, bibr-adhoc.sch, bibr-ids.sch, bibr-italics.sch, bibr-structures.sch, book-bookarticle.sch, book-meta.sch, bookarticle-meta-final.sch, bookarticle-meta.sch, common-back.sch, common-final.sch, common-meta.sch, common.sch, dates.sch, eos-only.sch, filetypes.sch, global.sch, index-codes.sch, index-terms.sch, journalarticle-meta-final.sch, journalarticle-meta.sch, journalarticle-tech.sch, mddb-ws.sch, names.sch, print-final.sch, ref-misc.sch, setup.sch

Modules used by all nine top-level Schematrons: common-meta.sch, common.sch, filetypes.sch, global.sch, mddb-ws.sch, names.sch, print-final.sch, ref-misc.sch, setup.sch

Each of the remaining modules is used by only some of the top-level Schematrons.

9 top-level Schematrons, 27 modules

350+ requirements checked

Perform Web Services-based verifications against relational metadata database

Check data quality, markup integrity, business rules, data types, and house style

Provide control over production processes

Work best in oXygen (context-sensitive); can also be compiled and integrated into pipeline scripts
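One plausible way to assemble a top-level Schematron from shared modules is the ISO Schematron include element. The sketch below is an assumption about the mechanism, not a copy of the AGU files: the file name fja.sch and the title are hypothetical, while common.sch, global.sch, and names.sch are module names from the table. It also assumes each module file has a single sch:pattern as its root element, since include substitutes the referenced document's root element in place.

```xml
<!-- fja.sch: hypothetical file name for one top-level Schematron -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:title>Top-level Schematron assembled from shared modules (sketch)</sch:title>
  <!-- Each href is assumed to point at a file whose root is one sch:pattern. -->
  <sch:include href="common.sch"/>
  <sch:include href="global.sch"/>
  <sch:include href="names.sch"/>
</sch:schema>
```

With this layout, a rule shared by several genres lives in one module and is maintained in one place, while each top-level file lists only the modules it needs.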

Paradigm shift: validation focus moves from XML parser to Schematron engine

☛ Content may be valid against the DTD but make no sense

☛ Semantic integrity now depends on Schematron

☛ Should each Schematron release be preserved and the version info added to metadata?

☛ Constraints on business partners: they must be Schematron-capable and have tools

☛ Schematron does not “fix” problems—people do! Processes & procedures must be defined

How to build a good Schematron

Elicit, document, convey, and clarify the requirements

Ensure Schematron fits into your workflow

Modularize Schematron

Ensure that individual Schematron rules aren’t in conflict

Optimize Schematron performance

Employ XSLT 2.0

Test, test, test

Cultivate Schematron & XSLT 2.0 expertise in-house
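To show what employing XSLT 2.0 can look like in practice, the sketch below sets queryBinding="xslt2" so that XPath 2.0 functions such as matches() are available in tests. The rules and the DOI pattern are illustrative assumptions (the regular expression is a common heuristic, not an official DOI grammar) and are not taken from the AGU Schematrons.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:pattern id="data-types">
    <!-- Regular-expression check; matches() requires the xslt2 query binding. -->
    <sch:rule context="article-id[@pub-id-type='doi']">
      <sch:let name="doi" value="normalize-space(.)"/>
      <sch:assert test="matches($doi, '^10\.\d{4,9}/\S+$')">
        "<sch:value-of select="$doi"/>" does not look like a DOI.
      </sch:assert>
    </sch:rule>
    <!-- The same binding gives a compact test for a four-digit year. -->
    <sch:rule context="pub-date/year">
      <sch:assert test="matches(., '^\d{4}$')">
        year must consist of exactly four digits.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Under the default XPath 1.0 binding, the same checks need workarounds built from translate() and string-length(), which are harder to read and to test.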