AGU-Tag-Set-v5.2


Page 1: AGU-Tag-Set-v5.2

Contents
• Why XML is important
• Why new Tag Set?
• Why an AGU Tag Set?
• XML workflows
  – Journal article
  – Book chapter and book
  – Eos article
• What’s new?
  – Production Tag Set v5.2
  – Tag Library v5.2
  – Schematron Validator v2.0 & documentation
  – Proofreader’s checklists
  – PASS v2.0
  – MDDB Loader
  – EASI Search
• Who is affected?
• Change management

Page 2: AGU-Tag-Set-v5.2

XML is important because it…

• drives applications producing AGU pubs (HTML, PDF, print)
• allows integration of all AGU pubs for search/retrieval
• is the source for AGU products & services: ToC, abstract pages, ‘cited by’ & linked refs, sp. sec. lists & CDs, RSS, cross-journal coll’s, CrossRef/A&I metadata deposits, indices, etc.
• can be repurposed for various media and devices
• serves as the copy of record: preserves AGU pubs’ content for posterity in internal and external archives, in a standard, non-proprietary, well-documented format

Page 3: AGU-Tag-Set-v5.2

Why new Tag Set?

• Scope: multiple genres of peer-reviewed pubs
  – Journal (incl. newspaper) article
  – Book chapter
  – Book
• New metadata
  – Funding and sponsors
  – Country codes (more accurate statistics)
  – Mimetype/subtype for components

Page 4: AGU-Tag-Set-v5.2

Why new Tag Set? (cont’d)

• New features
  – Accessibility compliance (Rehabilitation Act section 508)
  – MathML (functional math)
  – “Hooks” for semantic indexing
  – Enhanced reference models; compatibility with CrossRef’s extended content types (books, conf. papers, reports, dissertations, patents, etc.)
  – Maintainability: design informed by JATS (‘NLM DTD’)

Page 5: AGU-Tag-Set-v5.2

Why an AGU Tag Set?

– Why not, e.g., EPUB 3 or NLM 3.0 or DocBook 5.0?
– AGU Tag Set design reflects AGU requirements:
  • all doc. types (books, Eos, Space Weather non-technical)
  • AGU-specific metadata (subsets, sp. sections) & references
  • value-added features (book index, supplemental mat’l)
– A standard DTD is only a set of tags; customization is always needed (even NCBI customizes its own tag set!)
– Nominal conformance to a standard does not guarantee control over data. Specification & validation are needed

Page 6: AGU-Tag-Set-v5.2

XML workflow: journal article

Page 7: AGU-Tag-Set-v5.2

XML workflow: chapter and book

Page 8: AGU-Tag-Set-v5.2

XML workflow: Eos article

Page 9: AGU-Tag-Set-v5.2

What’s new

• Production Tag Set v5.2
• Tag Library v5.2
• Validator v2.0 and documentation
• Proofreader’s checklists
• PASS v2.0
• Loader XSLT v2.0
• Search (MarkLogic)

Page 10: AGU-Tag-Set-v5.2

Tag Set and Tag Library

• Tag Set v5.2 http://www.agu.org/dtd/AGU-Tag-Sets/AGU-Production-Tag-Set-v5.2/

• Tag Library v5.2 http://www.agu.org/dtd/AGU-Tag-Sets/AGU-Tag-Library-v5.2/

Use: XML conversion (vendors, eXtyles?), Web deliverables, database & search (MDDB, MarkLogic/EASI), submission & tracking systems, copy editors (metadata), int. & ext. dev., archives (Portico), …

Documentation & Communication

Page 11: AGU-Tag-Set-v5.2

Schematron Validator v2.0

• Multiple genres (document types)
  – Journal (incl. newspaper) article
  – Book chapter
  – Book
• Different lifecycle phases
  – Papers in press (journal article)
  – Initial validation (journal article, book chapter)
  – Final validation (all genres)
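The genre/phase matrix above can be written down as a small lookup table. The following is a minimal sketch in Python, not the Validator itself (which is built from Schematron); the string keys such as "papers-in-press" and the function names are hypothetical labels chosen for illustration.

```python
# Which genres each lifecycle phase validates, as listed on the slide.
# Keys and genre names are illustrative labels, not identifiers from the Validator.
PHASE_GENRES = {
    "papers-in-press": {"journal-article"},
    "initial": {"journal-article", "book-chapter"},
    "final": {"journal-article", "book-chapter", "book"},
}


def phase_applies(phase, genre):
    """Return True if documents of this genre are validated in this phase."""
    return genre in PHASE_GENRES.get(phase, set())


def phases_for(genre):
    """List the phases, in lifecycle order, that apply to a genre."""
    return [phase for phase, genres in PHASE_GENRES.items() if genre in genres]
```

Under this reading, a book is validated only at the final phase, while a journal article passes through all three.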

Page 12: AGU-Tag-Set-v5.2

Schematron Validator v2.0 (cont’d)

• 9 top-level Schematrons, 27 modules
• 350+ requirements checked
• Works best in oXygen (context-sensitive), can be compiled and integrated (PiP)
• Can perform verifications against MDDB
• Tag Set and Tag Library are public, Schematron and its documentation are restricted

Page 15: AGU-Tag-Set-v5.2

Production Assistant (PASS) v2.0

• Identifies articles to be promoted (pub date - 2)
• Checks integrity: non-textual components present, sup. mat. files at the FTP site, etc.
• Copies all article files to the article directory
• Copies XML files to the MDDB loading directory
• Keeps track of promoted articles
• Currently works only with journal articles (-Eos)
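The selection logic in the first, second, and fifth bullets can be sketched as follows. This is an illustration only, not PASS code: the `Article` record, its field names, and the reading of "pub date - 2" as "two days before the publication date" are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Article:
    # Hypothetical record; field names are illustrative, not taken from PASS.
    article_id: str
    pub_date: date
    declared_components: set = field(default_factory=set)  # files the XML refers to
    files_present: set = field(default_factory=set)        # files actually found


def due_for_promotion(article, today, lead_days=2):
    """True once today reaches pub date minus lead_days (assumed meaning of 'pub date - 2')."""
    return today >= article.pub_date - timedelta(days=lead_days)


def missing_components(article):
    """Integrity check: declared components that were not found."""
    return sorted(article.declared_components - article.files_present)


def select_promotable(articles, today, already_promoted):
    """Return ids of articles that are due, complete, and not promoted before."""
    ready = []
    for art in articles:
        if art.article_id in already_promoted:
            continue  # tracking: never promote the same article twice
        if due_for_promotion(art, today) and not missing_components(art):
            ready.append(art.article_id)
    return ready
```

Copying the selected files to the article directory and the MDDB loading directory would follow this step; it is left out here because the directory layout is not described in the slides.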

Page 16: AGU-Tag-Set-v5.2

MDDB Loader

• Extraction
  – Extracts “heads & tails” (derives bib. and ref. XML)
  – Preserves chunks of original XML (self-citation, runheads, references) used in dynamic web pages
  – Works with all genres: articles, chapters, books
• Insertion
  – Populates MDDB with bib. and ref. metadata
  – Enforces rules: dup. article Nos, chapter w/o book
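The two insertion rules named above (no duplicate article numbers, no chapter without its book) can be illustrated with a small in-memory stand-in. This is a sketch in Python under stated assumptions: the actual Loader is described elsewhere in the deck as XSLT, and `MetadataStore`, `LoadError`, and the method names are hypothetical.

```python
class LoadError(Exception):
    """Raised when a record violates one of the insertion rules."""


class MetadataStore:
    """In-memory stand-in for MDDB, used only to illustrate the two rules."""

    def __init__(self):
        self.article_numbers = set()
        self.book_ids = set()
        self.chapters = {}  # chapter id -> parent book id

    def insert_article(self, article_no):
        # Rule 1: an article number may be loaded only once.
        if article_no in self.article_numbers:
            raise LoadError(f"duplicate article number: {article_no}")
        self.article_numbers.add(article_no)

    def insert_book(self, book_id):
        self.book_ids.add(book_id)

    def insert_chapter(self, chapter_id, book_id):
        # Rule 2: a chapter is rejected unless its book is already loaded.
        if book_id not in self.book_ids:
            raise LoadError(f"chapter {chapter_id} has no book {book_id}")
        self.chapters[chapter_id] = book_id
```

Rejecting the record at insertion time, rather than repairing it afterwards, keeps the store consistent with the XML that was validated upstream.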

Page 17: AGU-Tag-Set-v5.2

EASI search (MarkLogic)

• Transparently combine results from legacy, current, and new XML

• New UI
• Integration of various genres into search options and search results

Page 18: AGU-Tag-Set-v5.2

Who is affected?

• Editorial (Validator, oXygen, checklists)
• Journal prod. (Validator, oXygen, PASS, Loader)
• Book, Eos & SW n/t prod. (Validator, oXygen, Loader, checklists, templates)
• Vendors (XML conv., HTML transf., PDF comp.)
• Secondary Services (MarkLogic: map’g, search)
• EPubs (Tag Set, Validator, oXygen, PASS, Loader)

Page 19: AGU-Tag-Set-v5.2

Who is affected? (cont’d)

Page 20: AGU-Tag-Set-v5.2

Change mgmt and maintenance

• E-Pubs collects change requests from AGU staff and vendors, logs them in TRAC

• E-Pubs develops, tests, and deploys new releases in an orderly manner in coordination with other AGU departments and AGU vendors

• Documentation