Making XML Work (really using it correctly) For DIG-IT! Daniel Dodge West Group October 11, 2000.

Making XML Work (really using it correctly)

For DIG-IT!

Daniel Dodge

West Group

October 11, 2000

Agenda

• Some experiences with XML at West Group

– ranges of implementation

– practical discoveries

• Principles to follow

• Tips for getting XML to work for you

– methods that work

– pitfalls to avoid

Bio (or, why should we listen to this guy?)

• Implemented systems using SGML when it was a four-letter word

• Building architecture through team formation

– chemistry and expertise (get the right people together, and you can do almost anything)

– formalization of method with the team

• Using process definition and control as key forces for requirements

XML at West Group

• Types of projects

• Game plan

• Hard lessons learned (so far)

Types of Projects

• Reengineering current editorial systems

• Interchange among companies

• Publishing acquired content

Reengineering

• Codes (statutes, administrative law, court rules, etc.)

• Cases

• Analytical products

• Required analysis of editorial systems and life-cycle processes

• West Group controls some or all of the content

Interchange

• One-way delivery from source company to a receiving company

• TLR (Thomson Legal & Regulatory) publishes legal information from around the world on WestLaw

• Currently have many formats, requiring multiple conversions

• Will use interchange models to define XML data format instead

Acquired Content

• Example: Public records such as corporate information, aircraft registrations, DMV, etc.

• No editorial intervention, only publishing (ownership of data is outside West Group) (contractual obligations for data integrity)

• Transformation from fixed-fielded or structured data into XML in several stages (media-neutral interchange and media-specific product)
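
The first stage of such a transformation (fixed-fielded data into a media-neutral XML record) might look like the following minimal sketch. The field names, column positions, and sample line are assumptions made up for illustration; they are not an actual West Group layout, and the later media-specific stage is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed-width layout: (element name, start column, end column).
# These names and widths are illustrative assumptions only.
LAYOUT = [
    ("name", 0, 20),
    ("license.number", 20, 30),
    ("state", 30, 32),
]

def fixed_to_xml(line):
    """Turn one fixed-width line into a <record> element, one child per field."""
    record = ET.Element("record")
    for tag, start, end in LAYOUT:
        child = ET.SubElement(record, tag)
        # Fixed-width fields are padded with spaces; strip the padding.
        child.text = line[start:end].strip()
    return record

# Build a sample line that matches the assumed layout.
sample = "Joe Smith".ljust(20) + "A102-234".ljust(10) + "MN"
xml_text = ET.tostring(fixed_to_xml(sample), encoding="unicode")
print(xml_text)
```

Keeping the layout in one table means the column positions are documented in the same place the conversion reads them from.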

Game Plan

• Goal: Reusable data objects

• Requirements: Common data models for common parts, information-type specific models for objects that have business purpose

• Develop an information architecture

• Test by building implementations for all types of projects

Results (so far)

• Burning the candle at both ends (so many levels of data and architecture and process)

• Eventually a middle ground is found to implement using current technology

• Prototypes keep filling in the requirements for the business process until the new process can replace the current one (testing)

Lessons Learned (so far)

• Defining data requirements is the key to a useful data model

• The data owners know more about the data than all the data architects combined (humility is essential)

• With training, many people can write DTDs

• Use interchange DTDs to help processes move data from legacy to XML-based systems

• Prototype early and often

Principles to Follow

• focus on the task (it takes endurance)

• spiral, rapid-prototyping development cycles

• eliminate obstacles by dealing with stakeholder issues

What You’re Building

• Information freeway access to your super storage system (a web server with dynamic content)

• Factory for cranking out topic packets of data to requestors (via HTTP)

So you want to use XML

• Delivery or receipt in XML format makes life easier

• Platform independent

• Language independent (as long as everyone agrees on a dictionary)

• Organization independent (depending on policies and security)

Interchange

• Literally, exchange

• Reason for interchange

• Agreement to deliver/accept data in interchange format

• You get data that you understand

Data Flow Out

[Diagram. Labels: tape, disk; conversion process A, conversion process B; XML data, XML data]

Data Flow In

[Diagram. Labels: XML data; transformation process; local data, HTML data]

Delivering XML Data

• You should offer:

– DTD or Schema

– sample XML instances (valid, of course)

– friendly contact person (who has access to prose documentation, email history of state of delivery system, and so on)

• You might be asked for more, but that should do it for advance preparation time

Receiving XML Data

• What should you require, versus accept

– description of delivery system

– DTD or Schema

– element and attribute descriptions

• But it would be nice to have…

– prose documentation

– versioning

Common Understanding

• Same vocabulary for data components

– character, column, flag, attribute

– unit, item, field, element

– record, document, well-formed instance

• Same vocabulary for processes

– update or replace or obsolete

– analysis, development, maintenance

Examples of Terms

• Current systems:

– Character

– Column

– Flag

• XML data model:

– Attribute (with list of values)

– Empty element (containing only attributes)

Example XML

<record updated="Y"> … </record>

<status new="Y" approved="N"/>
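
As a rough sketch of how legacy flag columns could become attributes on an empty element like the one above: the column positions, the attribute names, and the Y/N value list are assumptions chosen to match the slide's example, not a documented legacy format.

```python
import xml.etree.ElementTree as ET

# Hypothetical legacy convention: one flag character per column.
# Column positions and attribute names are assumptions for illustration.
FLAG_COLUMNS = {"new": 0, "approved": 1}
ALLOWED = {"Y", "N"}  # the attribute's "list of values"

def flags_to_status(flags):
    """Map a string of flag characters onto attributes of an empty <status> element."""
    status = ET.Element("status")
    for attr, col in FLAG_COLUMNS.items():
        value = flags[col]
        if value not in ALLOWED:
            # Reject values outside the agreed list instead of passing them through.
            raise ValueError(f"unexpected flag {value!r} in column {col}")
        status.set(attr, value)
    return status

status_xml = ET.tostring(flags_to_status("YN"), encoding="unicode")
print(status_xml)
```

Checking each flag against the allowed values during conversion surfaces bad legacy data early, before it reaches the receiving side.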

Examples of Terms

• Current systems:

– Unit

– Item

– Field

• XML data model:

– element (with or without attributes)

– wrapper element

Example XML

<name>J. Smith</name>

<record>
  <name>Joe Smith</name>
  <license.number>A102-234</license.number>
</record>

Mapping to Equal Precision

• Rule #1: You can’t change the laws of thermodynamics.

• Rule #2: There is no other rule than Rule #1.

Analyzing the Data

• Map each data component from source to an element or attribute in the XML form

• Create a new name for each unique piece of data

• Use only existing structure or markup
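
One way to keep such an analysis honest is to record the mapping as data and check it mechanically. The sketch below assumes a simple mapping table; the source field names (CUST-NAME and so on) and the helper `check_mapping` are hypothetical.

```python
# Hypothetical analysis record: source component -> (kind, XML name).
# The legacy field names are invented for illustration.
MAPPING = {
    "CUST-NAME": ("element", "name"),
    "LIC-NO": ("element", "license.number"),
    "UPD-FLAG": ("attribute", "updated"),
}

def check_mapping(source_fields, mapping):
    """Return (unmapped source fields, XML names used for more than one field)."""
    unmapped = [f for f in source_fields if f not in mapping]
    names = [name for _kind, name in mapping.values()]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    return unmapped, duplicates

unmapped, duplicates = check_mapping(
    ["CUST-NAME", "LIC-NO", "UPD-FLAG", "DOB"], MAPPING
)
print("unmapped:", unmapped)
print("duplicate names:", duplicates)
```

An empty result on both counts means every source component has a target and every unique piece of data has its own name.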

Interchange or Deliver (a pop quiz)

• Is this name instance:

<name>Joe Smith</name>

• equal to:

<name><first>Joe</first><last>Smith</last></name>

?

It depends...

• Going from here:

<name><first>Joe</first><last>Smith</last></name>

to here:

<name>Joe Smith</name>

is OK.

• You can’t go the other way (without some magic…).
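
A small sketch of why the direction matters: flattening the structured name is mechanical, while splitting a flat name requires a guess. The function name `flatten_name` and the multi-word sample name are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

def flatten_name(name):
    """Collapse <name><first/><last/></name> into <name>First Last</name>."""
    flat = ET.Element("name")
    parts = [child.text.strip() for child in name if child.text]
    flat.text = " ".join(parts)
    return flat

structured = ET.fromstring("<name><first>Joe</first><last>Smith</last></name>")
flat_xml = ET.tostring(flatten_name(structured), encoding="unicode")
print(flat_xml)

# Going the other way: every word boundary is a possible first/last split,
# and nothing in the flat data says which one is right.
words = "Mary Ann Van Buren".split()  # made-up name for illustration
candidate_splits = [
    (" ".join(words[:i]), " ".join(words[i:])) for i in range(1, len(words))
]
for first, last in candidate_splits:
    print(repr(first), "/", repr(last))
```

The flattening step discards information (the boundary between first and last name), so precision can be carried down to a coarser model but not recovered from it.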

More Examples of Terms

• Is the name element for a person name or a company name?

• How are fields delimited?

• What identifies the order of tree-structured data?

Record or Logical Document

• Basic unit of delivery (something that can be tracked by version or current date)

• Exchange those logical documents, not parts of them

• May require building transaction processing into your input cycle for changes to the record

Writing Data Models

• Elements correspond to fields in database

• Attributes describe relationships

• Tree structure might be interpreted view, or present in existing markup

Agreements about Names

• Owner of data should develop the set of names for the data components

• Receiver of data should review the list of names and ask for any required changes.

Model Development

• Have your DTD writers make XML Schemas instead

• Make templates containing interface components (the interfaces to the other side’s data input requirements: XML well-formed data + stylesheet)

• Use templates with completed data analysis records for input to make data model

Follow a Method

• Analyze, then develop data model (not the other way around or you’ll be blind to the mistakes you made)

• Test with larger and larger sets of data

– first one record (document)

– then 1000

– then all 100 million
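
The staged testing above could be driven by a small harness that runs the same check over growing batches. This is only a sketch: the required element names, the synthetic records, and the helper names are assumptions, and a real run would read records from the actual delivery and validate them against the agreed DTD or Schema rather than this simplified check.

```python
import xml.etree.ElementTree as ET

REQUIRED = ("name", "license.number")  # assumed required children, for illustration

def check_record(xml_text):
    """Return a list of problems found in one record (empty list means it passed)."""
    try:
        record = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    present = {child.tag for child in record}
    return [f"missing <{tag}>" for tag in REQUIRED if tag not in present]

def run_stage(records):
    """Check every record in an iterable and return how many failed."""
    return sum(1 for r in records if check_record(r))

def synthetic(n):
    # Stand-in data generator; a real run would stream records from the source.
    for i in range(n):
        yield (
            f"<record><name>Person {i}</name>"
            f"<license.number>A{i:07d}</license.number></record>"
        )

# Same check, larger and larger batches.
for size in (1, 1000):
    failures = run_stage(synthetic(size))
    print(size, "records,", failures, "failures")
```

Because the records are consumed as a stream, the same loop can be pointed at a much larger set without holding it all in memory.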

Practical Politics

• Involve all your stakeholders (find unity through shared requirements)

• Find common names (exchange all your glossaries) and list servers (all sources where new data comes from)

• Build your community of shared data models

Summary

• Projects

• Plan

• Results

Just do it!

The fine print: Just in case there is a potential trademark infringement here, for the purpose of fair broadcast of intent, that the phrase above may not be interpreted as trademark infringement, etc., heretofore, and furthermore (you get the idea after a while, how this goes, and just follow along…)

That’s it...

Thanks to my comrades

• the osde gang

• cray and sgi

• my battle partners (Jolene) (and Francis) and team

• Barb, Kathy, Kathy, Jeff, and Eric and his crew

Connections

• Midwest SGML/XML Forum

– website: www.midwest-sgml.org

– next meeting: October 19 in Arden Hills

• email

– work: daniel.dodge@westgroup.com

– home: infoeng@mm.com

• web page: www.ftinet.com/dodge