Introduction to XML Schema
John Arnett, MSc
Standards Modeller
Information and Statistics Division
NHSScotland
Tel: 0131 551 8073 (x2073)
mailto:[email protected]
isdscotland.org/xml
Contents
• Introduction
• Document Type Definitions - reminder
• W3C Schema
  – Schema Structures
  – Built-In Types
• Summary
• Find Out More
Introduction
• Schema
  – a diagram, plan or framework
  – in XML: a document that describes an XML document
Introduction
• Purpose
  – Data validation
  – Contract
  – System documentation
  – Processing information
Introduction
• Schema Data Validation
  – Element and attribute structure
  – Element ordering
  – Value constraints
    • Built-in data types
    • Size and pattern constraints
    • Enumerations
  – Uniqueness constraints
Introduction
• Schema Languages
  – Document Type Definitions (DTDs)
  – W3C XML Schema
  – OASIS RELAX NG
  – Schematron
Document Type Definitions
• DTD Benefits
    <!ELEMENT Record (FamilyName, GivenName, Sex, DateOfBirth)>
    <!ELEMENT FamilyName (#PCDATA)>
    <!ELEMENT GivenName (#PCDATA)>
    <!ELEMENT Sex (#PCDATA)>
    <!ELEMENT DateOfBirth (#PCDATA)>
    <!ATTLIST Record recordId CDATA #REQUIRED>
  – Easy to understand and implement
  – Lightweight alternative to schemas
Document Type Definitions
• DTD Limitations
  – Use non-XML syntax
  – Only limited support for data typing and namespaces
  – Difficult to extend
W3C Schema
• W3C Recommendation
  – XML Schema Part 0: Primer
    • Introduction (guidance)
  – XML Schema Part 1: Structures
    • defines schema components
  – XML Schema Part 2: Datatypes
    • defines built-in datatypes and their restrictions
W3C Schema Structures
• Most commonly used structures:
  – elements and attributes
  – simpleTypes
  – complexTypes
  – model groups
  – minOccurs and maxOccurs
  – annotation and documentation
  – schema and namespaces
W3C Schema Structures
• element and attribute
  – Basic building blocks of documents
    <element name="Record">
      <complexType>
        <sequence>
          <element name="FamilyName" type="string"/>
          <element name="GivenName" type="string"/>
          <element name="Sex" type="token"/>
          <element name="DateOfBirth" type="date"/>
        </sequence>
        <attribute name="recordId" type="integer"/>
      </complexType>
    </element>
W3C Schema Structures
• element and attribute
  – valid instances of Record element
    <Record recordId="1">
      <FamilyName>Arnett</FamilyName>
      <GivenName>John</GivenName>
      <Sex>M</Sex>
      <DateOfBirth>1963-06-01</DateOfBirth>
    </Record>
    <Record recordId="2">
      <FamilyName>Smith</FamilyName>
      <GivenName/>
      <Sex>FEMALE</Sex>
      <DateOfBirth>1971-04-11</DateOfBirth>
    </Record>
W3C Schema Structures
• element and attribute
  – invalid Record element instance
    <Record recordId="1">Mr
      <Surname>Arnett</Surname>
      <GivenName>John</GivenName>
      <Sex>M</Sex>
      <DateOfBirth>06-Jan-63</DateOfBirth>
    </Record>
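The reasons this instance fails can be spelled out against the Record declaration shown earlier. The annotated copy below is only a sketch: the XML comments are added here for explanation and are not part of the original slide.

```xml
<Record recordId="1">
  Mr
  <!-- Character data directly inside Record: the complexType is not
       declared as mixed, so only child elements are allowed here. -->
  <Surname>Arnett</Surname>
  <!-- Surname is not declared; the sequence expects FamilyName first. -->
  <GivenName>John</GivenName>
  <Sex>M</Sex>
  <DateOfBirth>06-Jan-63</DateOfBirth>
  <!-- Not a valid value for the built-in date type, whose lexical
       form is YYYY-MM-DD. -->
</Record>
```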
W3C Schema Structures
• simpleType Definitions
  – Define element content
  – Character data only - no nested (child) elements permitted
  – No attributes permitted
  – Always derived, directly or indirectly, from a built-in type (typically using restriction)
W3C Schema Structures
• simpleType definition examples
    <simpleType name="TextType">
      <restriction base="string">
        <minLength value="1"/>
        <maxLength value="35"/>
      </restriction>
    </simpleType>

    <simpleType name="GenderType">
      <restriction base="token">
        <enumeration value="M"/>
        <enumeration value="F"/>
        <enumeration value="NK"/>
      </restriction>
    </simpleType>
W3C Schema Structures
• complexType Definitions
  – Define element content
  – Child elements and character data permitted
  – Attributes permitted
W3C Schema Structures
• complexType definition examples
    <complexType name="DemographicsStructure">
      <sequence>
        <element name="FamilyName" type="TextType"/>
        <element name="GivenName" type="TextType"/>
        <element name="Sex" type="GenderType"/>
        <element name="DateOfBirth" type="date"/>
      </sequence>
      <attribute name="recordId" type="integer"/>
    </complexType>

    <element name="Record" type="DemographicsStructure"/>
    <element name="Person" type="DemographicsStructure"/>
    <element name="Client" type="DemographicsStructure"/>
W3C Schema Structures
• Model groups
  – sequence
    • elements must occur in the order specified
  – choice
    • exactly one of several child elements must be selected
  – all
    • each child element occurs 0 or 1 times, in any order
W3C Schema Structures
• Model group examples
    <complexType name="DemographicsStructure">
      <sequence>
        <element name="FamilyName" type="TextType"/>
        <element name="GivenName" type="TextType"/>
        <element name="Sex" type="GenderType"/>
        <choice>
          <element name="DateOfBirth" type="date"/>
          <element name="Age" type="integer"/>
        </choice>
      </sequence>
      <attribute name="recordId" type="integer"/>
    </complexType>
W3C Schema Structures
• Model groups
  – Valid instances of Record element
    <Record recordId="1">
      <FamilyName>Arnett</FamilyName>
      <GivenName>John</GivenName>
      <Sex>M</Sex>
      <DateOfBirth>1963-06-01</DateOfBirth>
    </Record>
    <Record recordId="2">
      <FamilyName>Smith</FamilyName>
      <GivenName>Jane</GivenName>
      <Sex>F</Sex>
      <Age>28</Age>
    </Record>
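The slides illustrate sequence and choice but not the all group. A minimal sketch of all might look like the following; the NameStructure type and Name element are hypothetical names chosen for illustration, and the xsd prefix is declared explicitly so the fragment is a complete schema document.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type: children may appear in any order -->
  <xsd:complexType name="NameStructure">
    <xsd:all>
      <!-- Required: in an all group each child occurs at most once -->
      <xsd:element name="FamilyName" type="xsd:string"/>
      <!-- Optional: minOccurs="0" lets it be left out -->
      <xsd:element name="GivenName" type="xsd:string" minOccurs="0"/>
    </xsd:all>
  </xsd:complexType>
  <xsd:element name="Name" type="NameStructure"/>
</xsd:schema>
```

Under this sketch a Name element with GivenName before FamilyName is valid, as is one with FamilyName alone.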
W3C Schema Structures
• minOccurs and maxOccurs
  – control the occurrence of element instances
    • minOccurs="0"
      – occurrence is optional
    • maxOccurs="unbounded"
      – multiple occurrences allowed
  – both default to 1 when omitted
  – may be applied to any child element, sequence or choice
W3C Schema Structures
• minOccurs and maxOccurs examples
    <complexType name="DemographicsStructure">
      <sequence>
        <element name="FamilyName" type="TextType"/>
        <element name="GivenName" type="TextType" maxOccurs="unbounded"/>
        <element name="Sex" type="GenderType" minOccurs="0"/>
        <choice>
          <element name="DateOfBirth" type="date"/>
          <element name="Age" type="integer"/>
        </choice>
      </sequence>
      <attribute name="recordId" type="integer"/>
    </complexType>
W3C Schema Structures
• minOccurs and maxOccurs
  – Valid instances of Record element
    <Record recordId="1">
      <FamilyName>Arnett</FamilyName>
      <GivenName>John</GivenName>
      <GivenName>Gordon</GivenName>
      <Sex>M</Sex>
      <DateOfBirth>1963-06-01</DateOfBirth>
    </Record>
    <Record recordId="2">
      <FamilyName>Smith</FamilyName>
      <GivenName>Jane</GivenName>
      <Age>28</Age>
      <!-- Optional "Sex" element missing -->
    </Record>
W3C Schema Structures
• Namespaces
  – W3C namespace http://www.w3.org/2001/XMLSchema
    • element, complexType, sequence, etc.
  – targetNamespace
    • Optional
    • User defined
    • One per schema document
W3C Schema Structures
• schema with namespaces
    <xsd:schema id="PersonalRecord"
                targetNamespace="http://www.person.rec"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <!-- Type definitions, etc. with namespace prefixes -->
      <xsd:complexType name="RecordStructure">
        ...
      </xsd:complexType>
      <xsd:simpleType name="TextType">
        ...
      </xsd:simpleType>
      <xsd:simpleType name="GenderType">
        ...
      </xsd:simpleType>
    </xsd:schema>
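Once a targetNamespace is declared, references to the schema's own types need a prefix bound to that namespace as well. The sketch below fills in one possible body for such a schema; the pr prefix, the elementFormDefault setting and the choice of child elements are assumptions made for illustration, not taken from the slides.

```xml
<xsd:schema targetNamespace="http://www.person.rec"
            xmlns:pr="http://www.person.rec"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
  <!-- User-defined type: lives in the target namespace -->
  <xsd:simpleType name="TextType">
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="1"/>
      <xsd:maxLength value="35"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:complexType name="RecordStructure">
    <xsd:sequence>
      <!-- pr: refers to types defined in this schema,
           xsd: refers to the W3C built-in types -->
      <xsd:element name="FamilyName" type="pr:TextType"/>
      <xsd:element name="GivenName" type="pr:TextType"/>
    </xsd:sequence>
    <xsd:attribute name="recordId" type="xsd:integer"/>
  </xsd:complexType>
  <xsd:element name="Record" type="pr:RecordStructure"/>
</xsd:schema>
```

With elementFormDefault="qualified", an instance document would place Record and its children in the target namespace, for example by declaring xmlns="http://www.person.rec" on the Record element.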
W3C Schema Structures
• annotation and documentation
    <xsd:simpleType name="GenderType">
      <xsd:annotation>
        <xsd:documentation>The sex of an individual for administrative purposes.</xsd:documentation>
      </xsd:annotation>
      <xsd:restriction base="xsd:token">
        <xsd:enumeration value="M"/>
        <xsd:enumeration value="F"/>
        <xsd:enumeration value="NK"/>
      </xsd:restriction>
    </xsd:simpleType>
W3C Schema Structures
• annotation and documentation
    <xsd:simpleType name="GenderType">
      <xsd:restriction base="xsd:token">
        <xsd:enumeration value="M"/>
        <xsd:enumeration value="F"/>
        <xsd:enumeration value="NK">
          <xsd:annotation>
            <xsd:documentation>This is used when the sex cannot be determined for physical reasons, e.g. a newborn baby</xsd:documentation>
          </xsd:annotation>
        </xsd:enumeration>
      </xsd:restriction>
    </xsd:simpleType>
Built-in Simple Types
• 44 built-in simple types - most are atomic
• Used directly in schemas or used to create user-defined simple types
Built-in Simple Types
• String-based types
  – string
  – normalizedString
  – token
Built-in Simple Types
• Numeric Types
  – float and double
  – decimal
  – integer
Built-in Simple Types
• Date and Time Types
  – date
  – time
  – dateTime
  – gYear, gMonth, gDay
  – duration
Built-in Simple Types
• Others
  – boolean
  – base64Binary and hexBinary
  – anyURI
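To show several of these built-in types used directly, the sketch below declares a hypothetical Admission element (the element names are invented for illustration). The comments give sample values in each type's lexical form.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element using built-in types without restriction -->
  <xsd:element name="Admission">
    <xsd:complexType>
      <xsd:sequence>
        <!-- token: whitespace is collapsed, e.g. "Ward 7" -->
        <xsd:element name="Ward" type="xsd:token"/>
        <!-- decimal: e.g. 72.5 -->
        <xsd:element name="Weight" type="xsd:decimal"/>
        <!-- date: YYYY-MM-DD, e.g. 2004-03-15 -->
        <xsd:element name="AdmittedOn" type="xsd:date"/>
        <!-- dateTime: e.g. 2004-03-15T09:30:00 -->
        <xsd:element name="AdmittedAt" type="xsd:dateTime"/>
        <!-- duration: e.g. P3D for three days -->
        <xsd:element name="LengthOfStay" type="xsd:duration"/>
        <!-- boolean: true, false, 1 or 0 -->
        <xsd:element name="Emergency" type="xsd:boolean"/>
        <!-- anyURI: e.g. http://www.w3.org/2001/XMLSchema -->
        <xsd:element name="Reference" type="xsd:anyURI"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```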
Built-in Simple Types
• Facets
  – length
  – minLength
  – maxLength
  – minExclusive
  – minInclusive
  – maxExclusive
  – maxInclusive
  – totalDigits
  – fractionDigits
  – whiteSpace
  – pattern
  – enumeration
Built-in Simple Types
• Length facets
    <xsd:simpleType name="TextType">
      <xsd:restriction base="xsd:string">
        <xsd:minLength value="1"/>
        <xsd:maxLength value="35"/>
      </xsd:restriction>
    </xsd:simpleType>

    <xsd:element name="Comment" type="TextType"/>

    <!-- valid -->
    <Comment>This is a valid value</Comment>
    <!-- invalid: empty, so shorter than minLength -->
    <Comment/>
    <!-- invalid: longer than maxLength -->
    <Comment>This is an invalid value because it contains more than 35 characters</Comment>
Built-in Simple Types
• enumeration facet
    <xsd:simpleType name="GenderType">
      <xsd:restriction base="xsd:token">
        <xsd:enumeration value="M"/>
        <xsd:enumeration value="F"/>
        <xsd:enumeration value="NK"/>
      </xsd:restriction>
    </xsd:simpleType>

    <xsd:element name="Sex" type="GenderType"/>

    <!-- valid -->
    <Sex>NK</Sex>
    <!-- invalid: not one of the enumerated values -->
    <Sex>Male</Sex>
Built-in Simple Types
• pattern facet
    <xsd:simpleType name="PostCodeType">
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}"/>
      </xsd:restriction>
    </xsd:simpleType>
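The length, enumeration and pattern facets are shown above; the numeric facets from the list work the same way. The following is a sketch only: AgeType and WeightKgType are hypothetical types, and the limits chosen are arbitrary.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type: whole numbers from 0 to 130 inclusive -->
  <xsd:simpleType name="AgeType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="130"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Hypothetical type: greater than 0, at most 5 digits in total,
       at most 2 of them after the decimal point -->
  <xsd:simpleType name="WeightKgType">
    <xsd:restriction base="xsd:decimal">
      <xsd:minExclusive value="0"/>
      <xsd:totalDigits value="5"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="Age" type="AgeType"/>
  <xsd:element name="Weight" type="WeightKgType"/>
</xsd:schema>
```

Under these assumptions `<Age>28</Age>` and `<Weight>72.5</Weight>` are valid, while `<Age>-1</Age>` and `<Weight>72.555</Weight>` are not.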
Advanced Features
• Multi-document schemas
• Complex type derivation
• Reusable groups
• Element substitution
• Schema redefinition
• Identity constraints
• Schema design
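As one illustration of these features, an identity constraint can express the uniqueness requirement mentioned in the introduction. The sketch below assumes a hypothetical Records wrapper element and requires every recordId within it to be different; the constraint name is likewise invented.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical wrapper element holding many Record elements -->
  <xsd:element name="Records">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Record" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="FamilyName" type="xsd:string"/>
            </xsd:sequence>
            <xsd:attribute name="recordId" type="xsd:integer"
                           use="required"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
    <!-- Identity constraint: no two Record children of the same
         Records element may share a recordId value -->
    <xsd:unique name="uniqueRecordId">
      <xsd:selector xpath="Record"/>
      <xsd:field xpath="@recordId"/>
    </xsd:unique>
  </xsd:element>
</xsd:schema>
```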
Summary
• Used to validate the structure and values of XML instance documents
• Uses XML syntax
• W3C Recommendation specifies data structures and built-in types
• Supports namespaces
• Has many advanced features, incl. several extensibility mechanisms
Find Out More
• XML Schema Part 0: Primer
  – www.w3.org/TR/xmlschema-0/
• XML Schema Part 1: Structures
  – www.w3.org/TR/xmlschema-1/
• XML Schema Part 2: Datatypes
  – www.w3.org/TR/xmlschema-2/