2 XML Basics


  • 8/14/2019 2 XML Basics

    1/50

    XML

    Basics


    Topics

    Basic XML document structure and components
    Well-formed XML
    DTD
    DTD components
    Creating a DTD and linking it with an XML document
    XML parsers
    Valid XML document


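    XML document structure

    [Figure: an example XML document annotated with its parts. The markup was not preserved; the surviving text is "Harry Potter", "Ron", "Reminder" and "Please, get your magic wand." The labeled parts are: root element, element, empty element, attribute, comment, processing instruction, entity reference.]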
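The labeled parts can be seen together in one small document. This is only an illustrative sketch: the element and attribute names (`note`, `to`, `from`, `heading`, `body`, `attachment`, `priority`) and the stylesheet name are assumptions, because the slide's own markup did not survive. Only the text values come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; only the text values are taken from the slide.
DOCUMENT = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="note.css"?>
<!-- Comment: a note passed between two students -->
<note priority="high">
  <to>Harry Potter</to>
  <from>Ron</from>
  <heading>Reminder</heading>
  <body>Please, get your &quot;magic wand&quot;.</body>
  <attachment/>
</note>"""

root = ET.fromstring(DOCUMENT)

print("root element:", root.tag)
print("attributes:", root.attrib)
for child in root:
    # An empty element has neither text nor child elements.
    is_empty = child.text is None and len(child) == 0
    print("empty element" if is_empty else "element", child.tag, repr(child.text))
```

The parser replaces the entity reference `&quot;` with a literal double quote, so the application never sees the reference itself.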


    Special markup characters

    Use &lt; for <

    Use &gt; for >

    Use &amp; for &

    Use &apos; for '

    Use &quot; for "
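As a small sketch of how these replacements are applied in practice, Python's standard `xml.sax.saxutils` module can escape and unescape the special markup characters. The sample string and the `QUOTES` mapping are made up for illustration.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; quotes need an explicit mapping.
QUOTES = {'"': "&quot;", "'": "&apos;"}

raw = "if a < b && name == \"Ron's owl\""
escaped = escape(raw, QUOTES)
print(escaped)

# Reverse the mapping to turn the references back into characters.
restored = unescape(escaped, {ref: char for char, ref in QUOTES.items()})
print(restored == raw)
```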


    Redefining XML

    An XML document is an information unit that can be viewed in two ways:

    as a linear sequence of characters that contains character data, markup or entity references

    or as an abstract data structure that is a tree of nodes.
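The two views can be sketched with Python's standard `xml.etree.ElementTree` module. The `order` document below is a made-up example.

```python
import xml.etree.ElementTree as ET

text = "<order><item>tea</item><item>milk</item></order>"

# View 1: a linear sequence of characters.
print(len(text), "characters, starting with", text[:7])

# View 2: a tree of nodes.
def outline(node, depth=0):
    """Return one indented line per node in the tree."""
    lines = ["  " * depth + node.tag]
    for child in node:
        lines.extend(outline(child, depth + 1))
    return lines

tree = ET.fromstring(text)
print("\n".join(outline(tree)))

# Serializing the tree gives the character sequence back.
round_trip = ET.tostring(tree, encoding="unicode")
```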


    Well-formed Constraints

    All XML elements must be properly nested.

    XML tags are case sensitive. The case of the start tag and its corresponding end tag must match.

    All XML documents must have one and only one root element.

    All the elements (other than the root) must have one and only one parent.


    Well-formed Constraints

    Attribute values must always be quoted.

    Empty-element tags must end with />.

    An XML document that conforms to the above rules is called a well-formed XML document.
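Each constraint can be tried out with any XML parser, since a parser must reject a document that is not well-formed. The sketch below uses Python's `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

SAMPLES = {
    "well-formed": '<note id="1"><to>Ron</to><br/></note>',
    "case mismatch": "<note><To>Ron</to></note>",
    "bad nesting": "<note><b><i>Ron</b></i></note>",
    "two roots": "<note/><note/>",
    "unquoted attribute": "<note id=1/>",
    "unclosed empty tag": "<note><br></note>",
}

results = {label: is_well_formed(xml) for label, xml in SAMPLES.items()}
for label, ok in results.items():
    print(label, "->", "accepted" if ok else "rejected")
```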


    Well-formed XML defined

    Well-formed XML data conforms to the XML syntax specification, and includes no references to external resources (unless a DTD is provided). It consists of elements that form a hierarchical tree, with a single root node (the document element).


    Example: XHTML

    Well-formed document [tags not preserved; only the text survives]:

    XHTML document

    This is a well-formed HTML document
    and valid XHTML document.


    Tool to check well-formed XHTML document

    The HTML Tidy utility by W3C can be used to test for well-formed XHTML.

    It can also be used to convert an HTML document to an XHTML document.

    Open source software.

    Command: tidy -xml -e filename.xhtml


    Activity

    Test the ill-formed XHTML document using

    tidy and note down the errors.


    More about the XML names

    XML names are the names given to elements and attributes.

    All XML names must begin with a letter, _ or :.

    A letter can be from the English alphabet or from any language supported by Unicode.

    The only restriction is that a name cannot begin with XML, xml, or any mix of case of the string xml (such names are reserved).

    Legal             Illegal
    Patient           Xml_Tag
    DOCTOR            -Name
    Doctor:Patient    12Street
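The naming rules can be approximated with a regular expression. This is a deliberately simplified sketch: it accepts ASCII letters only, whereas real XML names also allow letters from many other scripts, and it treats the reserved `xml` prefix as illegal, following the slide. The function name `is_legal_name` is invented.

```python
import re

# Simplified assumption: ASCII only. First character: letter, _ or :
# Later characters may also be digits, periods and hyphens.
NAME_PATTERN = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")

def is_legal_name(name):
    """Check a name against the slide's rules (ASCII approximation)."""
    if name.lower().startswith("xml"):
        return False  # reserved prefix, in any mix of case
    return NAME_PATTERN.fullmatch(name) is not None

for name in ["Patient", "DOCTOR", "Doctor:Patient", "Xml_Tag", "-Name", "12Street"]:
    print(name, "->", "legal" if is_legal_name(name) else "illegal")
```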


    Element

    Basic building block of the XML document

    May have
    Entity references
    Comments
    PIs
    CDATA sections
    Character data
    Attributes
    Character references

    The root element is also called the document element.


    Attributes

    Attributes are used to attach information about the element.

    An attribute is a name-value pair.

    Attribute values can be any text, entity reference or character reference.

    Attribute values cannot contain the special characters < and & directly; use entity references instead.

    Only one instance of an attribute name is allowed per element.


    Character references

    Characters that cannot be typed into a document directly but must be displayed can be represented as character references.

    Example: copyright symbol: &#169;, Scicom Infrastructure Pvt (India) Ltd

    Used for representing a single character.

    It consists of a decimal number between &# and ; or a hexadecimal number between &#x and ;
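A brief sketch of character references in both notations, using Python's `xml.etree.ElementTree`. The `owner` element name is an assumption; 169 (hexadecimal A9) is the code point of the copyright symbol.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal references to the same character (U+00A9).
element = ET.fromstring("<owner>&#169; &#xA9; Scicom</owner>")
print(element.text)

# Building the references from the character's code point.
decimal_ref = "&#%d;" % ord("\u00a9")
hex_ref = "&#x%X;" % ord("\u00a9")
print(decimal_ref, hex_ref)

# Serializing to ASCII turns the character back into a reference.
ascii_bytes = ET.tostring(element)
print(ascii_bytes)
```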


    Entity references

    5 built-in entity references: &lt; &gt; &amp; &apos; &quot;

    Apart from these 5 entities, a number of other entity references can also be defined, for example in a DTD.


    CDATA section

    Character data that you don't want to be parsed can be kept in a CDATA section.

    if(a>b && a
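The slide's own example is truncated, so the sketch below uses a made-up code fragment in the same spirit. It shows that text containing `&&` and `<` is passed through untouched inside a CDATA section, while the same text outside one is rejected. The `script` element name is an assumption.

```python
import xml.etree.ElementTree as ET

CODE = "if (a > b && a < c) { swap(a, b); }"

with_cdata = "<script><![CDATA[" + CODE + "]]></script>"
without_cdata = "<script>" + CODE + "</script>"

# Inside CDATA the parser does not interpret & or <.
cdata_text = ET.fromstring(with_cdata).text
print(cdata_text)

# Outside CDATA the same characters make the document ill-formed.
try:
    ET.fromstring(without_cdata)
    parsed_without_cdata = True
except ET.ParseError:
    parsed_without_cdata = False
print("parsed without CDATA:", parsed_without_cdata)
```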


    Comment and PI

    A comment is given between <!-- and -->.

    A processing instruction is used to pass some hints/files to the application along with the XML document.

    A PI is given between <? and ?>.
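The slide's examples were not preserved, so here is a minimal sketch with invented content: a stylesheet processing instruction and a comment, read back through Python's `xml.dom.minidom`, which keeps both kinds of node in the tree.

```python
from xml.dom import minidom, Node

# Hypothetical document: the PI target and comment text are illustrative.
DOCUMENT = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/css" href="note.css"?>'
    "<!-- reviewed by the editor -->"
    "<note>Reminder</note>"
)

doc = minidom.parseString(DOCUMENT)
found = []
for node in doc.childNodes:
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
        found.append(("pi", node.target, node.data))
    elif node.nodeType == Node.COMMENT_NODE:
        found.append(("comment", node.data))
    elif node.nodeType == Node.ELEMENT_NODE:
        found.append(("element", node.tagName))

for entry in found:
    print(entry)
```

The XML declaration on the first line looks like a processing instruction but is not reported as one.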


    DTD

    Document Type Definition

    The DTD defines the structure of the XML document and how content is nested.

    An XML document is valid only when it is well-formed and conforms to the DTD (or XML schema) defined for it.

    The DTD defines the grammar rules for forming an XML document.


    XML Parser

    XML parsers/processors are programs that check whether the XML document is well-formed (non-validating parser) or valid (validating parser).

    Non-validating parser: ensures that the XML document is well-formed.

    Validating parser: ensures that the XML document is
    well-formed
    valid
    and resolves external resources.
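The difference shows up when a document is well-formed but breaks its own DTD. Python's `xml.etree.ElementTree` is built on the non-validating Expat parser, so in the sketch below (with a made-up `note` document) the DTD violation goes unnoticed, while a well-formedness error is still rejected.

```python
import xml.etree.ElementTree as ET

# The DTD allows only <to> inside <note>, but the document uses <from>.
DOCUMENT = """<!DOCTYPE note [
  <!ELEMENT note (to)>
  <!ELEMENT to (#PCDATA)>
]>
<note><from>Ron</from></note>"""

# A non-validating parser accepts it: the document is well-formed.
root = ET.fromstring(DOCUMENT)
children = [child.tag for child in root]
print("parsed, children:", children)

# Removing the end tag breaks well-formedness, which is always checked.
broken = DOCUMENT.replace("</note>", "")
try:
    ET.fromstring(broken)
    rejected = False
except ET.ParseError:
    rejected = True
print("ill-formed document rejected:", rejected)
```

Catching the DTD violation would need a validating parser, which the Python standard library does not provide.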


    [Diagram with the labels: xml doc, xml parser, xml application, and "xml doc / return valid or invalid document".]


    XML Parser available

    Vendor       Parsers
    Apache       Xerces-C (C++), Xerces-J (Java)
    IBM          IBM4C (C++), IBM4J (Java)
    Microsoft    MSXML, IE
    Oracle       XML Parser for Java, XML Parser for C and C++
    Sun          JAXP and JAXB API


    Ways of using DTD with XML

    Internal
    Including the DTD in the same file as the XML

    External
    Creating another file for the DTD and linking it with the XML file

    If both are provided and they contain similar declarations, the internal DTD takes precedence.


    Linking external DTD with XML

    2 ways to associate a DTD with XML:

    1. SYSTEM is used to explicitly specify the location of the DTD.

    2. PUBLIC is used if the DTD is a standard and is shared by many organizations.


    The public identifier is a name mapped to the actual location of the DTD, from where everybody shares the DTD.

    If the DTD is not accessible or available that way, then, as with SYSTEM, the DTD is obtained from the given location.
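The slide's own DOCTYPE examples were lost, so the sketch below shows one declaration of each kind and reads the identifiers back with Python's `xml.dom.minidom`. The `note.dtd` file name is hypothetical; the PUBLIC example uses the identifiers of the XHTML 1.0 Strict DTD. The helper name `doctype_ids` is invented.

```python
from xml.dom import minidom

# SYSTEM: only a location (here a hypothetical local file name).
SYSTEM_DOC = '<!DOCTYPE note SYSTEM "note.dtd"><note/>'

# PUBLIC: a shared public identifier, followed by a fallback location.
PUBLIC_DOC = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html/>'
)

def doctype_ids(text):
    """Return (root name, public identifier, system identifier)."""
    doctype = minidom.parseString(text).doctype
    return doctype.name, doctype.publicId, doctype.systemId

system_ids = doctype_ids(SYSTEM_DOC)
public_ids = doctype_ids(PUBLIC_DOC)
print(system_ids)
print(public_ids)
```

The parser only records the identifiers here; it does not download the external DTD.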


    Basic DTD declarations


    ELEMENT Declaration

    1. Text only: specifies that this element can contain only text content (#PCDATA).

    Example: [markup not preserved; only the text of each case survives]

    VALID XML: ravi
    INVALID XML: ravi


    ELEMENT Declaration

    2. Element only: specifies that this element can contain only the child elements listed in the declaration.

    Example: [markup not preserved; only the text of each case survives]

    VALID XML: ravi
    INVALID XML: ravinath


    Order of elements

    Sequence list: a , separated list. The child elements must appear in the specified order.

    Example: [markup not preserved; only the text of each case survives]

    VALID XML: ravinath
    INVALID XML: ravi nath


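    Order of elements

    Choice list: a | separated list. Any one of the listed child elements can appear; when the list is repeated with * or +, the children can appear in any order.

    Example: [markup not preserved; only the text of each case survives]

    VALID XML: ravi nath
    INVALID XML: ravi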


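    Element Declaration

    3. Mixed content: specifies that this element can contain a mixture of elements and text, declared as #PCDATA followed by the allowed child elements, separated by | (or).

    Example: [markup not preserved; only the text survives]

    Valid XML:
    Mr. Mohan Lal
    172 Veera Apts., MG Road
    Cochin
    Kerala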


    Element Declaration

    4. Anything: specifies that this element can contain any well-formed XML data (ANY).

    Example: [markup not preserved; only the text survives]

    Valid XML:
    Mr. Mohan Lal
    172 Veera Apts., MG Road
    Cochin
    Kerala
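The four kinds of element declaration can be told apart by asking a parser how it read each declaration. The sketch below uses the `ElementDeclHandler` callback of Python's `xml.parsers.expat`. The DTD and its element names (`customer`, `name`, `address`, `street`, `city`, `state`, `notes`) are assumptions; only the address text comes from the slides.

```python
import xml.parsers.expat as expat

# Hypothetical DTD covering the declaration kinds described above.
DTD_DOCUMENT = """<!DOCTYPE customer [
  <!ELEMENT customer (name, address)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT address (#PCDATA | street | city | state)*>
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT state (#PCDATA)>
  <!ELEMENT notes ANY>
]>
<customer><name>Mr. Mohan Lal</name><address>172 Veera Apts., MG Road
<city>Cochin</city> <state>Kerala</state></address></customer>"""

def classify(model):
    """Map an Expat content model tuple to the slide's terminology."""
    content_type, _quantifier, _name, children = model
    if content_type == expat.model.XML_CTYPE_ANY:
        return "anything"
    if content_type == expat.model.XML_CTYPE_EMPTY:
        return "empty"
    if content_type == expat.model.XML_CTYPE_MIXED:
        # (#PCDATA) alone is reported as mixed content with no children.
        return "mixed content" if children else "text only"
    return "element only"

kinds = {}

def on_element_decl(name, model):
    kinds[name] = classify(model)

parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DTD_DOCUMENT, True)

for name, kind in kinds.items():
    print(name, "->", kind)
```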


    Element Declaration

    Cardinality

    none    one and only one (the absence of a cardinality operator)
    *       0 or more
    +       1 or more
    ?       0 or 1
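Sequence lists, choice lists and the cardinality operators behave much like regular-expression operators, which suggests a compact way to experiment with them. The sketch below is a teaching aid, not how a real validating parser is implemented: it translates an element-only content model into a regular expression over the child tag names. The helper names and the `name`, `first`, `middle`, `last` elements are invented.

```python
import re
import xml.etree.ElementTree as ET

def model_to_regex(model):
    """Turn a model such as (first,middle?,last+) into a compiled regex."""
    compact = re.sub(r"\s+", "", model)
    # Each child name becomes a group matching "name " (with a trailing space).
    body = re.sub(
        r"[A-Za-z_][A-Za-z0-9_.-]*",
        lambda match: "(?:" + re.escape(match.group()) + " )",
        compact,
    )
    # A comma means "followed by", which is plain concatenation in a regex.
    # The operators | * + ? and the parentheses keep their meaning.
    return re.compile(body.replace(",", ""))

def children_match(model, xml_text):
    """Check the child elements of the root against the content model."""
    element = ET.fromstring(xml_text)
    sequence = "".join(child.tag + " " for child in element)
    return model_to_regex(model).fullmatch(sequence) is not None

IN_ORDER = "<name><first>ravi</first><last>nath</last></name>"
REVERSED = "<name><last>nath</last><first>ravi</first></name>"
ONLY_FIRST = "<name><first>ravi</first></name>"

print(children_match("(first,last)", IN_ORDER))      # sequence, right order
print(children_match("(first,last)", REVERSED))      # sequence, wrong order
print(children_match("(first|last)", ONLY_FIRST))    # choice, one child
print(children_match("(first|last)*", REVERSED))     # repeated choice
print(children_match("(first,middle?,last+)", IN_ORDER))
```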


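    Example: Embedded DTD with XML

    [Example not preserved: an XML document with its DTD embedded in the DOCTYPE declaration (ending in ]>). The surviving text is "Mocha Java" and "11.95".]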


    Validating using XMLSpy

    We will use XML Spy 2.5 trial version for

    DTD-Validation.

    XML Spy is Microsoft XML Parser.


    Activity

    Open a valid XML file with XMLSpy.

    Try adding an invalid element.


    Attribute Declaration

    <!ATTLIST elementName attrName type attDefault value>

    elementName and attrName are compulsory.

    type specifies the type of value the attribute can hold.

    attDefault specifies whether an attribute's presence is required or not. It also says how the parser must handle the attribute's absence.

    value specifies the default value.


    Attribute Types

    CDATA: text data (string)

    ID: a valid XML name that is a unique identifier for each instance of the current element

    IDREF: a reference to a value of the ID type

    IDREFS: a list of IDREFs separated by white space

    NMTOKEN: text data whose characters are limited to letters, digits, underscores, colons, periods and dashes

    NMTOKENS: a list of NMTOKEN items separated by white space


    Attribute Types

    ENTITY: name of an unparsed entity declared in the DTD

    ENTITIES: a list of ENTITY names separated by white space characters

    NOTATION: used to map a reference to a notation-type declaration that is declared elsewhere in the DTD

    Enumerated value/attribute choice list: a list of values that the attribute value can have


    Example1

    title CDATA #REQUIRED
    ISBN ID #REQUIRED
    Similar IDREF #IMPLIED
    libno NMTOKEN #IMPLIED
    authors NMTOKENS #REQUIRED
    hardbound (YES|NO) "NO"
    source CDATA #FIXED "BOOK"
    > ]>


    Powerful lessons in personal change

    Turning mistakes into stepping stones for success
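One visible effect of these attribute declarations is that default and #FIXED values are supplied by the parser when the document omits them. The sketch below rebuilds the example around the surviving attribute list. The `library` and `book` element names and all attribute values in the instance are assumptions; the attribute declarations and the two text lines come from the slides.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat as expat

DOCUMENT = """<!DOCTYPE library [
  <!ELEMENT library (book*)>
  <!ELEMENT book (#PCDATA)>
  <!ATTLIST book
    title     CDATA    #REQUIRED
    ISBN      ID       #REQUIRED
    Similar   IDREF    #IMPLIED
    libno     NMTOKEN  #IMPLIED
    authors   NMTOKENS #REQUIRED
    hardbound (YES|NO) "NO"
    source    CDATA    #FIXED "BOOK"
  >
]>
<library>
  <book title="Book One" ISBN="b1" authors="A.Author B.Writer">Powerful lessons in personal change</book>
  <book title="Book Two" ISBN="b2" Similar="b1" authors="C.Scribe" hardbound="YES">Turning mistakes into stepping stones for success</book>
</library>"""

# Defaults declared in the internal DTD appear on the parsed elements.
books = ET.fromstring(DOCUMENT).findall("book")
for book in books:
    print(book.get("ISBN"), book.get("hardbound"), book.get("source"), book.get("libno"))

# The declarations themselves can be listed with an Expat callback.
declared = []

def on_attlist(element, attribute, attr_type, default, required):
    declared.append((element, attribute, attr_type))

parser = expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.Parse(DOCUMENT, True)
print([attribute for _element, attribute, _type in declared])
```

An #IMPLIED attribute that is left out, such as `libno`, simply has no value.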


    Notation

    Notation is used to map a reference to a notation-type declaration that is declared elsewhere in the DTD.

    The notation matters mostly to the application outside the parser.

    Example:

    http://mysite.com/GIF_viewer.exe>


    Entities

    Entities are replaceable content that reduces the need to retype or reassign the same content.

    All entities except the predefined entities need to be declared.

    Entities can be classified in two ways:
    Depending on where the entity is used
    Depending on how the entity is parsed


    Classification 1

    Two types of entities:

    Parameter entity: the entity reference is used within the DTD.

    General entity: the entity reference is used within the XML document.

    It is an error to put a parameter entity reference in the XML document. But it is not an error to put a general entity reference in the DTD when defining the value of another entity. The reference will not be resolved until it is used in the document.


    Entity declaration

    In DTD:

    <!ENTITY % ParEntity "...">
    <!ENTITY GenEntity "...">
    %ParEntity;

    In XML:

    &GenEntity;


    Another classification

    Parsed entities: well-formed content that is parsed

    Unparsed entities: non-XML data

    Unparsed entities depend on the notation declaration to identify them, so that the application processing the XML document knows what kind of entity is being used and what to do with it.


    Example: Parsed Entity

    In DTD:

    <!ENTITY rights "... reserved">
    <!ENTITY book "... &rights;">

    In XML:

    &book;
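A runnable version of this pattern is sketched below. The entity names `rights` and `book` follow the slide, but the replacement texts and the `notice` element are assumptions, because the original values were only partly preserved. It shows one general entity referring to another, with both expanded when `&book;` is used in the document.

```python
import xml.etree.ElementTree as ET

# Replacement texts are illustrative; the originals were not preserved.
DOCUMENT = """<!DOCTYPE notice [
  <!ENTITY rights "All rights reserved">
  <!ENTITY book "XML Basics. &rights;">
]>
<notice>&book;</notice>"""

root = ET.fromstring(DOCUMENT)
print(root.text)

# An entity that was never declared is an error.
try:
    ET.fromstring("<notice>&missing;</notice>")
    undeclared_accepted = True
except ET.ParseError:
    undeclared_accepted = False
print("undeclared entity accepted:", undeclared_accepted)
```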


    Example: Unparsed Entity

    In DTD:

    In XML:
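The examples for this slide were lost, so the following is a sketch under assumptions: the notation name `gif`, the entity name `logo`, the file `logo.gif` and the `catalog` and `image` elements are invented, while the viewer URL is the one shown on the Notation slide. It uses the `NotationDeclHandler` and `UnparsedEntityDeclHandler` callbacks of Python's `xml.parsers.expat` to show what the parser hands to the application.

```python
import xml.parsers.expat as expat

DOCUMENT = """<!DOCTYPE catalog [
  <!ELEMENT catalog (image)>
  <!ELEMENT image EMPTY>
  <!ATTLIST image source ENTITY #REQUIRED>
  <!NOTATION gif SYSTEM "http://mysite.com/GIF_viewer.exe">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<catalog><image source="logo"/></catalog>"""

notations = {}
unparsed = {}

def on_notation(name, base, system_id, public_id):
    # The notation tells the application which helper handles the data.
    notations[name] = system_id

def on_unparsed_entity(name, base, system_id, public_id, notation_name):
    # The parser does not read logo.gif; it only reports where it is.
    unparsed[name] = (system_id, notation_name)

parser = expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.UnparsedEntityDeclHandler = on_unparsed_entity
parser.Parse(DOCUMENT, True)

print(notations)
print(unparsed)
```

The unparsed entity is referenced by name in an attribute of type ENTITY, never with `&logo;` in the content.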