Advanced Web-based Programming (Web-based Programming Lanjut), Session 9
Course: M0492 / Web-based Programming Lanjut    Year: 2007
Bina Nusantara

Extensible Markup Language (XML)
• XML
• XML vs HTML
• Tags and elements
• Schemas and DTDs (Document Type Definitions)
• Namespaces
• Document Object Model (DOM)

eXtensible Markup Language (XML)
• Marking up a document is the process of identifying certain areas of a document as having a special meaning.
• A markup language is just a set of rules that define how we add meaning to areas of a document.

XML vs HTML
• XML is designed to describe the structure of text, not how it should be displayed.
• HTML is designed to describe how the text should be displayed.

    <BODY>
    Here we have some text
    <H1>This is a heading</H1>
    This bit is normal text
    <B>This is some bold text</B>
    And finally some more normal text
    </BODY>

[Figure: panels labelled "as HTML" and "as XML"]

XML vs HTML
• In XML, the tags can be anything you like; it is only how we use them that gives them meaning.
• XML is fairly readable, so using tag names that describe the contents is common sense.

    <Authors>
      <Author>
        <au_id>172-32-1176</au_id>
        <au_lname>White</au_lname>
        <au_fname>Johnson</au_fname>
      </Author>
      <Author>
        <au_id>213-46-8915</au_id>
        <au_lname>Green</au_lname>
        <au_fname>Marjorie</au_fname>
      </Author>
    </Authors>

Tags and Elements
• If XML is used to describe data, some fields might contain no data. In this case the tags would be empty. Empty tags can be written in two ways:
  – With a start and an end tag:
        <TagName></TagName>
  – With just an opening tag that has a slash at the end:
        <TagName/>
• Tags in XML are case sensitive, so the opening and closing tags must match in case. For example, these tags are invalid XML:
        <TagName></tagname>
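
Both rules on this slide can be checked with a parser. The following is a minimal sketch using Python's standard xml.dom.minidom module (not the MSXML parser used elsewhere in these slides); the helper name is_well_formed is made up for this illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, otherwise False."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


# Both empty-tag forms produce an element without any child nodes.
long_form = minidom.parseString("<TagName></TagName>").documentElement
short_form = minidom.parseString("<TagName/>").documentElement
print(long_form.hasChildNodes(), short_form.hasChildNodes())

# Opening and closing tags must match in case.
print(is_well_formed("<TagName></TagName>"))
print(is_well_formed("<TagName></tagname>"))
```

The parser treats the two empty-tag spellings identically and rejects the pair whose case differs.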

Tags and Elements
• Root Tags
  The root tag is the outermost tag, and an XML document can have only one root.

Valid tags:

    <Authors>
      <Author>
        <au_id>172-32-1176</au_id>
        <au_lname>White</au_lname>
        <au_fname>Johnson</au_fname>
      </Author>
      <Author>
        <au_id>213-46-8915</au_id>
        <au_lname>Green</au_lname>
        <au_fname>Marjorie</au_fname>
      </Author>
    </Authors>

Invalid tags:

    <Authors>
      <Author>
        <au_id>172-32-1176</au_id>
        <au_lname>White</au_lname>
        <au_fname>Johnson</au_fname>
      </Author>
    </Authors>
    <Authors>
      <Author>
        <au_id>213-46-8915</au_id>
        <au_lname>Green</au_lname>
        <au_fname>Marjorie</au_fname>
      </Author>
    </Authors>
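
The single-root rule can be seen in practice with a short sketch. It assumes Python's standard xml.dom.minidom parser; the documents are shortened versions of the two listings on this slide, and count_authors is a hypothetical helper.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# One root element wrapping both authors.
ONE_ROOT = (
    "<Authors>"
    "<Author><au_id>172-32-1176</au_id></Author>"
    "<Author><au_id>213-46-8915</au_id></Author>"
    "</Authors>"
)

# Two top-level Authors elements: not a well-formed document.
TWO_ROOTS = (
    "<Authors><Author><au_id>172-32-1176</au_id></Author></Authors>"
    "<Authors><Author><au_id>213-46-8915</au_id></Author></Authors>"
)


def count_authors(xml_text):
    """Return the number of Author elements, or None if parsing fails."""
    try:
        doc = minidom.parseString(xml_text)
    except ExpatError:
        return None
    return len(doc.getElementsByTagName("Author"))


print(count_authors(ONE_ROOT))   # 2
print(count_authors(TWO_ROOTS))  # None
```

The second document fails as soon as the parser meets content after the first root element has closed.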

Tags and Elements
• The <?xml> Tag
  – Not a true XML tag, but a special tag indicating special processing instructions.
  – This should be the first line of each XML document, and can be used to identify version and language information.
  – This tag is also the place where you can define the character encoding (language) used in the XML data.
  – This is important if your data contains characters that aren't part of the standard English ASCII character set.

    <?xml version="1.0"?>
    <?xml version="1.0" encoding="iso-8859-1"?>

Tags and Elements

Language                                    Character Set
Unicode (8 bit)                             UTF-8
Latin 1 (Western Europe, Latin America)     ISO-8859-1
Latin 2 (Central/Eastern Europe)            ISO-8859-2
Latin 3 (SE Europe)                         ISO-8859-3
Latin 4 (Scandinavia/Baltic)                ISO-8859-4
Latin/Cyrillic                              ISO-8859-5
Latin/Arabic                                ISO-8859-6
Latin/Greek                                 ISO-8859-7
Latin/Hebrew                                ISO-8859-8
Latin/Turkish                               ISO-8859-9
Latin/Lappish/Nordic/Eskimo                 ISO-8859-10
Japanese                                    EUC-JP or Shift_JIS
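
To see why the encoding declaration matters, here is a small sketch, again assuming Python's standard xml.dom.minidom parser. The name "Béla" and the helper read_text are invented for the illustration. The same ISO-8859-1 bytes are parsed once with the declaration and once without it.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

NAME = "B\u00e9la"  # hypothetical name containing a non-ASCII character

# The same element, stored as ISO-8859-1 bytes, with and without a declaration.
declared = (
    '<?xml version="1.0" encoding="iso-8859-1"?>'
    "<au_fname>" + NAME + "</au_fname>"
).encode("iso-8859-1")
undeclared = ("<au_fname>" + NAME + "</au_fname>").encode("iso-8859-1")


def read_text(xml_bytes):
    """Return the text of the root element, or None if the bytes do not parse."""
    try:
        doc = minidom.parseString(xml_bytes)
    except ExpatError:
        return None
    return doc.documentElement.firstChild.nodeValue


print(read_text(declared) == NAME)  # True
print(read_text(undeclared))        # None
```

Without a declaration the parser assumes UTF-8, and the single byte that ISO-8859-1 uses for "é" is not valid UTF-8, so parsing fails.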

Tags and Elements
• Attributes
  XML attribute values must be enclosed in quotes.

    <BOOK ISBN="1-861002-61-0">Professional Active Server Pages 3.0</BOOK>

• Special Characters
  A special set of characters cannot be used in normal XML strings.

Character    Must be replaced by
&            &amp;
<            &lt;
>            &gt;
"            &quot;
'            &apos;
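
In practice the replacements in the table are rarely typed by hand. A minimal sketch using Python's standard xml.sax.saxutils helpers follows; the function book_element and the sample title are hypothetical, and only the BOOK/ISBN names come from the slide.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape, quoteattr


def book_element(isbn, title):
    """Build a BOOK element, quoting the attribute and escaping the text."""
    return "<BOOK ISBN=%s>%s</BOOK>" % (quoteattr(isbn), escape(title))


title = "Q&A: is 1 < 2 > 0?"  # hypothetical title containing special characters
xml_text = book_element("1-861002-61-0", title)
print(xml_text)

# Parsing the result gives back the original, unescaped text.
book = minidom.parseString(xml_text).documentElement
round_trip = "".join(child.data for child in book.childNodes)
print(round_trip == title)
```

escape() handles &, < and > by default; quoteattr() additionally wraps the value in quotes and escapes any quote characters that would clash with them.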

Schemas and DTDs
• Schemas and DTDs are two sides of the same coin.
• Both specify which elements are allowed in a document, and can turn a well-formed XML document into a valid XML document.
• A DTD is a text file that defines the structure of an XML document. But a DTD isn't itself XML: it has a completely separate syntax.
• A schema defines the same kind of structure as a DTD, but is itself written in XML.

Schemas and DTDs
• This one is for the authors XML document, as generated from the pubs database.

    <!ELEMENT DOCUMENT (AUTHOR+)>
    <!ELEMENT AUTHOR (au_id, au_lname, au_fname, phone, address, city, state, zip, contract)>
    <!ELEMENT au_id (#PCDATA)>
    <!ELEMENT au_lname (#PCDATA)>
    <!ELEMENT au_fname (#PCDATA)>
    <!ELEMENT phone (#PCDATA)>
    <!ELEMENT address (#PCDATA)>
    <!ELEMENT city (#PCDATA)>
    <!ELEMENT state (#PCDATA)>
    <!ELEMENT zip (#PCDATA)>
    <!ELEMENT contract (#PCDATA)>

The + sign says 'one or more'.
Each AUTHOR element is made up of nine other elements. Each of these sub-elements contains parsed character data (#PCDATA). CDATA is an attribute type and cannot appear in an element declaration.

Schemas and DTDs
• Two real flaws with DTDs:
  – They aren't XML.
  – You cannot specify the data types, such as integer, date, and so on, for each element. #PCDATA simply means that an element contains just character data; it doesn't identify the actual type of the element's contents.

Schemas and DTDs
• If we convert the DTD into a schema it would be something like:

    <Schema ID="AUTHOR">
      <Element name="au_id"/>
      <Element name="au_lname"/>
      <Element name="au_fname"/>
      <Element name="phone"/>
      <Element name="address"/>
      <Element name="city"/>
      <Element name="state"/>
      <Element name="zip"/>
      <Element name="contract"/>
    </Schema>

Schemas and DTDs
With the addition of data types we'd get:

    <Schema ID="AUTHOR">
      <Element name="au_id" type="string"/>
      <Element name="au_lname" type="string"/>
      <Element name="au_fname" type="string"/>
      <Element name="phone" type="string"/>
      <Element name="address" type="string"/>
      <Element name="city" type="string"/>
      <Element name="state" type="string"/>
      <Element name="zip" type="string"/>
      <Element name="contract" type="Boolean"/>
    </Schema>

This schema now details not only the allowable elements, but also their data types.

Namespaces
• One problem with XML is that you can give an element almost any name you want.
  – There's quite a good chance that you'll pick the same name as someone else.
  – Or you may even use the same name to mean different things in different XML documents.

    <contract>Yes</contract>

This is taken from the authors table in pubs, and indicates that the author is a contracted author.

    <contract>F:/contacts/1999.doc</contract>

This contract element identifies the document that contains the contract.

This isn't a problem while these documents stay separate. But if you combine the two documents, you need a namespace to identify which document each contract element belongs to.

Namespaces
• Namespaces are added to an XML document by defining the xmlns (XML Name Space) attribute in the root tag, which requires a Uniform Resource Identifier (URI).
• A URI is simply a name that can uniquely identify the namespace.

    <Authors xmlns:pubs="http://wrox.co.uk/ms/PubsDB"
             xmlns:wrox="http://wrox.co.uk/authors">
      <Author>
        <au_id>172-32-1176</au_id>
        <au_lname>White</au_lname>
        <au_fname>Johnson</au_fname>
        <pubs:contract>Yes</pubs:contract>
        <wrox:contract>F:/contacts/Johnson1999.doc</wrox:contract>
      </Author>
    </Authors>
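
Once the two contract elements carry different namespaces, code can ask for one or the other by URI. The sketch below assumes Python's standard xml.dom.minidom parser and reuses the namespace URIs from the slide; contract_text is a hypothetical helper.

```python
from xml.dom import minidom

PUBS_NS = "http://wrox.co.uk/ms/PubsDB"
WROX_NS = "http://wrox.co.uk/authors"

XML = """<Authors xmlns:pubs="http://wrox.co.uk/ms/PubsDB"
         xmlns:wrox="http://wrox.co.uk/authors">
  <Author>
    <au_id>172-32-1176</au_id>
    <pubs:contract>Yes</pubs:contract>
    <wrox:contract>F:/contacts/Johnson1999.doc</wrox:contract>
  </Author>
</Authors>"""

doc = minidom.parseString(XML)


def contract_text(document, namespace_uri):
    """Return the text of the first contract element in the given namespace."""
    node = document.getElementsByTagNameNS(namespace_uri, "contract")[0]
    return node.firstChild.nodeValue


print(contract_text(doc, PUBS_NS))  # Yes
print(contract_text(doc, WROX_NS))  # F:/contacts/Johnson1999.doc

# The prefix is only shorthand; the URI is what identifies the namespace.
pubs_contract = doc.getElementsByTagNameNS(PUBS_NS, "contract")[0]
print(pubs_contract.prefix, pubs_contract.localName, pubs_contract.namespaceURI)
```

Looking elements up by URI rather than by prefix keeps the code working even if another document binds the same namespace to a different prefix.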

Document Object Model (DOM)
• The Document Object Model is an API for HTML and XML documents that defines the way they can be accessed.
• DOM defines a standard way in which we can access and manipulate the XML structure.

    <Authors>
      <Author>
        <au_id>172-32-1176</au_id>
        <au_lname>White</au_lname>
        <au_fname>Johnson</au_fname>
      </Author>
      <Author>
        <au_id>213-46-8915</au_id>
        <au_lname>Green</au_lname>
        <au_fname>Marjorie</au_fname>
      </Author>
    </Authors>

XML documents are hierarchical by nature: they always have a top-level, or root, element, and then child elements.

Document Object Model (DOM)
The previous document could be represented as:

    Authors
        Author
            au_id
            au_lname
            au_fname
        Author
            au_id
            au_lname
            au_fname

In DOM terms, these elements are also nodes.
A node just represents a generic element in this tree-type structure.

Document Object Model (DOM)
• Base Objects
  To represent this hierarchical nature, the DOM provides a whole set of objects, methods, and properties that allow us to manipulate the document.

Object          Description
Node            A single node in the hierarchy
NodeList        A collection of nodes
NamedNodeMap    A collection of nodes allowing access by name as well as index

Document Object Model (DOM)

Property           Description
childNodes         Returns a NodeList containing the children of the node.
firstChild         Returns the first child of the current node.
lastChild          Returns the last child of the current node.
parentNode         Returns the parent node of the current node.
previousSibling    Returns the previous sibling, i.e., the previous node at the same level in the hierarchy.
nextSibling        Returns the next sibling, i.e., the next node at the same level in the hierarchy.
nodeName           The name of the node.
nodeValue          The value of the node.
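
One detail of nodeName and nodeValue is easy to miss: an element's text lives in a child text node, not in the element itself. The following sketch illustrates this with Python's standard xml.dom.minidom, which exposes the same property names; it is an illustration under that assumption, not the MSXML object model used later in the slides.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<Author><au_id>172-32-1176</au_id><au_lname>White</au_lname></Author>"
)
author = doc.documentElement
au_id = author.firstChild

# An element has a name, but its own nodeValue is empty (None in minidom).
print(au_id.nodeName, au_id.nodeValue)

# The text sits one level down, in a text node named "#text".
text_node = au_id.firstChild
print(text_node.nodeName, text_node.nodeValue)

# Sibling and parent links connect nodes at the same and adjacent levels.
print(au_id.nextSibling.nodeName)   # au_lname
print(author.lastChild.nodeName)    # au_lname
print(au_id.previousSibling)        # None: au_id is the first child
print(au_id.parentNode.nodeName)    # Author
```

This is why the listings in these slides read curNode.firstChild.nodeValue rather than curNode.nodeValue when they want an element's text.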

Document Object Model (DOM)

[Figure: the Authors / Author / au_id, au_lname, au_fname tree, annotated with childNodes, firstChild, lastChild, parentNode, previousSibling and nextSibling links]

So, let's assume we have a node object called nodRoot pointing to Authors:

Code                                                       Points to
nodRoot.childNodes(0)                                      Author
nodRoot.childNodes(0).firstChild                           au_id
nodRoot.childNodes(0).firstChild.nextSibling               au_lname
nodRoot.childNodes(0).firstChild.parentNode                Author
nodRoot.childNodes(0).firstChild.nextSibling.parentNode    Author
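
The navigation table can be reproduced as a runnable sketch. This version assumes Python's standard xml.dom.minidom, so childNodes is indexed with square brackets and the variable is spelled nod_root. The XML is written without whitespace between tags because minidom keeps whitespace-only text nodes, which would otherwise become the first child.

```python
from xml.dom import minidom

XML = (
    "<Authors>"
    "<Author><au_id>172-32-1176</au_id><au_lname>White</au_lname>"
    "<au_fname>Johnson</au_fname></Author>"
    "<Author><au_id>213-46-8915</au_id><au_lname>Green</au_lname>"
    "<au_fname>Marjorie</au_fname></Author>"
    "</Authors>"
)

nod_root = minidom.parseString(XML).documentElement  # points to Authors
first_author = nod_root.childNodes[0]

# Each pair is (expression from the table, name of the node it reaches).
results = [
    ("nod_root.childNodes[0]",
     first_author.nodeName),
    ("nod_root.childNodes[0].firstChild",
     first_author.firstChild.nodeName),
    ("nod_root.childNodes[0].firstChild.nextSibling",
     first_author.firstChild.nextSibling.nodeName),
    ("nod_root.childNodes[0].firstChild.parentNode",
     first_author.firstChild.parentNode.nodeName),
    ("nod_root.childNodes[0].firstChild.nextSibling.parentNode",
     first_author.firstChild.nextSibling.parentNode.nodeName),
]

for code, name in results:
    print("%-58s %s" % (code, name))
```

The printed names match the "Points to" column: Author, au_id, au_lname, Author, Author.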

Document Object Model (DOM)
• Specific DOM Objects

Object                   Description
Document                 The root object for an XML document.
DocumentType             Information about the DTD or schema associated with the XML document. Equivalent to !DOCTYPE in a DTD.
DocumentFragment         A lightweight copy of the Document, useful for temporary storage or document insertions.
Element                  An XML element.
Attribute or Attr        An XML attribute.
Entity                   A parsed or unparsed entity. Equivalent to !ENTITY in a DTD.
EntityReference          An entity reference.
Notation                 A notation. Equivalent to !NOTATION in a DTD.
CharacterData            The base object for text information in a document.
CDATASection             Unparsed character data. Equivalent to a <![CDATA[ ]]> section.
Text                     The text contents of an element or attribute node.
Comment                  An XML comment.
ProcessingInstruction    A processing instruction, as held in the <? ?> section.
Implementation           Application-specific implementation details.

<HTML><HEAD><TITLE>DOMExample.html</TITLE></HEAD>
<BODY>
<SCRIPT LANGUAGE="JavaScript">
  var i;
  var curNode;

  // Load authors.xml synchronously so the tree is complete before it is read.
  var xmlDocument = new ActiveXObject("Microsoft.XMLDOM");
  xmlDocument.async = false;
  xmlDocument.load("authors.xml");
  var element = xmlDocument.documentElement;

  // Show the root node and the names of its children.
  document.writeln("The root node of the document is: <B>" + element.nodeName + "</B>");
  document.writeln("<BR>Its child elements are:");
  for (i = 0; i < element.childNodes.length; i++) {
    curNode = element.childNodes.item(i);
    document.writeln("<LI><B>" + curNode.nodeName + "</B></LI>");
  }

  // Show the first child of the root and the text value of each of its children.
  var currentNode = element.firstChild;
  document.writeln("<BR><BR>The first child of the root node is: <B>" + currentNode.nodeName + "</B>");
  document.writeln("<BR>Its child elements are:");
  for (i = 0; i < currentNode.childNodes.length; i++) {
    curNode = currentNode.childNodes.item(i);
    document.writeln("<LI><B>" + curNode.nodeName + " - value : " +
                     curNode.firstChild.nodeValue + "</B></LI>");
  }

  // Move to the next sibling and list its children in the same way.
  var nextSib = currentNode.nextSibling;
  document.writeln("<BR><BR>The next sibling is: <B>" + nextSib.nodeName + "</B>");
  document.writeln("<BR>Its child elements are:");
  for (i = 0; i < nextSib.childNodes.length; i++) {
    curNode = nextSib.childNodes.item(i);
    document.writeln("<LI><B>" + curNode.nodeName + " - value : " +
                     curNode.firstChild.nodeValue + "</B></LI>");
  }
</SCRIPT></BODY></HTML>

Document Object Model (DOM)
• Traversing the DOM
  – XML is a relatively new area, and many browsers don't support the handling of XML data.
  – IE 5.0 or above is the browser that has good support for it.
  – You have to decide whether you want to send the XML data to the browser, or process the XML in ASP pages and send pure HTML to the browser. This depends upon your target audience.

<HTML><HEAD>
<TITLE>TraverseXML.html</TITLE>
<SCRIPT LANGUAGE="JScript">
  // Names for the DOM nodeType values; the array index is the nodeType number.
  var g_strNodeTypes = new Array('', 'ELEMENT (1)', 'ATTRIBUTE (2)', 'TEXT (3)', 'CDATA SECTION (4)',
      'ENTITY REFERENCE (5)', 'ENTITY (6)', 'PROCESSING INSTRUCTION (7)',
      'COMMENT (8)', 'DOCUMENT (9)', 'DOCUMENT TYPE (10)',
      'DOCUMENT FRAGMENT (11)', 'NOTATION (12)');

  function showChildNodes(nodNode, intLevel)
  {
    var strNodes = '';   // string containing the nodes information
    var intCount = 0;    // count of the nodes
    var intNode = 0;     // current node number
    var intAttr = 0;     // current attribute number
    var nodAttrList;     // node list of the attributes for a node

    // Indent by level; non-breaking spaces are used because HTML collapses plain spaces
    for (var i = 0; i < intLevel; i++)
      strNodes += '&nbsp;&nbsp;';

    // Get the values for this node
    strNodes += '<B>' + nodNode.nodeName + '</B> Type: <B>' +
                g_strNodeTypes[nodNode.nodeType] + '</B> Value: <B>' +
                nodNode.nodeValue + '</B><BR>';

    // Check whether there are some attributes
    nodAttrList = nodNode.attributes;
    if (nodAttrList != null)
    {
      intCount = nodAttrList.length;
      // For each attribute, display the attribute information
      for (intAttr = 0; intAttr < intCount; intAttr++)
        strNodes += '<B>' + nodAttrList.item(intAttr).nodeName + '</B> Type: <B>' +
                    g_strNodeTypes[nodAttrList.item(intAttr).nodeType] + '</B> Value: <B>' +
                    nodAttrList.item(intAttr).nodeValue + '</B><BR>';
    }

    // Check for any child nodes
    intCount = nodNode.childNodes.length;
    // For each child node, display the node, its attributes, and its child node information
    for (intNode = 0; intNode < intCount; intNode++)
      strNodes += showChildNodes(nodNode.childNodes.item(intNode), intLevel + 1);

    return strNodes;
  }

  function start_traverse()
  {
    // The <XML> data island with ID dsoData exposes the parsed authors.xml document
    var domXMLData = dsoData;
    txtData.innerHTML = showChildNodes(domXMLData, 0);
  }
</SCRIPT>
</HEAD>
<BODY onload="start_traverse()">
<H1>Traversing the Nodes in an XML Document</H1><HR>
<XML ID="dsoData" SRC="authors.xml"></XML>
<SPAN ID="txtData"></SPAN>
</BODY>
</HTML>
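
The listing above only runs in Internet Explorer. For comparison, here is a rough counterpart of the same recursive traversal, sketched with Python's standard xml.dom.minidom. The function name show_child_nodes mirrors showChildNodes, the sample document is the BOOK element from the attribute slide, and the output format is an assumption of this sketch.

```python
from xml.dom import minidom

# Names for the nodeType values that minidom produces for ordinary documents.
NODE_TYPES = {
    1: "ELEMENT", 2: "ATTRIBUTE", 3: "TEXT", 4: "CDATA SECTION",
    7: "PROCESSING INSTRUCTION", 8: "COMMENT", 9: "DOCUMENT",
    10: "DOCUMENT TYPE",
}


def describe(node, level):
    """Format one node as 'name Type: T (n) Value: v', indented by level."""
    type_name = NODE_TYPES.get(node.nodeType, "OTHER")
    return "%s%s Type: %s (%d) Value: %r" % (
        "  " * level, node.nodeName, type_name, node.nodeType, node.nodeValue)


def show_child_nodes(node, level=0):
    """Return a list of lines for node, its attributes, and all descendants."""
    lines = [describe(node, level)]

    # Only elements carry an attribute map; other node types report None.
    attrs = getattr(node, "attributes", None)
    if attrs is not None:
        for index in range(attrs.length):
            lines.append(describe(attrs.item(index), level))

    # Recurse into the children, one level deeper.
    for child in node.childNodes:
        lines.extend(show_child_nodes(child, level + 1))
    return lines


doc = minidom.parseString(
    '<BOOK ISBN="1-861002-61-0">Professional Active Server Pages 3.0</BOOK>'
)
lines = show_child_nodes(doc)
print("\n".join(lines))
```

Starting from the document node, as start_traverse() does, the walk reports the document, the BOOK element, its ISBN attribute, and finally the text node holding the title.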