XPath: An Introduction
Stuart Myles
Objectives
• What is XPath?
• An introduction to the XPath 1.0 language
  – XML refresher
  – XPath basics
  – What else can you do with XPath 1.0?
• Where to go for more information
XPath: XML Path Language
• Path notation with slashes
newsItem/rightsInfo/copyrightHolder
recipe/ingredientList/ingredient
• Like UNIX directory paths or URLs
What is XPath?
• Syntax for addressing parts of an XML document
  – Locate elements or attributes
• Performing operations over data
  – XPath contains a library of standard functions
  – Numeric, string, boolean
• A major part of several XML standards
  – XSLT, XQuery, XML Schema, Schematron
XPath Introduction: XML Refresher
• XML documents contain one or more elements, delimited by start and end tags
<foo></foo>
• Elements can be nested to any depth
<foo>
  <bar></bar>
</foo>
XML Attributes and Text Content
• Elements can have attributes
<foo lang="fr">
  <bar id="theOne" lang="en"></bar>
</foo>
• Elements can have text content
<foo lang="fr">
  <bar lang="en">theOne</bar>
</foo>
• Empty elements have no children or text
<foo></foo>
• A shorthand for writing empty elements
<foo />
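The refresher above can be checked with a few lines of Python. This is a minimal sketch using the standard library's xml.etree.ElementTree. The sample document combines the slide's attribute and text examples, and the <baz /> element and the "some text" content are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: attributes, text content, and an empty element.
doc = '<foo lang="fr"><bar id="theOne" lang="en">some text</bar><baz /></foo>'

foo = ET.fromstring(doc)
bar = foo.find("bar")
baz = foo.find("baz")

root_lang = foo.get("lang")      # attribute on the outer element
bar_attrs = dict(bar.attrib)     # all attributes of <bar>
bar_text = bar.text              # text content of <bar>
# An empty element has no child elements and no text.
baz_is_empty = len(baz) == 0 and baz.text is None

print(root_lang, bar_attrs, bar_text, baz_is_empty)
```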
XML Namespaces
• Elements can be defined in different namespaces
• Namespaces look like URLs
• You can use xmlns to declare a default namespace
<newsItem xmlns='http://iptc.org/std/nar/2006-10-01/'>
  <itemMeta>
    <title>Pope Blesses Astronauts</title>
  </itemMeta>
</newsItem>
• newsItem is in the http://iptc.org/std/nar/2006-10-01/ namespace
• itemMeta and title are also in the http://iptc.org/std/nar/2006-10-01/ namespace
• Child elements inherit the default namespace from their parents
XML Namespace Prefixes
• You can use xmlns:prefix to declare a namespace and bind it to a prefix
<nar:newsItem xmlns:nar='http://iptc.org/std/nar/2006-10-01/'>
  <nar:itemMeta>
    <nar:title>Pope Blesses Astronauts</nar:title>
  </nar:itemMeta>
</nar:newsItem>
• newsItem is in the http://iptc.org/std/nar/2006-10-01/ namespace
• itemMeta and title are also in the http://iptc.org/std/nar/2006-10-01/ namespace
• To an XML parser, this document and the previous one are identical
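One way to see that the default-namespace and prefixed documents are equivalent is to parse both and compare the namespace-qualified element names. The sketch below assumes Python's standard xml.etree.ElementTree, which reports each tag as {namespace-URI}localname. The helper name expanded_names is hypothetical.

```python
import xml.etree.ElementTree as ET

# The two documents from the slides: default namespace vs. explicit prefix.
default_ns_doc = (
    "<newsItem xmlns='http://iptc.org/std/nar/2006-10-01/'>"
    "<itemMeta><title>Pope Blesses Astronauts</title></itemMeta>"
    "</newsItem>"
)
prefixed_doc = (
    "<nar:newsItem xmlns:nar='http://iptc.org/std/nar/2006-10-01/'>"
    "<nar:itemMeta><nar:title>Pope Blesses Astronauts</nar:title></nar:itemMeta>"
    "</nar:newsItem>"
)

def expanded_names(xml_text):
    """Return the namespace-qualified tag of every element, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

names_default = expanded_names(default_ns_doc)
names_prefixed = expanded_names(prefixed_doc)

print(names_default)
print(names_default == names_prefixed)
```

The prefix itself never appears in the parsed names; only the namespace URI and the local name do.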
XPath Crash Course
The Basics: Selecting Elements
• The simplest XPath form:
  – one or more tag names, separated by slashes (/)
newsItem/itemMeta/title <- title under newsItem/itemMeta
• Use * instead of a tag name to match anything
newsItem/*/title <- title grandchildren of newsItem
• An empty step (//) searches all levels of the tree
//title <- Every title element in the doc
newsItem//title <- Every title under newsItem
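The selection patterns above can be tried out in Python. This is a rough sketch with the standard library's xml.etree.ElementTree, which implements only a subset of XPath 1.0 and evaluates paths relative to an element, so //title is written as .//title here. The <feed> wrapper, the extra elements, and the helper texts are assumptions added so the slide's expressions have something to match.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only newsItem/itemMeta/title comes from the slides.
doc = """
<feed>
  <title>Feed title</title>
  <newsItem>
    <itemMeta><title>Pope Blesses Astronauts</title></itemMeta>
    <contentMeta><title>Second title</title></contentMeta>
    <contentSet><inlineXML><title>Deep title</title></inlineXML></contentSet>
  </newsItem>
</feed>
"""
feed = ET.fromstring(doc)

def texts(path):
    """Text of every element matched by path, relative to <feed>."""
    return [el.text for el in feed.findall(path)]

child_path = texts("newsItem/itemMeta/title")  # explicit chain of tag names
wildcard = texts("newsItem/*/title")           # any element in the middle step
all_titles = texts(".//title")                 # every title at any depth
under_news = texts("newsItem//title")          # every title below newsItem

print(child_path, wildcard, all_titles, under_news, sep="\n")
```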
XPath: Using Attributes
• Attributes are indicated by @
@rel <- The rel attribute of the current element
• Elements and attributes are joined by /
link/@rel <- The rel attribute of the link element
• Use [] for conditional selections
link[@rel] <- link element with a rel attribute
link[@rel = "parent"] <- link whose rel attribute is "parent"
link[@size < "1000"] <- link whose size is less than 1000 (compared as numbers)
link[not(@href)] <- link element without an href attribute
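The attribute predicates can be sketched in Python as well. The standard library's xml.etree.ElementTree supports [@rel] and [@rel='parent'] directly, but not comparisons such as < or the not() function, and it returns elements rather than attribute nodes. The sketch below therefore emulates those cases in plain Python. The <links> document and its values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: four link elements with varying attributes.
doc = """
<links>
  <link rel="parent" href="a.xml" size="500"/>
  <link rel="child" href="b.xml" size="2000"/>
  <link href="c.xml" size="800"/>
  <link rel="parent" size="1500"/>
</links>
"""
root = ET.fromstring(doc)
links = root.findall("link")

with_rel = root.findall("link[@rel]")           # link[@rel]
parents = root.findall("link[@rel='parent']")   # link[@rel = "parent"]

# Emulated in Python because ElementTree's XPath subset lacks these:
rel_values = [l.get("rel") for l in with_rel]             # link/@rel
small = [l for l in links if float(l.get("size")) < 1000] # link[@size < "1000"]
no_href = [l for l in links if l.get("href") is None]     # link[not(@href)]

print(len(with_rel), len(parents), rel_values, len(small), len(no_href))
```

A full XPath 1.0 engine would evaluate all five expressions directly; the Python filters only mirror their meaning.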
XPath and Namespaces
• XPath supports namespaces
nitf:p <- The p element from the nitf namespace
xhtml:p <- The p element from the xhtml namespace
nar:* <- Any element from the nar namespace
@atom:* <- Any attribute from the atom namespace
Protip: if you can’t figure out why your XPath expression isn’t matching, check the namespace
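The protip is easy to reproduce. In this sketch, again assuming Python's xml.etree.ElementTree, prefixes used in a path are bound to namespace URIs through a dictionary passed to findall. The prefixes n and h are deliberately different from those in the document to show that only the URI matters. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two namespaces.
doc = """
<nar:newsItem xmlns:nar="http://iptc.org/std/nar/2006-10-01/"
              xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <nar:itemMeta><nar:title>Pope Blesses Astronauts</nar:title></nar:itemMeta>
  <xhtml:p>Body paragraph</xhtml:p>
</nar:newsItem>
"""
root = ET.fromstring(doc)

# Prefix-to-URI bindings for the query; the prefix names are our own choice.
ns = {
    "n": "http://iptc.org/std/nar/2006-10-01/",
    "h": "http://www.w3.org/1999/xhtml",
}

paragraphs = [p.text for p in root.findall("h:p", ns)]
titles = [t.text for t in root.findall(".//n:title", ns)]
# Without a namespace the same local name matches nothing.
unqualified = root.findall(".//title")

print(paragraphs, titles, unqualified)
```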
What Else Can XPath Do?
Numeric, String, Boolean Functions
Publication/FilingMetadata[1]
Publication/FilingMetadata[last()]
Publication/FilingMetadata[last() - 1]
FilingMetadata[position() mod 2 = 0]
FilingMetadata[Category = "q" or Category = "j"]
not(contains(SlugLine, "advisory"))
starts-with(FilingOnlineCode, "1")
And XPath 2.0 adds even more functions, including regular expressions
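To make the expressions above concrete, here is a hedged Python sketch. The standard library's xml.etree.ElementTree handles the positional predicates [1], [last()] and [last()-1], but not mod, or, contains() or starts-with(), so those are mirrored with ordinary Python. The element names follow the slide, while the values are made up. Paths are relative to the Publication element.

```python
import xml.etree.ElementTree as ET

# Hypothetical data using the element names from the slide.
doc = """
<Publication>
  <FilingMetadata><Category>q</Category><SlugLine>sports advisory</SlugLine><FilingOnlineCode>1110</FilingOnlineCode></FilingMetadata>
  <FilingMetadata><Category>a</Category><SlugLine>election results</SlugLine><FilingOnlineCode>2100</FilingOnlineCode></FilingMetadata>
  <FilingMetadata><Category>j</Category><SlugLine>weather update</SlugLine><FilingOnlineCode>1300</FilingOnlineCode></FilingMetadata>
</Publication>
"""
pub = ET.fromstring(doc)
filings = pub.findall("FilingMetadata")

def cats(elems):
    """Category text of each FilingMetadata element."""
    return [e.findtext("Category") for e in elems]

# Positional predicates supported by ElementTree (positions are 1-based).
first = cats(pub.findall("FilingMetadata[1]"))
last = cats(pub.findall("FilingMetadata[last()]"))
second_last = cats(pub.findall("FilingMetadata[last()-1]"))

# Python equivalents of predicates outside ElementTree's XPath subset.
even = cats(f for i, f in enumerate(filings, start=1) if i % 2 == 0)
q_or_j = cats(f for f in filings if f.findtext("Category") in ("q", "j"))
no_advisory = cats(f for f in filings if "advisory" not in f.findtext("SlugLine"))
code_1 = cats(f for f in filings if f.findtext("FilingOnlineCode").startswith("1"))

print(first, last, second_last, even, q_or_j, no_advisory, code_1)
```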
More XPath Information
• List of examples: http://msdn.microsoft.com/en-us/library/ms256086.aspx
• Introductory, interactive tutorial: http://www.zvon.org/comp/r/tut-XPath_1.html
• More advanced tutorial: http://www.ibm.com/developerworks/xml/tutorials/x-xpath/section2.html
• XPath chapter from XML in a Nutshell: http://oreilly.com/catalog/xmlnut/chapter/ch09.html