As a developer you’re probably aware that your web browser natively implements XSL transforms (e.g. when you browse to an XML file with an embedded stylesheet directive). But perhaps you didn’t know that your browser supports XPath 1.0. In this post we’ll explore this XPath capability in a small XHTML page that renders like this:
The form allows you to enter text for an XPath expression, an optional set of XML prefix/namespace pairs, and an XML document in successive textareas. The final textarea displays the results of evaluating the specified XPath expression against the specified XML when you click the “Evaluate” button.
Here’s how it’s done:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>XPath in the browser</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style type="text/css" media="all">
/*<![CDATA[*/
body {
padding: 0 10px;
background: #ccc;
font-family: verdana,helvetica,sans-serif;
}
label {
display: block;
float: left;
clear: both;
width: 150px;
text-align: right;
padding: 0 10px;
font-size: 150%;
}
textarea {
float: left;
overflow: auto;
}
/*]]>*/
</style>
<script type="text/javascript">
//<![CDATA[
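// turn namespace text into a prefix-keyed map
var parseNs = function(s) {
var map = {};
// whitespace separated prefix=uri pairs (an empty textarea yields no chunks)
var chunks = s.match(/\S+/g) || [];
for (var i = 0; i < chunks.length; i++) {
// split at the first "=" and skip chunks that are not prefix=uri pairs
var ns = chunks[i].match(/^([^=]+)=(.+)$/);
if (ns) {
map[ns[1]] = ns[2];
}
}
return map;
};
// create namespace resolver from the above prefix-keyed map
var createNsResolver = function(map) {
return function(prefix) {
// only honor prefixes the user actually declared
return Object.prototype.hasOwnProperty.call(map, prefix) ? map[prefix] : null;
};
};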
// wire stuff up once the page is fully loaded
window.onload = function(ev) {
// get ui widgets
var xpath = document.getElementById("txp");
var button = document.getElementById("button");
var ns = document.getElementById("tns");
var tin = document.getElementById("tin");
var tout = document.getElementById("tout");
button.onclick = function(ev) {
try {
var parser = new DOMParser();
// parse input xml into dom tree
var node = parser.parseFromString(tin.value, "text/xml");
// feeble attempt to check for errors
for (var i = 0; i < node.childNodes.length && i < 2; i++) {
if (node.childNodes[i].nodeName.toLowerCase() === "parsererror") {
alert("invalid xml");
return;
}
}
// create namespace resolver from appropriate text
var resolver = createNsResolver(parseNs(ns.value));
// evaluate the xpath expression
var result = node.evaluate(xpath.value, node, resolver, XPathResult.ANY_TYPE, null);
// display results based on type
if (result.resultType === XPathResult.UNORDERED_NODE_ITERATOR_TYPE) {
var s = "";
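for (var item; item = result.iterateNext(); ) {
// elements report their first text child; attribute and text nodes report their own value
var text = item.firstChild ? item.firstChild.nodeValue : item.nodeValue;
if (text) {
s += text + "\n";
}
}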
tout.value = s;
}
else if (result.resultType === XPathResult.NUMBER_TYPE) {
tout.value = result.numberValue.toString();
}
else if (result.resultType === XPathResult.STRING_TYPE) {
tout.value = result.stringValue;
}
else if (result.resultType === XPathResult.BOOLEAN_TYPE) {
tout.value = result.booleanValue.toString();
}
}
catch (error) {
alert(error);
}
};
}
//]]>
</script>
</head>
<body>
<form>
<input id="button" type="button" value="Evaluate" />
<label>XPath</label>
<textarea id="txp" cols="100" rows="3" wrap="off"></textarea>
<label>Namespaces</label>
<textarea id="tns" cols="100" rows="4" wrap="off"></textarea>
<label>Input XML</label>
<textarea id="tin" cols="100" rows="20" wrap="off"></textarea>
<label>Output</label>
<textarea id="tout" cols="100" rows="10" wrap="off"></textarea>
</form>
</body>
</html>
The logic begins when the page finishes loading and the browser calls the javascript window.onload event handler. References to the UI widgets are looked up by id and then a click handler is assigned to the “Evaluate” button. This click handler is the heart of the application.
Execution begins by instantiating a DOMParser and calling its parseFromString() method to turn the supplied XML text into a DOM tree.
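The short loop that follows attempts to check for XML parse errors by looking for a parsererror element among the first top-level nodes of the parsed document. I found behavior here to vary across browsers, so this safety net is definitely in need of improvement. Consult the documentation at Mozilla for starters.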
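One possible improvement, sketched here rather than taken from the page above: when parsing fails, DOMParser generally hands back a document that contains a parsererror element somewhere in the tree instead of throwing, and where that element sits differs by browser. The hypothetical helper findParseError() below searches the whole document instead of only the top-level nodes. It assumes the error element's tag name is parsererror, so a legitimate document that happens to use that element name would be misreported.

```javascript
// Hypothetical helper, not part of the page above.
// Returns the text of the first parsererror element, or null if none is found.
function findParseError(doc) {
  var errors = doc.getElementsByTagName("parsererror");
  if (errors.length === 0) {
    return null;
  }
  // Fall back to a generic message if the element carries no text.
  return errors[0].textContent || "invalid xml";
}

// Possible use inside the click handler (browser only):
// var doc = new DOMParser().parseFromString(tin.value, "text/xml");
// var problem = findParseError(doc);
// if (problem) { alert(problem); return; }
```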
The call to createNsResolver() creates an XML namespace resolver. XML namespaces are used to distinguish related sets of elements and are identified with a URI. In practice XML writers use short textual prefixes as shorthand for these namespace identifiers. The mapping from prefixes to URI identifiers is declared with xmlns attributes inside XML elements. Our application allows the user to specify namespace mappings by typing whitespace-separated prefix=URI pairs in the "Namespaces" textarea. The namespace resolver codifies the association between prefixes and URIs. More on this in a moment.
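As a concrete illustration of the prefix=URI format, here is a small stand-alone sketch of the same idea. The names parsePrefixes and makeResolver are hypothetical stand-ins for the page's own functions, and the two URIs are the standard XHTML and SVG namespaces.

```javascript
// Hypothetical stand-alone version of the prefix map and resolver.
function parsePrefixes(text) {
  var map = {};
  var chunks = text.match(/\S+/g) || [];
  chunks.forEach(function (chunk) {
    // Split at the first "=" and ignore chunks that have no prefix before it.
    var eq = chunk.indexOf("=");
    if (eq > 0) {
      map[chunk.slice(0, eq)] = chunk.slice(eq + 1);
    }
  });
  return map;
}

function makeResolver(map) {
  return function (prefix) {
    // Return null for prefixes the user never declared.
    return Object.prototype.hasOwnProperty.call(map, prefix) ? map[prefix] : null;
  };
}

var resolver = makeResolver(parsePrefixes(
  "xhtml=http://www.w3.org/1999/xhtml\nsvg=http://www.w3.org/2000/svg"
));

console.log(resolver("svg"));   // http://www.w3.org/2000/svg
console.log(resolver("atom"));  // null (prefix was never declared)
```

The browser calls the resolver once for each prefix it meets in the XPath expression, so returning null for an unknown prefix lets the evaluation fail with an error instead of silently matching nothing.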
The XPath evaluation itself happens in the call to node.evaluate(), where node is the parsed document. This method takes an XPath expression, a context node (here the document itself), an optional namespace resolver, a result type specifier, and an optional pre-existing XPathResult to reuse as arguments, respectively.
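The result type argument does not have to be XPathResult.ANY_TYPE. As a sketch of one alternative (not what the page above does), asking for XPathResult.ORDERED_NODE_SNAPSHOT_TYPE returns the matches in document order as a fixed snapshot that is indexed with snapshotItem() rather than walked with iterateNext(). The helper name selectNodes is hypothetical, and the code assumes a browser environment where XPathResult is defined.

```javascript
// Hypothetical helper: collect all nodes matching an XPath expression into an array.
// Assumes `doc` is a Document (for example from DOMParser) and `resolver` is a
// namespace resolver function or null.
function selectNodes(doc, expression, resolver) {
  var result = doc.evaluate(
    expression,
    doc,                                     // context node
    resolver,                                // maps prefix -> namespace URI
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,  // snapshot in document order
    null                                     // no existing result to reuse
  );
  var nodes = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}
```

A snapshot stays usable if the document is modified while you loop over it, whereas an iterator result becomes invalid as soon as the document changes.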
The remaining branches translate the variegated XPathResult into a textual representation which is inserted into the bottom textarea. If there are multiple matching nodes in the result set they are handled one at a time in the iterateNext() loop.
Our very first example demonstrated the handling of plain XML. Namespaced XML introduces a few complications. For one thing, every namespaced element must be written with a prefix in your XPath expression, and that prefix must be one defined in the "Namespaces" textarea. An unprefixed name in XPath 1.0 matches only elements that are in no namespace, so elements that sit in a default namespace need a prefix too. The following example illustrates this practice:
If you wanted to extract the heading text of this page you would need to write:
//xhtml:h1
not
//h1
Feel free to explore the range of supported XPath. Here are a few suggestions relevant to the 2nd example:
- //xhtml:title
- //xhtml:title | //xhtml:h1
- //svg:*[@cx > 200 and @r = 50]/@stroke
- count(//svg:circle[@fill = 'green'])
- string-length("hello world!")




