Let's say, for example, that you wanted to find all the links on a page. You might consider using document.getElementsByTagName('a'), but then you'll still need to check each element to see if it has an href attribute, because the <a> element can also be used for named anchors.
Instead, use Firefox's built-in XPath support to find all the <a> elements that have an href attribute.
Example: Find all the links on a page
    var allLinks, thisLink;
    allLinks = document.evaluate(
        '//a[@href]',
        document,
        null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE,
        null);
    for (var i = 0; i < allLinks.snapshotLength; i++) {
        thisLink = allLinks.snapshotItem(i);
        // do something with thisLink
    }
The document.evaluate method is the key here. It takes an XPath query as a string, then a bunch of other parameters I'll explain in a minute.
This XPath query finds all the <a> elements that have an href attribute, and returns them in no guaranteed order. (That is, the first one you get is not guaranteed to be the first link on the page.) Then you access each found element with the allLinks.snapshotItem(i) method.
XPath expressions can do wondrous things. Here's one that finds any element that has a title attribute.
Example: Find all the elements with a title attribute
    var allElements, thisElement;
    allElements = document.evaluate(
        '//*[@title]',
        document,
        null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE,
        null);
    for (var i = 0; i < allElements.snapshotLength; i++) {
        thisElement = allElements.snapshotItem(i);
        switch (thisElement.nodeName.toUpperCase()) {
            case 'A':
                // this is a link, do something
                break;
            case 'IMG':
                // this is an image, do something else
                break;
            default:
                // do something with other kinds of HTML elements
        }
    }
Here's an XPath query that returns every <div> with a specific class.
Example: Find all <div>s with a class of sponsoredlink
    var allDivs, thisDiv;
    allDivs = document.evaluate(
        "//div[@class='sponsoredlink']",
        document,
        null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE,
        null);
    for (var i = 0; i < allDivs.snapshotLength; i++) {
        thisDiv = allDivs.snapshotItem(i);
        // do something with thisDiv
    }
Note that I used double quotes around the XPath query string, so that I could use single quotes within it.
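One caveat about the query above: @class='sponsoredlink' compares the entire attribute value, so an element with class="sponsoredlink wide" would not match. A common XPath 1.0 idiom for matching one class name among several is to pad the normalized attribute with spaces and look for the padded name. The helper below is only a minimal sketch of that idiom; the name classQuery is hypothetical (it is not part of any API), and it assumes the class name contains no quotes or spaces.

```javascript
// Hypothetical helper: builds an XPath query that matches elements of the
// given tag whose class attribute contains className as a whole token.
// Assumes className contains no single quotes or spaces.
function classQuery(tagName, className) {
    return '//' + tagName +
        "[contains(concat(' ', normalize-space(@class), ' '), ' " +
        className + " ')]";
}
```

You would pass the returned string as the first parameter to document.evaluate, for example classQuery('div', 'sponsoredlink'), in place of the literal query string.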
There are lots of variations of the document.evaluate method. The second parameter (document in all of the previous examples) is the context node for the query. It can be any element, so if you already have a reference to an element (say, from document.getElementById or a member of a document.getElementsByTagName collection), you can restrict the query to search only descendants of that element. Note that the query must then be relative, such as './/a[@href]'; a query that starts with // always searches from the root of the document, no matter which context node you pass.
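As a rough sketch of restricting a query to one part of the page, the helper below passes an element as the second parameter. The name xpathWithin is hypothetical. For the restriction to take effect, the query should be relative (starting with .// rather than //), because a query beginning with // is evaluated from the document root regardless of the context node.

```javascript
// Hypothetical helper: run an XPath query relative to a context node.
// The query should be relative (for example './/a[@href]') so that only
// descendants of contextNode are searched.
function xpathWithin(contextNode, query) {
    return document.evaluate(query, contextNode, null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
}
```

For instance, assuming the page has an element with an id of sidebar (an assumption for illustration), xpathWithin(document.getElementById('sidebar'), './/a[@href]') would return only the links inside that element.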
The third parameter is a reference to a namespace resolver function, which is only useful if you care about writing user scripts that work on pages served with the application/xhtml+xml media type. If you don't know what that means, don't worry about it; there aren't very many pages like that and you probably won't run into one. The Mozilla XPath documentation explains how to use it, if you really want to know.
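If you do run into such a page, a resolver can be as small as the sketch below. A resolver function receives a namespace prefix and returns the matching namespace URI. The prefix xhtml and the names xhtmlResolver and xhtmlLinks are choices made for this illustration; only the XHTML namespace URI itself is fixed.

```javascript
// Hypothetical resolver: maps the prefix 'xhtml' (our own choice) to the
// XHTML namespace URI, and returns null for any other prefix.
function xhtmlResolver(prefix) {
    return prefix === 'xhtml' ? 'http://www.w3.org/1999/xhtml' : null;
}

// Hypothetical helper: find all links on a page served as
// application/xhtml+xml, where element names must be namespace-qualified.
function xhtmlLinks() {
    return document.evaluate('//xhtml:a[@href]', document, xhtmlResolver,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
}
```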
The fourth parameter is how you want your results returned. The previous examples all use XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, which returns elements in no guaranteed order. I use this 99% of the time, but if for some reason you wanted to make sure that you got elements back in exactly the order in which they appear in the page, you can use XPathResult.ORDERED_NODE_SNAPSHOT_TYPE instead. The Mozilla XPath documentation gives examples of some other variations as well.
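One of those other variations is XPathResult.FIRST_ORDERED_NODE_TYPE, which is handy when you only want the first match. With that result type there is no snapshot to loop over; the node (or null) is available through the result's singleNodeValue property. The helper name xpathFirst below is hypothetical, and this is a sketch rather than anything the examples above depend on.

```javascript
// Hypothetical helper: return the first node (in document order) that
// matches the query, or null if nothing matches.
function xpathFirst(query) {
    var result = document.evaluate(query, document, null,
        XPathResult.FIRST_ORDERED_NODE_TYPE, null);
    return result.singleNodeValue;
}
```

For example, xpathFirst('//a[@href]') would return the first link on the page.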
The fifth parameter is an existing XPathResult object to reuse. Pass in the result of a previous call to document.evaluate, and the browser may reuse that object to hold the new results instead of creating a new one; the results of the two queries are not combined. The previous examples all use null, meaning that a new result object is created for the query.
Got all that? XPath can be as simple or as complicated as you like. I urge you to read this excellent XPath tutorial to learn more about XPath syntax. As for the other parameters to document.evaluate, I rarely use them except as you've already seen here. In fact, you can define a function to encapsulate them.
Example: The xpath function
    function xpath(query) {
        return document.evaluate(query, document, null,
            XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
    }
Now you can simply call xpath('//a[@href]') to get all the links on the page, or xpath('//*[@title]') to get all the elements with a title attribute. You'll still need to use the snapshotItem method to access each item in the results; it's not a regular JavaScript array.
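If you would rather work with a real array, you can copy the snapshot into one. The helper below is a minimal sketch; the name snapshotToArray is hypothetical, and it relies only on the snapshotLength property and snapshotItem method already used in the examples above.

```javascript
// Hypothetical helper: copy a snapshot-type XPathResult into a real array.
function snapshotToArray(snapshot) {
    var nodes = [];
    for (var i = 0; i < snapshot.snapshotLength; i++) {
        nodes.push(snapshot.snapshotItem(i));
    }
    return nodes;
}
```

The returned value is an ordinary array, so you can use its length property and index into it directly instead of calling snapshotItem.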