Dive Into Greasemonkey

Teaching an old web new tricks

4.23. Parsing XML

Firefox automatically parses the current page into a DOM, but you can also manually create a DOM out of any XML string, either one you constructed yourself or one you retrieved from a remote location.

Example: Parse an arbitrary string as XML

var xmlString = '<passwd>' + 
'  <user id="101">' +
'    <login>mark</login>' + 
'    <group id="100"/>' +
'    <displayname>Mark Pilgrim</displayname>' + 
'    <homedir>/home/mark/</homedir>' +
'    <shell>/bin/bash</shell>' +
'  </user>' +
'</passwd>';
var parser = new DOMParser();
var xmlDoc = parser.parseFromString(xmlString, "application/xml");

The key here is the DOMParser object, which contains the parseFromString method. (It contains other methods too, but they are not useful to us here.) Its parseFromString method takes two arguments: the XML string to parse, and the content type. Both arguments are required.
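Once you have a parsed document, you can query it with the same DOM methods you would use on a web page. The following is a minimal sketch, not a definitive implementation: describeUsers is a hypothetical helper name, and it assumes every user element carries an id attribute plus login and shell child elements, as in the sample string above.

```javascript
// Hypothetical helper: summarize each <user> element in a parsed document.
// Assumes every <user> has an id attribute and <login> and <shell> children.
function describeUsers(xmlDoc) {
    var users = xmlDoc.getElementsByTagName('user');
    var lines = [];
    for (var i = 0; i < users.length; i++) {
        var user = users[i];
        var login = user.getElementsByTagName('login')[0].textContent;
        var shell = user.getElementsByTagName('shell')[0].textContent;
        lines.push(user.getAttribute('id') + ': ' + login + ' (' + shell + ')');
    }
    return lines;
}

// Usage inside a user script, where DOMParser and alert are available.
if (typeof DOMParser !== 'undefined' && typeof alert === 'function') {
    var sampleDoc = new DOMParser().parseFromString(
        '<passwd><user id="101"><login>mark</login>' +
        '<shell>/bin/bash</shell></user></passwd>',
        'application/xml');
    alert(describeUsers(sampleDoc).join('\n'));
}
```

The helper takes the document as an argument rather than parsing the string itself, so the same function works on XML you built by hand and on XML you fetched from somewhere else.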

[Note]

The DOMParser's parseFromString method accepts three content types: application/xml, application/xhtml+xml, and text/xml. For reasons that are too ridiculous to go into, you should always use application/xml.

This pattern is especially powerful when you combine it with the GM_xmlhttpRequest function to parse XML from a remote source.

Example: Parse XML from a remote source

GM_xmlhttpRequest({
    method: 'GET',
    url: 'http://greaseblog.blogspot.com/atom.xml',
    headers: {
        'User-agent': 'Mozilla/4.0 (compatible) Greasemonkey/0.3',
        'Accept': 'application/atom+xml,application/xml,text/xml'
    },
    onload: function(responseDetails) {
        var parser = new DOMParser();
        var dom = parser.parseFromString(responseDetails.responseText,
            "application/xml");
        var entries = dom.getElementsByTagName('entry');
        var title;
        for (var i = 0; i < entries.length; i++) {
            title = entries[i].getElementsByTagName('title')[0].textContent;
            alert(title);
        }
    }
});

This code will load the Atom feed from http://greaseblog.blogspot.com/atom.xml, parse it into a DOM, and query the DOM to get the list of entries. For each entry, it queries the DOM again to get the entry title, then displays it in a dialog box.
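Remote feeds are not always well-formed, and not every entry is guaranteed to have a title. In Firefox, parseFromString does not throw an exception on malformed XML. Instead it returns a document containing a parsererror element. The sketch below shows one way to guard against both cases. The helper names isParseError and getEntryTitles are hypothetical, and the check assumes that a legitimate feed never contains an element named parsererror.

```javascript
// Hypothetical helper: returns true if the document looks like the error
// document produced for malformed XML.
// Assumption: a real feed never contains an element named "parsererror".
function isParseError(dom) {
    return !dom.documentElement ||
        dom.getElementsByTagName('parsererror').length > 0;
}

// Hypothetical helper: collect the entry titles from a parsed Atom feed,
// skipping any entry that has no <title> element.
function getEntryTitles(dom) {
    var titles = [];
    if (isParseError(dom)) {
        return titles;
    }
    var entries = dom.getElementsByTagName('entry');
    for (var i = 0; i < entries.length; i++) {
        var titleNodes = entries[i].getElementsByTagName('title');
        if (titleNodes.length > 0) {
            titles.push(titleNodes[0].textContent);
        }
    }
    return titles;
}
```

Inside the onload callback, you could pass the parsed dom to getEntryTitles and loop over the returned array instead of indexing into each entry directly. A malformed feed then produces an empty list rather than a script error.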
