3e8.org

In EBCDIC we trust.

August 1, 2010

Default namespaces in SXML

The stock SXML->XML serializer from sxml-tools has a few aesthetic issues related to namespace output.

  1. It doesn't support default namespaces at all, which makes the already-verbose XML positively prolix, and could cause some non-conformant XML processors to fail.
  2. It does not allow redeclarations of XML prefixes, so you may sometimes get an autogenerated prefix name even when you provided a mapping. (This was a "design goal," though.)
  3. It does not support declaring all prefixes in the root element, which in certain cases can elevate the natural redundancy of XML to dizzying heights.

I added support for items 1 and 2 in version 0.2 of the sxml-serializer egg, released yesterday. Item 3, unfortunately, will take some time to think about, as the code is geared to declare prefixes as locally as possible.

So I was going to write a blog post with a lot of examples and in-depth explanation, but instead, I just documented the egg! See the sections "The default namespace" and "Redeclaring XML prefixes" in the egg documentation for more details.

Here's a preview, though. The change introduces a new *default* pseudo-namespace which we can use to map any number of URIs to the default namespace. This works with nested elements and also handles the empty namespace correctly. Below is a pretend Atom document that is rendered without any prefixes:

> (serialize-sxml
    '(*TOP* (@ (*NAMESPACES*
                (atom "http://www.w3.org/2005/Atom")
                (xhtml "http://www.w3.org/1999/xhtml")))
       (atom:feed (atom:entry
                   (atom:content (@ (type "xhtml"))
                    (xhtml:div (xhtml:p "I'm invincible!"))))))
    ns-prefixes: '((*default* . "http://www.w3.org/2005/Atom")
                   (*default* . "http://www.w3.org/1999/xhtml")))

<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>I'm invincible!</p>
      </div>
    </content>
  </entry>
</feed>

If we instead omit the *default* mappings from ns-prefixes -- or just use the stock serializer -- every element is prefixed:

<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:entry>
    <atom:content type="xhtml">
      <xhtml:div xmlns:xhtml="http://www.w3.org/1999/xhtml">
        <xhtml:p>I'm invincible!</xhtml:p>
      </xhtml:div>
    </atom:content>
  </atom:entry>
</atom:feed>

I like to call this "terse mode," because irony is the spice of life.
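
If you want to convince yourself that the two renderings above really are the same document as far as namespaces go, here is a minimal sketch in Python rather than Scheme. It has nothing to do with the egg itself. It parses both outputs with the standard library's xml.etree.ElementTree, which expands every element name to {namespace-URI}local-name, and compares the results. The helper names expanded_names and same_namespaced_tree are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Output produced with the *default* mappings (copied from the post).
DEFAULT_NS = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>I'm invincible!</p>
      </div>
    </content>
  </entry>
</feed>"""

# Output produced without the *default* mappings (copied from the post).
PREFIXED = """<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:entry>
    <atom:content type="xhtml">
      <xhtml:div xmlns:xhtml="http://www.w3.org/1999/xhtml">
        <xhtml:p>I'm invincible!</xhtml:p>
      </xhtml:div>
    </atom:content>
  </atom:entry>
</atom:feed>"""


def expanded_names(xml_text):
    """Return (expanded tag, attributes, stripped text) per element, in document order."""
    root = ET.fromstring(xml_text)
    return [(el.tag, dict(el.attrib), (el.text or "").strip()) for el in root.iter()]


def same_namespaced_tree(a, b):
    """True if both documents have the same expanded names, attributes, and text."""
    return expanded_names(a) == expanded_names(b)


if __name__ == "__main__":
    for tag, attrs, text in expanded_names(DEFAULT_NS):
        print(tag, attrs, text)
    print(same_namespaced_tree(DEFAULT_NS, PREFIXED))
```

The comparison ignores prefixes and xmlns declarations on purpose, since a namespace-aware parser resolves those away. Only the URI and local name of each element survive, which is why the choice between the two forms is purely aesthetic.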