The best kittens, technology, and video games blog in the world.

Monday, July 31, 2006

XML-oriented programming with Ruby

Photo from Commons by Bfe, GFDL.
XML programming is important. There are even special languages that make it easy. But do we need a new language or can we have the same level of expressiveness in an existing one, like Ruby ?

Let's try to program XML - but not starting from implementation and bending our mental model until it fits, but starting from a mental model and bending the language ! And let's care about the real thing first, DTD, entities, validation and whatnot are less common so they can wait.

The basic structure should of course be an XML node object:
a_node =, {:href => ""}, "Google")

What do we want the constructor to accept ? Well, the first argument must be a Symbol (or a String) with XML tag name. Then the attributes in a hash table. Then the contents - random collection of strings and XML nodes. Notice that we can easily make attribute hash optional, if the second attribute is not a hash it's simply part of the content.

a =
b =, {:style => "clear: right"})
c =, "My new blog")
d =, c, "Hello, ",, {:href=>""}, "world"), "!")

Of course our XML object will have the appropriate to_s method, so we can simply say print xml_node to get the correctly formatted XML.

XML should also support XML.parse(xml_as_string), XML.from_file(file_name_or_handler) and XML.from_url(url) methods for fetching existing XML files.

We probably also want to be able to easily generate subclasses of XML:
H3 = XML.make_subclass(:h3)
a ="Hello")

The subclasses factory may even get much complex, but let's stay with the easy one for now.

We would like our XML nodes to support the "natural" operations like:
a_node[:href] = ""
body_node <<"New section")
node.each{|c| print c}
node ={|c| if == :h3 then, c.contents) else c end}

Because Ruby 1.8 doesn't support arbitrary *-expansion, we probably also want to automatically expand Arrays passed as constructor arguments - XML nodes cannot have arrays as elements anyway.

The we will be able to say (compare with analogous program in CDuce):
require 'magic/xml'
include XML::XHTML # Import all XHTML subclasses

posts = XML.from_url ""

def format_posts(posts){|post|{:href=>post[:href],}, post[:description])) }

print" summary magically generated by Ruby")
),"The list"),
Yay, isn't it cute ? Uhm, now it only needs to be implemented :-D

No comments: