The best kittens, technology, and video games blog in the world.

Saturday, August 05, 2006

W3C's XML Query Use Cases ported to magic/xml

Photo from flickr, by annia316, CC-BY.
W3C's XQuery Use Cases list typical usage scenarios for XQuery. Since the plan behind magic/xml is to beat every other XML processing system in terms of expressiveness, I started recoding these use cases in magic/xml. So far the first three use cases (23 queries) have been recoded. A few examples follow. Complete sources are packaged together with the magic/xml library.

Query XMP 1: List books published by Addison-Wesley after 1991, including their year and title.
In XQuery:
    <bib>
     {
      for $b in doc("")/bib/book
      where $b/publisher = "Addison-Wesley" and $b/@year > 1991
      return
        <book year="{ $b/@year }">
         { $b/title }
        </book>
     }
    </bib>
In magic/xml:
    XML.bib! {
      XML.from_file('bib.xml').children(:book) {|b|
        if b.child(:publisher).text == "Addison-Wesley" and b[:year].to_i > 1991
          book!({:year => b[:year]}, b.children(:title))
        end
      }
    }
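For readers wondering how a block such as XML.bib! { ... book!(...) ... } can build a tree in plain Ruby, here is a minimal sketch of the general technique (instance_eval plus method_missing). It is not magic/xml's actual implementation: the Node class, its methods, and the sample book data are all hypothetical and only mimic the shape of the calls above. It also does no escaping of text or attribute values.

```ruby
# Hypothetical stand-in for an XML node; NOT magic/xml's real classes.
class Node
  attr_reader :name, :attrs, :contents

  def initialize(name, *args, &blk)
    @name = name
    @attrs = {}
    @contents = []
    # A Hash argument is treated as attributes, anything else as content.
    args.each { |a| a.is_a?(Hash) ? @attrs.update(a) : add!(a) }
    # Run the block with self set to this node, so book!(...) lands here.
    instance_eval(&blk) if blk
  end

  # Append children (nodes or strings); nested arrays are flattened.
  def add!(*items)
    items.flatten.each { |i| @contents << i }
    self
  end

  # Any unknown call ending in "!" means "build that element and add it".
  def method_missing(meth, *args, &blk)
    return super unless meth.to_s.end_with?("!")
    add!(Node.new(meth.to_s.chomp("!").to_sym, *args, &blk))
  end

  def respond_to_missing?(meth, include_private = false)
    meth.to_s.end_with?("!") || super
  end

  # Naive serialization, with no escaping (sketch only).
  def to_s
    attr_text = @attrs.map { |k, v| %( #{k}="#{v}") }.join
    "<#{@name}#{attr_text}>#{@contents.join}</#{@name}>"
  end
end

# Made-up sample data standing in for bib.xml.
books = [
  { year: 1994, publisher: "Addison-Wesley", title: "TCP/IP Illustrated" },
  { year: 1990, publisher: "Addison-Wesley", title: "An Older Book" },
]

bib = Node.new(:bib) {
  books.each { |b|
    if b[:publisher] == "Addison-Wesley" and b[:year] > 1991
      book!({ :year => b[:year] }, Node.new(:title, b[:title]))
    end
  }
}

puts bib  # <bib><book year="1994"><title>TCP/IP Illustrated</title></book></bib>
```

The block still sees the local variable books because Ruby blocks are closures; instance_eval only changes what self is inside the block.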
Query TREE 1: Prepare a (nested) table of contents for Book1, listing all the sections and their titles. Preserve the original attributes of each element, if any.
In XQuery:
    declare function local:toc($book-or-section as element()) as element()*
    {
      for $section in $book-or-section/section
      return
        <section>
          { $section/@* , $section/title , local:toc($section) }
        </section>
    };

    <toc>
      {
        for $s in doc("book.xml")/book return local:toc($s)
      }
    </toc>
In magic/xml:
    def local_toc(node)
      node.children(:section).map {|c|
        XML.section(c.attrs, c.child(:title), local_toc(c))
      }
    end

    XML.toc! {
      add! local_toc(XML.from_file('book.xml'))
    }
Query SEQ 1: In the Procedure section of Report1, what Instruments were used in the second Incision?
In XQuery:
    for $s in doc("report1.xml")//section[section.title = "Procedure"]
    return ($s//incision)[2]/instrument
In magic/xml:
    XML.from_file('report1.xml').descendants(:section) {|s|
      next unless s.child(:"section.title").text == "Procedure"
      print s.descendants(:incision)[1].child(:instrument)
    }
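Two details in this pair are easy to miss. XQuery positions are 1-based while Ruby arrays are 0-based, so [2] on the XQuery side becomes [1] on the Ruby side (assuming descendants returns nodes in document order, as the code above implies). And the parentheses in ($s//incision)[2] matter: they select the second incision in the whole section, whereas $s//incision[2] would select every incision that is the second incision child of its own parent. The plain-Ruby sketch below mimics that difference with nested arrays; the data is made up and no XML library is involved.

```ruby
# Made-up data: each inner array holds the incisions under one parent
# element, listed in document order.
incisions_by_parent = [
  ["incision A1"],
  ["incision B1", "incision B2"],
  ["incision C1", "incision C2"],
]

# Like ($s//incision)[2]: collect everything first, then take the second
# item overall. Ruby counts from 0, so "second" is index 1.
all_incisions = incisions_by_parent.flatten
second_overall = all_incisions[1]

# Like $s//incision[2]: take the second incision under each parent, if any.
second_per_parent = incisions_by_parent.map { |kids| kids[1] }.compact

puts second_overall             # incision B1
puts second_per_parent.inspect  # ["incision B2", "incision C2"]
```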
Measured in characters, the magic/xml solutions are on average 10% longer than the XQuery ones. That's a great result, because:
  • I don't think any other XML processing library comes even close to that. So far only special-purpose XML processing languages like XQuery or CDuce have been that expressive.
  • 10% more characters is a small price for the full power of Ruby. And you don't have to learn a new language: working with magic/xml is almost like working with plain Ruby.
  • The benchmarks come from the XQuery people and show what XQuery is good at, so on a less biased selection magic/xml would probably do better. For example, magic/xml assumes real XML attributes (<a href="">Google</a>) are the common case, not dummy tags with text (<a><href></href><content>Google</content></a>), while XQuery assumes the opposite. Then again, maybe magic/xml should get a few convenience methods for the other case.
  • If special-purpose XML processing languages aren't significantly more expressive, do we even need them? Performance can be recovered later with a good profiler, but expressiveness cannot be hacked onto a system afterwards. So even if magic/xml is sometimes slow, it doesn't really matter.
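To make the attribute-versus-dummy-tag point concrete, here is a rough sketch of what such a convenience method could look like. Everything in it is hypothetical: Elem is a toy class written for this example, not part of magic/xml, and field is an invented name for a helper that reads a value from an attribute or, failing that, from a same-named child element.

```ruby
# Hypothetical minimal element type; NOT the magic/xml API.
class Elem
  attr_reader :name, :attrs, :children

  def initialize(name, attrs = {}, *children)
    @name, @attrs, @children = name, attrs, children
  end

  # Attribute lookup, in the style of b[:year] used earlier in the post.
  def [](key)
    @attrs[key]
  end

  # First child element with the given name, or nil if there is none.
  def child(name)
    @children.find { |c| c.is_a?(Elem) && c.name == name }
  end

  # Concatenated text of this element and all its descendants.
  def text
    @children.map { |c| c.is_a?(Elem) ? c.text : c.to_s }.join
  end

  # Hypothetical convenience: read a field whether it was encoded as an
  # attribute or as a "dummy" child element holding text.
  def field(key)
    @attrs.fetch(key) { child(key)&.text }
  end
end

# The same book, once with year as an attribute and once as a child tag.
attr_style = Elem.new(:book, { :year => "1994" },
                      Elem.new(:title, {}, "TCP/IP Illustrated"))
tag_style  = Elem.new(:book, {},
                      Elem.new(:year, {}, "1994"),
                      Elem.new(:title, {}, "TCP/IP Illustrated"))

puts attr_style.field(:year)   # 1994
puts tag_style.field(:year)    # 1994
puts attr_style.field(:title)  # TCP/IP Illustrated
```

With a helper like this, a query would not need to care which of the two encodings a document uses, at the cost of hiding a distinction that sometimes matters.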
Anyway, enjoy :-) An official beta release is coming soon.
