I've been reading a lot of REST vs SOAP falderall lately and it's getting tiresome. Well, some of it is interesting, like looking at whether Bloglines is REST. Anyway, I thought I'd point out the cowman and the farmer can be friends, at least when we both are discussing the smell of the fertilizer. So, four dumb things about XML as a wire format for distributed systems:
  • XML is text. You have to base-10 encode numbers in XML. This is terribly slow and inefficient.
  • XML can't handle binary data. There is no reasonable way to embed a 400k image into your XML. Your choices are to base 64 encode it (whee!) or use some wrapper around the XML like MIME.
  • XML is awfully wordy. I don't begrudge the meaningful beginning tag names and the pointy brackets, but the meaningful closing tag names are superfluous if all your XML is machine generated.
  • XML is complicated. We love XML because it's S-expressions, but it's a lot more too. Entities! PIs (whatever those are). Attributes vs. elements! Three different ways to describe the data model! Awful programming models! It's awfully complicated when what you're trying to do is pass a couple of numbers and a string.
The roots of XML are SGML, a hand-edited markup language for writing documents. I think it's clever that it's been repurposed for distributed systems, particularly since a human can easily read the packets without translation. And I really like the idea of document-oriented web services (whether SOAP or REST). But it'd sure be nice if the document were more friendly to computer data.
tech
  2004-09-30 15:20 Z