XML is bad for data transfer for lots of reasons. It's slow to parse, awfully wordy, doesn't support 8-bit data, etc. But my favourite thing is how most XML consumers can be induced to open arbitrary URLs.

There's a 4+ year old security hole in many XML parsers called XXE, the Xml eXternal Entity attack. Take a look:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY passwd SYSTEM "file:///etc/passwd">
  <!ENTITY http SYSTEM "http://example.com/delAll">
]>
<doc>
  &passwd; &http;
</doc>

Don't let all the pointy parentheses confuse you (although they're a good reason to hate XML too). What's going on there is my document has defined two new entities called passwd and http and then used them in the body. Both are declared as external (SYSTEM) entities, so each one expands to the contents of the URI it names.

About 3/4 of the XML applications I've encountered out there will blindly do as ordered and load the URIs. The app will load the password file, at which point a clever hacker can usually induce the application to send it back to them (as an error message, for instance). Even better, it will load the HTTP URL. Yes, many XML applications will load any URL you tell them to. From the app server. Nice, huh?

XXE is an old bug, but it keeps coming back because most people using XML would never think their little XML parser can be instructed to start opening network sockets. Acrobat had this bug just last year. XML parsers usually have obscure options to secure them, but many have them off by default. Why are we using a data format where this is possible at all?
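
For the curious, here's roughly what flipping those obscure options looks like. This is only a sketch, assuming Python's standard library SAX parser (the expat-based one that xml.sax.make_parser normally returns); other parsers and languages spell the switches differently. The names TextCollector and parse_without_external_entities are mine, made up for illustration. Recent Python versions are documented as not loading external general entities by default, but I'd still set the features explicitly rather than trust whatever default happens to be installed.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class TextCollector(ContentHandler):
    """Hypothetical helper: gathers character data the parser hands us."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_without_external_entities(xml_bytes):
    """Parse xml_bytes with external entity loading switched off.

    Returns the document's text content as a single string.
    """
    parser = xml.sax.make_parser()
    # Never fetch external general entities such as &passwd; or &http;.
    parser.setFeature(feature_external_ges, False)
    # Never fetch external parameter entities either.
    parser.setFeature(feature_external_pes, False)

    collector = TextCollector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(collector.chunks)


if __name__ == "__main__":
    # The document from this post. With the features off, the parser skips
    # both entity references instead of opening the file or the URL.
    doc = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE doc [
  <!ENTITY passwd SYSTEM "file:///etc/passwd">
  <!ENTITY http SYSTEM "http://example.com/delAll">
]>
<doc>
  &passwd; &http;
</doc>
"""
    print(repr(parse_without_external_entities(doc)))
```

If you never need DTDs at all, which is true of most data-transfer uses, a stricter choice is to reject any document that contains a DOCTYPE declaration before it gets near your application logic.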

  2006-08-17 00:48 Z