I've switched this blog to explicitly serve everything in UTF-8 encoding. Before, I didn't think much about it: I confined myself to ASCII and hoped for the best. Now I can just type "François Rabelais or Björk Guðmundsdóttir or 艾未未" directly without escaping it as HTML entities. If you notice any errors, please let me know.

I was surprised to learn there are at least four ways for a web page to declare its encoding. (Or charset; the terms are ambiguous.) Annoyingly, the web server's HTTP header declaration overrides whatever encoding the document itself declares via a <meta> tag. I think that dates back to the fantasy world where servers would negotiate content types and transcode on the fly. These days, unless you're writing in Chinese or are a super-duper expert, you should always be using UTF-8.
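
To make the precedence concrete, here's a rough Python sketch of how a client might pick the encoding when the HTTP header and the <meta> tag disagree. The function names and the regex are my own illustration, not any browser's actual algorithm; real browsers also look at byte order marks and do smarter sniffing, which this skips.

```python
import re
from email.message import Message

# Hypothetical, simplified pattern: catches both <meta charset="..."> and
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)

def header_charset(content_type):
    """Pull the charset parameter out of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None if absent

def effective_charset(content_type, body, default="utf-8"):
    """Return the encoding a client would use: the HTTP header wins,
    then the <meta> declaration, then a fallback default (an assumption)."""
    from_header = header_charset(content_type) if content_type else None
    if from_header:
        return from_header
    # Only scan the start of the document, where <meta> should live.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return default

if __name__ == "__main__":
    page = b'<html><head><meta charset="utf-8"></head></html>'
    # The header says Latin-1 and the page says UTF-8; the header wins.
    print(effective_charset("text/html; charset=ISO-8859-1", page))
    # No charset in the header, so the <meta> tag is used.
    print(effective_charset("text/html", page))
```

The practical upshot: if your server sends a charset in the Content-Type header, make sure it agrees with what your pages declare, because the page's own declaration loses the argument.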

Unrelated: thanks to Aristotle for finding a bug in the updated tags on my Atom feed. I'd hacked some Perl code and forgotten how Perl scoping works.

  2012-01-25 19:38 Z