I need help. I'm building a REST-style web service. Its URLs specify operations on other URLs, so I need to pass URLs as parameters to my REST service. Let's say my service reverses the text of whatever URL is specified.
http://example.com/rvs/someurl?a=b
The above request to my service means "return the contents of someurl reversed". To make things a bit more complicated, there may be other stuff in my URLs after someurl: the argument a=b in my example above.

So my question is, how do I correctly encode someurl? Let's say I'm reversing http://google.com/. I'd think the request would contain a percent-quoted URL, something like

http://example.com/rvs/http%3A%2F%2Fgoogle.com%2F?a=b
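For concreteness, here is a minimal Python sketch of how a client might build that request. The function name build_rvs_url and the hard-coded example.com prefix are made up for illustration; the only real API used is urllib.parse.quote. Note the safe="" argument: by default quote() leaves "/" unescaped, which would defeat the purpose here.

```python
from urllib.parse import quote


def build_rvs_url(target, query="a=b"):
    """Embed `target` as a single percent-quoted path segment.

    Hypothetical helper: the /rvs/ prefix and the a=b query come from
    the example in the text, not from any real service.
    """
    # safe="" escapes every reserved character, including ":" and "/".
    encoded = quote(target, safe="")
    return "http://example.com/rvs/" + encoded + "?" + query


print(build_rvs_url("http://google.com/"))
# prints http://example.com/rvs/http%3A%2F%2Fgoogle.com%2F?a=b
```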
However, the CGI standard (and Apache2's implementation thereof) seems to decode the URL before it gets to my application. That is, PATH_INFO contains
http://google.com/
not
http%3A%2F%2Fgoogle.com%2F
I can work around this (REQUEST_URI isn't decoded), but something about all this makes me uneasy. I'm relying on all web software between me and my client to not mess with my carefully encoded URL. If the CGI standard itself seems to think decoding is OK, who's to say some proxy or browser won't too? Or to ask the question more existentially, do these two URLs name the same resource? Or can they name different things?
http://example.com/foo/bar
http://example.com/foo%2Fbar
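A sketch of the REQUEST_URI workaround, in Python, under a few assumptions: the server passes the undecoded request line through in REQUEST_URI (Apache does; the variable is not part of the CGI spec itself), the service lives under a single rvs path segment, and the function names are invented for this example. The idea is to split the raw path on "/" first and only then percent-decode each segment, so an encoded %2F never gets mistaken for a path separator. Seen this way, the two URLs above produce different segment lists, which is one way of saying they need not name the same resource.

```python
from urllib.parse import unquote, urlsplit


def raw_segments(request_uri):
    """Split the undecoded path on "/", then decode each segment."""
    # urlsplit() separates path from query but does not percent-decode.
    raw_path = urlsplit(request_uri).path
    # Drop the empty string that precedes the leading "/".
    return [unquote(seg) for seg in raw_path.split("/")[1:]]


def target_from_request_uri(request_uri, prefix="rvs"):
    """Recover the embedded URL from something like /rvs/<quoted-url>?a=b.

    Hypothetical helper: `prefix` is the service name from the example.
    """
    segments = raw_segments(request_uri)
    if len(segments) != 2 or segments[0] != prefix:
        raise ValueError("expected /%s/<quoted-url>" % prefix)
    return segments[1]


print(target_from_request_uri("/rvs/http%3A%2F%2Fgoogle.com%2F?a=b"))
print(raw_segments("http://example.com/foo/bar"))
print(raw_segments("http://example.com/foo%2Fbar"))
```

In a CGI script the argument would come from os.environ["REQUEST_URI"]. Decoding the whole path before splitting, which is effectively what PATH_INFO gives you, collapses the last two cases into the same thing.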

Is there a good way to talk about URLs inside URLs? If you know the answer, mail me. I promise to share.

tech
  2006-08-21 00:11 Z