Files and URLs

I was working on some code today that converts between File and URL, and someone pointed out that the “obvious” ways to convert between them in the JDK are broken for URLs and paths that contain spaces.

The obvious way to get from a File to a URL is File.toURL(). However, the javadoc for this method notes that it does not automatically escape characters that are illegal in URLs (for example, a space is not converted to %20). In fact, as of JDK 1.6 this method is deprecated, and the javadoc recommends file.toURI().toURL() instead.
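As a minimal sketch of the recommended conversion, the snippet below routes the File through a URI before asking for a URL. The class name, the helper name toUrl, and the sample path are made up for illustration; only File.toURI() and URI.toURL() come from the JDK.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public class FileToUrlSketch {

    // Hypothetical helper: going through URI escapes characters that are
    // illegal in URLs (a space becomes %20) before the URL is built.
    static URL toUrl(File file) throws MalformedURLException {
        return file.toURI().toURL();
    }

    public static void main(String[] args) throws MalformedURLException {
        // Illustrative relative path; toURI() resolves it against the working directory.
        File file = new File("my docs/notes.txt");
        // The printed URL contains "my%20docs" rather than a raw space.
        System.out.println(toUrl(file));
    }
}
```

The file does not need to exist for the conversion to work, since only the path is inspected.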

Going the other way, it’s quite tempting to call URL.getPath() (assuming you already know you’re dealing with a file:// URL). But getPath() returns the path exactly as it appears in the URL, so escape sequences are not decoded; a space stays as %20, which is the most common issue. Here, the Apache Commons IO FileUtils class comes in handy, in particular its toFile() method.
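To make the getPath() problem concrete, here is a small sketch that avoids the Commons IO dependency. It uses the JDK's File(URI) constructor as the decoding step, which is a different route from the FileUtils.toFile() method mentioned above. The class name, the helper name toFile, and the sample path are assumptions for illustration.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

public class UrlToFileSketch {

    // Hypothetical helper: File(URI) decodes escape sequences such as %20.
    // It only accepts absolute file: URIs, so this assumes a file URL.
    static File toFile(URL url) throws URISyntaxException {
        return new File(url.toURI());
    }

    public static void main(String[] args)
            throws MalformedURLException, URISyntaxException {
        // Build a properly escaped file URL from an illustrative path.
        URL url = new File("my docs/notes.txt").toURI().toURL();

        // getPath() keeps the escaping, so this still contains "my%20docs".
        System.out.println(url.getPath());

        // Going through URI restores the space in the directory name.
        System.out.println(toFile(url).getPath());
    }
}
```

Note that url.toURI() throws URISyntaxException if the URL was built without proper escaping, for example one produced by the deprecated File.toURL() from a path with spaces.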

Commons IO also has a helper for converting Files to URLs, but the code there seems to erroneously use the same broken File.toURL(), so you might want to stay away from it and use the toURI().toURL() technique above. I logged this at Apache as IO-163.

Comments

8 Responses to “Files and URLs”
  1. Fred says:

    Got your blog an hour too late…
    I just wasted my time solving a stupid bug I had converting file name to url.
    Of course I forgot Windows backslash :(

  2. Willem says:

    Can’t you use the File(URI) constructor to convert a URL to a File?

  3. Martin says:

    Sorry, my comment’s got nothing to do with the subject of your blog entry, but according to Josh Bloch the URL class is defective in that its equals() and hashCode() methods are broken. He recommends always using URI instead. Of course I don’t know what you ultimately need the URL class for, but I thought that might interest you if you didn’t know it already.

    Martin

  4. Alex says:

    @Willem – yep, that should work too!

    @Martin – I am aware. I’m actually moving mostly away from URLs to Files anyway, but we were already wrapping our URLs in the one case where equals and hashCode came up.

  5. Arman says:

    Wouldn’t java.net.URLEncoder/Decoder do the job?

  6. Alex says:

    URLEncoder is more oriented toward encoding HTML form data into URLs. I’m not sure if it would also cover the escaping of chars in the URL path itself. Certainly possible…

  7. Eirik says:

    What about file.toURI().toURL().toExternalForm().replaceAll("%20", " "); ?
    It sure looks ugly, but I’ve done it quite recently and it works far better than it looks.

  8. Alex says:

    Huh? What’s the point of that? That gives you a URL string that’s broken as it lacks the proper escaping.
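On the URLEncoder question raised in the comments, a quick comparison may help. This is only a sketch: the class name, the helper name formEncode, and the sample path are invented for illustration. URLEncoder applies HTML form encoding, so it turns a space into + and also escapes the path separator, which is not what a file URL path needs.

```java
import java.io.File;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncoderComparisonSketch {

    // Hypothetical helper wrapping the JDK's form encoder.
    static String formEncode(String text) throws UnsupportedEncodingException {
        return URLEncoder.encode(text, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String path = "my docs/notes.txt"; // illustrative path

        // Form encoding: prints my+docs%2Fnotes.txt
        System.out.println(formEncode(path));

        // URI-based escaping keeps the slash and writes the space as %20.
        System.out.println(new File(path).toURI());
    }
}
```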
