URL parsing/normalisation/escaping/unescaping is a minefield. There are many edge cases where every implementation does things differently. This is a perfect example.
It gets worse if you are mapping URLs to a filesystem (e.g. for serving files). Even though they look similar, URL paths have different capabilities and rules than filesystems, and different filesystems also vary. This is also an example of that (I don't think most filesystems support empty directory names).
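To make the mismatch concrete, here is a minimal sketch using Python's standard library. The helper name `url_segments` is just for illustration; the point is only that the URL keeps an empty segment which a POSIX-style path normalisation quietly drops.

```python
import posixpath


def url_segments(path: str) -> list[str]:
    """Split an absolute URL path on "/" and keep empty segments."""
    return path.split("/")[1:]


url_path = "/a//c/d"

# As a URL path, the empty segment between "a" and "c" is still there.
segments = url_segments(url_path)        # ['a', '', 'c', 'd']

# As a POSIX filesystem path, the doubled slash is collapsed.
fs_path = posixpath.normpath(url_path)   # '/a/c/d'

print(segments, fs_path)
```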
PunchyHamster 25 minutes ago [-]
We cut those and a few others because historically there were exploits relying on them.
Nothing on the web is "correct", deal with it.
sfeng 26 minutes ago [-]
What I’ve learned in doing this type of normalization is that whatever the specification says, you will always find some website that uses some insane URL tweak to decide what content it should show.
mjs01 55 minutes ago [-]
// is useful if the server needs to serve both static files in the filesystem, and embedded files like a webpage.
// can be used for the embedded files' URLs because they will never conflict with filesystem paths.
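Roughly what I mean, as a sketch: the `EMBEDDED` table and the `resolve` function are made-up names, not from any real server, and it assumes static files are only ever requested with a single leading slash.

```python
# Hypothetical table of content compiled into the server binary.
EMBEDDED = {"//status": "<h1>server status</h1>"}


def resolve(path: str) -> tuple[str, str]:
    """Decide where a request path is served from.

    Paths starting with "//" are looked up in the embedded table only;
    everything else is treated as a filesystem path.
    """
    if path.startswith("//"):
        return ("embedded", EMBEDDED.get(path, ""))
    return ("filesystem", path)


print(resolve("//status"))
print(resolve("/status"))
```

This only works if nothing in front of the server collapses `//` to `/` first.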
PunchyHamster 25 minutes ago [-]
....just serve it from other paths
WesolyKubeczek 1 hours ago [-]
It is probably “incorrect”, but given the established actual usage over the decades, it’s most likely what you need to do nevertheless.
Not doing it is like punishing people for not using Oxford commas, or entering an hour-long debate each time someone writes “would of” instead of “would have”. It grinds my gears too, but I have different hills to die on.
bazoom42 29 minutes ago [-]
If different clients do it differently, you have incompatibilities. This punishes everybody. Since normalizing // to / removes information which may be significant, the obviously correct choice is following the spec.
PunchyHamster 25 minutes ago [-]
if it is significant, you coded your app wrong, plain and simple
jeroenhd 11 minutes ago [-]
Of course not. It's an explicit feature of every specification.
Plenty of websites rewrite paths like /a/b/c/d into a backend service call like /?w=a&x=b&y=c&z=d. In that scheme, /a//c/d would rewrite to /?w=a&x=&y=c&z=d, something entirely distinct from /a/c/d, which works out to /?w=a&x=c&y=d.
It's not the application's fault that the people attempting to configure web server URLs don't know how web server URLs work.
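A minimal sketch of that kind of rewrite, assuming a fixed list of parameter names (`PARAM_NAMES` and `rewrite` are invented for the example, not taken from any particular web server):

```python
from urllib.parse import urlencode

# Hypothetical mapping: the n-th path segment becomes the n-th parameter.
PARAM_NAMES = ["w", "x", "y", "z"]


def rewrite(path: str) -> str:
    """Turn /a/b/c/d into /?w=a&x=b&y=c&z=d, keeping empty segments."""
    segments = path.split("/")[1:]
    return "/?" + urlencode(list(zip(PARAM_NAMES, segments)))


print(rewrite("/a/b/c/d"))  # /?w=a&x=b&y=c&z=d
print(rewrite("/a//c/d"))   # /?w=a&x=&y=c&z=d
print(rewrite("/a/c/d"))    # /?w=a&x=c&y=d
```

Collapse the slashes before this step and the second request silently becomes the third one.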
Etheryte 58 minutes ago [-]
Not sure I agree. The correct thing is to not mess with the URL at all if you're unsure what to do with it. Doing nothing is the easiest thing of all, so why not do that?
j16sdiz 33 minutes ago [-]
Because you need some consistency or normalisation before applying ACLs or doing routing?
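A sketch of the failure mode I have in mind, assuming a prefix-based ACL in front of a backend that collapses slashes on its own. Both functions and the `/admin` rule are hypothetical:

```python
import re

# Hypothetical ACL rule: deny anything under /admin.
DENIED_PREFIXES = ("/admin",)


def acl_allows(path: str) -> bool:
    """Prefix check applied to whatever path it is given."""
    return not path.startswith(DENIED_PREFIXES)


def backend_route(path: str) -> str:
    """A backend that collapses runs of slashes before routing."""
    return re.sub(r"/{2,}", "/", path)


request = "//admin/users"

print(acl_allows(request))                 # True: the raw path slips past the ACL
print(backend_route(request))              # /admin/users
print(acl_allows(backend_route(request)))  # False: same rules on both sides
```

Whatever rules you pick, the ACL and the router have to see the same form of the path.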
jeroenhd 9 minutes ago [-]
URL normalization is defined and it doesn't include collapsing slashes.
Note that you can include custom normalization rules (like collapsing slashes, tolower()ing the entire path, removing the query part of the URL), but that's not part of the standard. If you're doing anything extra, the risk of breaking stuff is on you.
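To illustrate the difference, here is a simplified sketch of the dot-segment removal that RFC 3986 (section 5.2.4) describes, next to a custom slash-collapsing rule. It assumes an absolute path and is not a full implementation of the spec's normalization; the function names are mine.

```python
import re


def remove_dot_segments(path: str) -> str:
    """Drop "." and resolve ".." segments, leaving empty segments alone."""
    segments = path.split("/")
    output: list[str] = []
    for i, seg in enumerate(segments):
        is_last = i == len(segments) - 1
        if seg == ".":
            if is_last:
                output.append("")  # keep the trailing slash
        elif seg == "..":
            if len(output) > 1:    # never pop the leading root marker
                output.pop()
            if is_last:
                output.append("")
        else:
            output.append(seg)
    return "/".join(output)


def collapse_slashes(path: str) -> str:
    """Custom rule, not part of the standard: merge runs of slashes."""
    return re.sub(r"/{2,}", "/", path)


print(remove_dot_segments("/a/b/../c/./d"))  # /a/c/d
print(remove_dot_segments("/a//c/d"))        # /a//c/d
print(collapse_slashes("/a//c/d"))           # /a/c/d
```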