Is a slash ("/") equivalent to an encoded slash ("%2F") in the path portion of an HTTP URL?

Question

I have a site that treats "/" and "%2F" in the path portion (not the query string) of a URL differently. Is this a bad thing to do according to either the RFC or the real world?

I ask because I keep running into little surprises with the web framework I'm using (Ruby on Rails) as well as the layers below it (Passenger and Apache; for example, I had to enable the AllowEncodedSlashes directive in Apache). I am now leaning toward getting rid of the encoded slashes completely, but I wonder whether I should be filing bug reports where I see weird behavior involving encoded slashes.

As to why I have the encoded slashes in the first place, basically I have routes such as this:

:controller/:foo/:bar

where :foo is something like a path that can contain slashes. I thought the most straightforward thing to do would be to just URL escape foo so the slashes are ignored by the routing mechanism. Now I am having doubts, and it's pretty clear that the frameworks don't really support this, but according to the RFC is it wrong to do it this way?
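
A minimal sketch of the idea, written in Python rather than Rails, may make the intent concrete. The helper names `build_path` and `route` and the sample values are hypothetical, and the routing here is a toy stand-in for a three-segment route, not how Rails itself works. The point it illustrates is that the path is split on literal slashes first and each segment is decoded only afterwards.

```python
from urllib.parse import quote, unquote

def build_path(controller, foo, bar):
    # safe="" makes quote() encode "/" as %2F; by default quote() leaves "/" alone.
    parts = (controller, foo, bar)
    return "/" + "/".join(quote(part, safe="") for part in parts)

def route(raw_path):
    # Split on literal slashes first, then percent-decode each segment.
    segments = raw_path.lstrip("/").split("/")
    if len(segments) != 3:
        raise ValueError(f"expected 3 segments, got {len(segments)}")
    controller, foo, bar = (unquote(segment) for segment in segments)
    return {"controller": controller, "foo": foo, "bar": bar}

path = build_path("files", "docs/2020/report", "view")
print(path)                # /files/docs%2F2020%2Freport/view
print(route(path)["foo"])  # docs/2020/report
```

If any layer decodes the whole path before splitting it, the same URL turns into five segments and the route no longer matches, which is the kind of surprise described above.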

Here is some information I have gathered:

RFC 1738 (URLs):

Usually a URL has the same interpretation when an octet is represented by a character and when it encoded. However, this is not true for reserved characters: encoding a character reserved for a particular scheme may change the semantics of a URL.

RFC 2396 (URIs):

These characters are called "reserved", since their usage within the URI component is limited to their reserved purpose. If the data for a URI component would conflict with the reserved purpose, then the conflicting data must be escaped before forming the URI.

(does escaping here mean something other than encoding the reserved character?)

RFC 2616 (HTTP/1.1):

Characters other than those in the "reserved" and "unsafe" sets (see RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

There is also this bug report for Rails, where they seem to expect the encoded slash to behave differently:

Right, I'd expect different results because they're pointing at different resources.

It's looking for the literal file 'foo/bar' in the root directory. The non escaped version is looking for the file bar within directory foo.

It's clear from the RFCs that the raw and encoded forms are equivalent for unreserved characters, but what is the story for reserved characters?
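
One way to picture the distinction drawn in the quoted RFCs is a normalizer that decodes only unreserved characters and leaves every other escape untouched. The following is a rough Python sketch, not taken from any of the RFCs or from Rails; the function name `decode_unreserved` is hypothetical, and the character set is the "unreserved" set as I read it in RFC 2396 (alphanumerics plus the mark characters).

```python
import re

# Unreserved characters per RFC 2396: alphanumerics plus the "mark" set.
# This set is an assumption of the sketch; check it against the RFC you target.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-_.!~*'()"
)

def decode_unreserved(path):
    """Decode %XX escapes only when they stand for an unreserved character."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        # Escapes of reserved or other characters (such as %2F) are kept as written.
        return char if char in UNRESERVED else match.group(0)
    return re.sub(r"%([0-9A-Fa-f]{2})", replace, path)

print(decode_unreserved("/%66oo/a%2fb"))  # /foo/a%2fb
```

Under this reading, "%66oo" and "foo" name the same thing, while "a%2fb" stays one segment and is not turned into "a/b".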

Recommended answer

From the data you gathered, I would tend to say that an encoded "/" in a URI is meant to be seen as a "/" again at the application/CGI level.

That is to say, if you are using Apache with mod_rewrite, for instance, a pattern that expects slashes will not match a URI that contains encoded slashes. However, once the appropriate module or CGI program is called to handle the request, it is up to that handler to do the decoding and, for instance, retrieve a parameter that includes slashes as the first component of the URI.

If your application is then using this data to retrieve a file (whose filename contains a slash), that's probably a bad thing.
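
As a hedged illustration of that last caution, a handler that maps a decoded segment to a file name could refuse any value that contains a path separator after decoding. The function name `safe_filename` is hypothetical, and this Python sketch covers only the simplest case, not a complete defense against path traversal.

```python
from urllib.parse import unquote

def safe_filename(encoded_segment):
    """Decode one URL path segment and reject it if it cannot be a plain file name."""
    name = unquote(encoded_segment)
    # After decoding, %2F has become "/", so the check must run on the decoded value.
    if "/" in name or "\\" in name or name in ("", ".", ".."):
        raise ValueError(f"refusing to use {name!r} as a file name")
    return name

print(safe_filename("report.txt"))  # report.txt
```

A segment such as "..%2F..%2Fetc%2Fpasswd" decodes to a value with slashes in it and is rejected by the check.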

To sum up, I find it perfectly normal to see "/" and "%2F" behave differently, since they are interpreted at different levels.
