Is a slash ("/") equivalent to an encoded slash ("%2F") in the path portion of an HTTP URL?


Question



I have a site that treats "/" and "%2F" in the path portion (not the query string) of a URL differently. Is this a bad thing to do according to either the RFC or the real world?

I ask because I keep running into little surprises with the web framework I'm using (Ruby on Rails) as well as the layers below it (Passenger and Apache; for example, I had to enable AllowEncodedSlashes for Apache). I am now leaning toward getting rid of the encoded slashes completely, but I wonder whether I should be filing bug reports where I see weird behavior involving the encoded slashes.

As to why I have the encoded slashes in the first place, basically I have routes such as this:

:controller/:foo/:bar

where :foo is something like a path that can contain slashes. I thought the most straightforward thing to do would be to URL-escape :foo so that its slashes are ignored by the routing mechanism. Now I am having doubts, and it's pretty clear that the frameworks don't really support this, but according to the RFCs, is it wrong to do it this way?
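To make the approach concrete, here is a minimal Python sketch (not Rails code) of escaping a slash-containing value before it goes into the route. The helper name `build_path` and the sample values are made up for illustration. `urllib.parse.quote` is the standard-library call, and it leaves "/" unescaped unless `safe=""` is passed.

```python
from urllib.parse import quote


def build_path(controller: str, foo: str, bar: str) -> str:
    """Join three route parameters, percent-encoding each one separately.

    Hypothetical helper for illustration only.
    """
    # safe="" forces "/" inside a parameter to become %2F;
    # with the default safe="/", quote() would leave it untouched.
    return "/" + "/".join(quote(part, safe="") for part in (controller, foo, bar))


print(quote("docs/2009/report"))                        # docs/2009/report
print(build_path("files", "docs/2009/report", "show"))  # /files/docs%2F2009%2Freport/show
```

The encoded form keeps exactly three literal slashes as segment separators, which is what the route pattern relies on.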

Here is some information I have gathered:

RFC 1738 (URLs):

Usually a URL has the same interpretation when an octet is represented by a character and when it encoded. However, this is not true for reserved characters: encoding a character reserved for a particular scheme may change the semantics of a URL.

RFC 2396 (URIs):

These characters are called "reserved", since their usage within the URI component is limited to their reserved purpose. If the data for a URI component would conflict with the reserved purpose, then the conflicting data must be escaped before forming the URI.

(Does escaping here mean something other than encoding the reserved character?)

RFC 2616 (HTTP/1.1):

Characters other than those in the "reserved" and "unsafe" sets (see RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

There is also this bug report for Rails, where they seem to expect the encoded slash to behave differently:

Right, I'd expect different results because they're pointing at different resources.

It's looking for the literal file 'foo/bar' in the root directory. The non escaped version is looking for the file bar within directory foo.

It's clear from the RFCs that the raw and encoded forms are equivalent for unreserved characters, but what is the story for reserved characters?
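One way to express that distinction in code is a comparison that decodes escapes only for unreserved characters and leaves everything else encoded. The sketch below is an assumption-laden illustration: it uses the unreserved set from RFC 3986 (letters, digits, "-", ".", "_", "~"), which is narrower than the one in RFC 2396, and the names `normalize_escapes` and `same_path` are hypothetical.

```python
import re
import string

# Assumption: unreserved set as defined in RFC 3986, section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def normalize_escapes(path: str) -> str:
    """Decode %XX only when it stands for an unreserved character."""

    def fix(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char  # equivalent to its encoded form, so decode it
        return "%" + match.group(1).upper()  # keep encoded, uppercase the hex digits

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, path)


def same_path(a: str, b: str) -> bool:
    return normalize_escapes(a) == normalize_escapes(b)


print(same_path("/a%7Eb", "/a~b"))           # True: "~" is unreserved
print(same_path("/foo%2Fbar", "/foo/bar"))   # False: "/" is reserved
print(same_path("/foo%2fbar", "/foo%2Fbar")) # True: only the hex case differs
```

Under this comparison "%2F" is never folded into "/", so the two URLs stay distinct.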

Solution

From the data you gathered, I would tend to say that an encoded "/" in a URI is meant to be seen as "/" again at the application/CGI level.

That is to say, if you're using Apache with mod_rewrite, for instance, a pattern expecting slashes will not match a URI that has encoded slashes in it. However, once the appropriate module/CGI/... is called to handle the request, it's up to that handler to do the decoding and, for instance, retrieve a parameter including slashes as the first component of the URI.
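The two levels can be sketched roughly as follows. This is a generic Python illustration, not how mod_rewrite or any particular framework is implemented. The `ROUTE` pattern and the `route` function are hypothetical.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern: exactly three segments separated by literal slashes.
ROUTE = re.compile(r"^/([^/]+)/([^/]+)/([^/]+)$")


def route(raw_path: str):
    """Match against the raw (still encoded) path, then decode each capture."""
    match = ROUTE.match(raw_path)
    if match is None:
        return None
    # Decoding happens only after the segment boundaries are fixed.
    return tuple(unquote(group) for group in match.groups())


print(route("/files/docs%2Freport/show"))  # ('files', 'docs/report', 'show')
print(route("/files/docs/report/show"))    # None: four segments, no match
```

The matcher only ever sees "%2F" as ordinary characters inside a segment. The slash reappears once the handler decodes its parameters.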

If your application is then using this data to retrieve a file (whose filename contains a slash), that's probably a bad thing.
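One reason for caution: after decoding, the slashes act as directory separators, and a value such as "..%2F.." could point outside the intended directory. A minimal sketch of a guard, assuming Python 3.9+ for `Path.is_relative_to`; the `resolve_under` helper and the "uploads" directory are hypothetical:

```python
from pathlib import Path
from urllib.parse import unquote


def resolve_under(base: Path, encoded_segment: str) -> Path:
    """Decode a path parameter and refuse results outside the base directory."""
    base = base.resolve()
    candidate = (base / unquote(encoded_segment)).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("decoded path escapes the base directory")
    return candidate


uploads = Path("uploads")  # hypothetical base directory, need not exist
print(resolve_under(uploads, "docs%2Freport.txt"))
try:
    resolve_under(uploads, "..%2F..%2Fetc%2Fpasswd")
except ValueError as error:
    print("rejected:", error)
```

This checks the location only after decoding and resolving, which is the level at which the encoded slash regains its meaning.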

To sum up, I find it perfectly normal to see a difference in behaviour between "/" and "%2F", as they are interpreted at different levels.
