RFC3986, backslash in URI/URLs


Problem description




[Sorry, there isn't a newsgroup for discussing URLs as such - this
seemed a reasonably on-topic place to discuss it...?]

The story so far: on a somewhat unrelated newsgroup, my attention
fell upon the URL:
http://www.speedtouchdsl.com/prod706.htm
which contains a link to the purported URL:
http://www.speedtouchdsl.com/pdf\dat...06WL-780WL.pdf

Comparing the latter with other URLs in that area, it appeared that
the "\" was a probable blunder for "/". However, since their web
server is IIS, it appears that their server silently fixes-up this
blunder[1], and delivers the intended content. My recollection of
RFC1738 was that an unencoded "\" ought not to appear in a URL, so I
was initially inclined to rate this URL as broken...

However, this then led me down the trail of RFC2396, which 'updates
and merges "Uniform Resource Locators" [RFC1738] and "Relative Uniform
Resource Locators" [RFC1808]', and RFC3986, which "obsoletes rfc 1808
and updates rfc 1738".

In RFC2396 2.4.3, the backslash is listed under "Excluded US-ASCII
characters", under the subcategory of "unwise", with the "must"
requirement:

|Data corresponding to excluded characters must be escaped in order to
|be properly represented within a URI.
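
As a small illustration of the escaping requirement quoted above, the sketch below percent-encodes a backslash in a path with Python's standard `urllib.parse.quote`. The path string is a made-up example, not the elided URL from this post.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing a backslash (not the real, elided URL).
raw_path = "/pdf\\example.pdf"

# quote() leaves "/" alone by default and percent-encodes the backslash.
escaped = quote(raw_path)
print(escaped)           # /pdf%5Cexample.pdf

# Decoding restores the original character.
print(unquote(escaped))  # /pdf\example.pdf
```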

So far, so good.

But in RFC3986, this character "\" seems to have been stealthily
dropped from the list of characters needing to be escaped. I find no
mention of this change in Appendix D, "Changes from RFC2396".
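
For reference, RFC 3986 organizes its character rules around the reserved and unreserved sets of sections 2.2 and 2.3. The following is a minimal sketch that classifies a character against those sets; the `classify` helper is hypothetical and exists only for illustration.

```python
import string

# Character sets as defined in RFC 3986, sections 2.2 and 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")


def classify(ch: str) -> str:
    # Return the RFC 3986 character class of a single character.
    if ch in UNRESERVED:
        return "unreserved"
    if ch in GEN_DELIMS or ch in SUB_DELIMS:
        return "reserved"
    if ch == "%":
        return "percent-encoding marker"
    return "outside the grammar"


for ch in ["/", "~", "\\"]:
    print(repr(ch), "->", classify(ch))
```

The backslash lands in neither set. That is one possible reading under which it still may not appear unencoded, though the sketch says nothing about what the RFC's authors intended.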

The only substantive mention of "\" which I can find is in section 7.3
under the main heading of "7. Security Considerations":

|Special care should be taken when the URI path interpretation process
| involves the use of a back-end file system or related system
| functions. File systems typically assign an operational meaning to
| special characters, such as the "/", "\", ":", "[", and "]"
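
One way to picture the concern in that passage is to normalize the same path under POSIX and Windows rules, using Python's standard `posixpath` and `ntpath` modules. This is only a sketch; the path is a hypothetical example, not taken from the site under discussion.

```python
import ntpath
import posixpath

# Hypothetical path taken from a URL after decoding; not from the post.
url_path = "docs/..\\..\\secret.txt"

# POSIX rules: "\" is an ordinary character, so there is no ".." segment.
posix_parts = posixpath.normpath(url_path).split("/")
print(posix_parts)

# Windows rules: "\" is a separator, so ".." segments appear and one
# of them survives normalization, pointing above the starting directory.
nt_parts = ntpath.normpath(url_path).split("\\")
print(nt_parts)
```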

Aside from this potential security exposure, it appears to me that the
cited URL, which I would like to have categorised as defective, would
be rated as OK by this latest RFC. And since the server returns the
desired resource when this misbegotten URL is presented, I can't even
rate it as a blunder - can I?

Any suggestions why this apparently risky, and IMHO undesirable,
change was smuggled into the RFC without mentioning it in the changes?

In http://lists.w3.org/Archives/Public/...5May/0004.html ,
I found a top-posted "answer" which is resolutely ignoring the
bottom-quoted question -

| > Shouldn't backslash itself be included in the must-be-escaped
| > list?

Shouldn't it?

regards

[1] Of course, this isn't a situation that I meet in my own
serveradmin-ing using Apache. If the author codes "\" instead of "/"
in a URL, and attempts to follow the link with a www-conforming
browser, the link does not work. If they use IE instead, however, it
appears that it silently fixes-up the error on the *client* side. It
seems from my tests that IE6 makes no attempt to access the cited URL
directly - it replaces the "\" by "/" before even trying (whereas
Mozilla replaces the "\" by "%5C", after which, Apache, he say "no").

So it looks as if MS give themselves two bites at this fuxup: once in
their browser-like object, and once in their web server.

(Another reason why authors are misguided if they use MS software as
their only test of their web pages. But I digress.)
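
The two client behaviours described in this footnote can be sketched as simple string rewrites. Both function names and the example link are hypothetical; the sketch only mirrors what the footnote reports and is not a model of either browser's actual code.

```python
from urllib.parse import quote


def lenient_fixup(url: str) -> str:
    # Behaviour the footnote attributes to IE6: backslash becomes "/".
    return url.replace("\\", "/")


def strict_encode(url: str) -> str:
    # Behaviour the footnote attributes to Mozilla: backslash becomes "%5C".
    return url.replace("\\", quote("\\"))


link = "http://example.com/pdf\\file.pdf"  # hypothetical link
print(lenient_fixup(link))  # http://example.com/pdf/file.pdf
print(strict_encode(link))  # http://example.com/pdf%5Cfile.pdf
```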

--

Solution

Alan J. Flavell inquired:


In http://lists.w3.org/Archives/Public/...5May/0004.html ,
I found a top-posted "answer" which is resolutely ignoring the
bottom-quoted question -

| > Shouldn't backslash itself be included in the must-be-escaped
| > list?

Shouldn't it?

I think it should. I think it's rather obvious[1]; regardless of
language, an escape character should always itself be escaped if it is
to take its "normal" value in some expression.

It seems from my tests that IE6 makes no attempt to access the cited
URL directly - it replaces the "\" by "/" before even trying



Yes, IE mangles URLs from the address-bar in several ways before sending
them off over the interweb.

--
Jack.

[1] The expressions "obviously..." and "it's obvious that..." are
frequently encountered when the author is about to perpetrate some
inadvertent fallacy.


Jack wrote:

Alan J. Flavell inquired:


In http://lists.w3.org/Archives/Public/...5May/0004.html , I
found a top-posted "answer" which is resolutely ignoring the
bottom-quoted question -

| > Shouldn't backslash itself be included in the must-be-escaped
| > list?

Shouldn't it?



I think it should. I think it's rather obvious[1]; regardless of
language, an escape character should always itself be escaped if it is
to take its "normal" value in some expression.



How can you mean "regardless of language", given that which characters
are escape characters depends entirely on what language is in use?


Alan J. Flavell wrote:

The only substantive mention of "\" which I can find is in section 7.3
under the main heading of "7. Security Considerations":

|Special care should be taken when the URI path interpretation process
| involves the use of a back-end file system or related system
| functions. File systems typically assign an operational meaning to
| special characters, such as the "/", "\", ":", "[", and "]"

Aside from this potential security exposure, it appears to me that the
cited URL, which I would like to have categorised as defective, would
be rated as OK by this latest RFC. And since the server returns the
desired resource when this misbegotten URL is presented, I can't even
rate it as a blunder - can I?

Any suggestions why this apparently risky, and IMHO undesirable,
change was smuggled into the RFC without mentioning it in the changes?



May I ask what the source of risk is? You mention backslash being
the path delimiter on a back-end file system, but that can't be the
problem, since the forward slash is the path delimiter on other file
systems, and the interpretation of the forward slash in URIs as a path
delimiter doesn't create risk on that account.

