How to decode a reserved escape character in a request URI on a web server?

Views: 94 Published: 2020/5/25 0:43:59 Tags: parsing url escaping uri percent-encoding


Question


It is pretty clear that a web server has to decode any escaped unreserved character (such as alphanums, etc.) to do the URI comparison. For example, http://www.example.com/~user/index.htm shall be identical to http://www.example.com/%7Euser/index.htm.

My question is, what are we gonna do with the escaped reserved characters?

An example would be %2F, or /. If there is a %2F in the request URI, should the web server's parser replace it with a /? In the above example, would that mean http://www.example.com/~user%2Findex.htm is the same as http://www.example.com/~user/index.htm? I tried it on an Apache server (2.2.17, Unix), though, and it gives a "404 Not Found" error.

So does that mean %2F and other escaped reserved characters shall be left alone (at least before the URI comparison)?

Background information:

There are two places in RFC 2616 (HTTP 1.1) mentioning the escape decoding issue:

The Request-URI is transmitted in the format specified in section 3.2.1. If the Request-URI is encoded using the "% HEX HEX" encoding [42], the origin server MUST decode the Request-URI in order to properly interpret the request. Servers SHOULD respond to invalid Request-URIs with an appropriate status code.

and

Characters other than those in the "reserved" and "unsafe" sets (see RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

(according to http://trac.tools.ietf.org/wg/httpbis/trac/ticket/2 "unsafe" is a mistake and shall be removed from the spec. So we are only looking at "reserved" here.)

FYI, the definition of such characters in RFC 2396:

reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","

unreserved = alphanum | mark

mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

Answer

tl;dr:

Decode percent-encoded unreserved characters,
keep percent-encoded reserved characters.

The URI standard is STD 66, which currently is RFC 3986.

Section 6 is about Normalization and Comparison, where section 6.2.2.2 explains what to do with percent-encoded octets:

These URIs should be normalized by decoding any percent-encoded octet that corresponds to an unreserved character […]
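
To make the rule concrete, here is a minimal Python sketch of that normalization step. The function name normalize_percent_encoding is hypothetical, and the unreserved set is the RFC 3986 one (ALPHA, DIGIT, "-", ".", "_", "~"), which is narrower than the RFC 2396 set quoted in the question. Uppercasing the hex digits of escapes that are kept is an extra step taken from RFC 3986 section 6.2.2.1, not from the quoted sentence.

```python
import re
import string

# Unreserved characters per RFC 3986 section 2.3 (assumption: RFC 3986, not RFC 2396).
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# Matches one percent-encoded octet such as %7E or %2f.
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_percent_encoding(uri: str) -> str:
    """Decode escapes of unreserved characters; keep every other escape."""
    def fix(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char                       # e.g. %7E becomes ~
        return "%" + match.group(1).upper()   # e.g. %2f stays escaped, as %2F
    return _PCT.sub(fix, uri)


print(normalize_percent_encoding("http://www.example.com/%7Euser/index.htm"))
print(normalize_percent_encoding("http://www.example.com/~user%2findex.htm"))
```

With this sketch the first URI comes out with a literal ~, while the %2F in the second one is left escaped.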

As explicitly stated in section 2 (bold emphasis mine):

Unreserved characters:

URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent
Reserved characters:

URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent.
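
One way to see why the two cases differ is to split the path on literal slashes before decoding anything. The sketch below uses only urlsplit and unquote from Python's standard urllib.parse; the helper name path_segments is hypothetical, and this is an illustration of the distinction rather than a description of how any particular server parses requests.

```python
from urllib.parse import urlsplit, unquote


def path_segments(uri: str) -> list[str]:
    """Split the raw path on literal "/" first, then decode each segment."""
    raw_path = urlsplit(uri).path          # urlsplit leaves escapes untouched
    return [unquote(segment) for segment in raw_path.split("/")]


plain = path_segments("http://www.example.com/~user/index.htm")
tilde = path_segments("http://www.example.com/%7Euser/index.htm")
slash = path_segments("http://www.example.com/~user%2Findex.htm")

print(plain == tilde)   # escaped unreserved character: same segments
print(plain == slash)   # escaped reserved character: different segments
print(slash)
```

Under these assumptions the %7E form yields the same two segments as the plain URI, whereas the %2F form yields a single segment whose name happens to contain a slash, so the two URIs name different paths.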
