Semicolon as URL query separator


Question


Although it is strongly recommended (W3C source, via Wikipedia) that web servers support the semicolon as a separator of URL query items (in addition to the ampersand), the recommendation does not seem to be widely followed.

For example, compare the results of

        http://www.google.com/search?q=nemo&oe=utf-8

        http://www.google.com/search?q=nemo;oe=utf-8

(In the latter case, the semicolon is, or was at the time of writing, treated as an ordinary string character, as if the URL were: http://www.google.com/search?q=nemo%3Boe=utf-8)

However, the first URL parsing library I tried behaves well:

>>> # Python 2; in Python 3 these functions live in urllib.parse
>>> from urlparse import urlparse, parse_qs
>>> url = 'http://www.google.com/search?q=nemo;oe=utf-8'
>>> parse_qs(urlparse(url).query)
{'q': ['nemo'], 'oe': ['utf-8']}
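
For readers on a current interpreter, here is a rough Python 3 counterpart of the session above. It assumes Python 3.10 or later (or a patched release of an earlier branch), where `urllib.parse.parse_qs` takes a `separator` argument that defaults to `&`. The variable names are illustrative only.

```python
from urllib.parse import urlparse, parse_qs

url = "http://www.google.com/search?q=nemo;oe=utf-8"
query = urlparse(url).query

# With the default separator ("&"), the semicolon stays inside the value of q.
default_result = parse_qs(query)

# Passing separator=";" opts in to splitting the query on semicolons instead.
semicolon_result = parse_qs(query, separator=";")

print(default_result)
print(semicolon_result)
```

On such a version the first call should yield a single parameter `q`, and only the second call should produce separate `q` and `oe` entries. Code that relied on the older implicit semicolon splitting has to request it explicitly.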

What is the current status of accepting the semicolon as a separator, and what are the potential issues or other points worth noting (from both the server's and the client's point of view)?

Solution

The W3C Recommendation from 1999 is obsolete. According to the 2014 W3C Recommendation, the semicolon is now illegal as a parameter separator:

To decode application/x-www-form-urlencoded payloads, the following algorithm should be used. [...] The output of this algorithm is a sorted list of name-value pairs. [...]

  1. Let strings be the result of strictly splitting the string payload on U+0026 AMPERSAND characters (&).

In other words, ?foo=bar;baz means the parameter foo will have the value bar;baz; whereas ?foo=bar;baz=sna should result in foo being bar;baz=sna (although technically illegal since the second = should be escaped to %3D).
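
The splitting step quoted above can be sketched in a few lines of Python. This is a simplified illustration, not the full algorithm from the Recommendation: the function name `decode_form_urlencoded` is hypothetical, and skipping empty pieces and decoding with `unquote_plus` are assumptions made for the sketch.

```python
from urllib.parse import unquote_plus

def decode_form_urlencoded(payload):
    """Split strictly on '&', then split each piece on its first '='."""
    pairs = []
    for piece in payload.split("&"):
        if not piece:
            # Assumption: ignore empty pieces such as those in "a=1&&b=2".
            continue
        # partition() splits on the first "=" only, so later "=" characters
        # and any ";" remain part of the value.
        name, _, value = piece.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs

print(decode_form_urlencoded("foo=bar;baz"))
print(decode_form_urlencoded("foo=bar;baz=sna"))
print(decode_form_urlencoded("q=nemo&oe=utf-8"))
```

Because only `&` separates parameters, the first two inputs each yield a single `foo` parameter whose value keeps the semicolon, matching the reading given above.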
