Is a Unicode user agent legal inside an HTTP header?
Question
An application I'm maintaining loads user agents extracted from web logs into a MySQL table column that uses the 'latin1' charset. Occasionally, it fails to load a user agent that looks like this:
Mozilla/5.0 (I?? CPU iPhone OS 5_0_1 like Mac OS X) AppleWebKit/534.46 (KHTML^C like Gecko) Version
I suspect it's choking on the Iâ? sequence. I'm trying to figure out whether this should be supported, or whether it's corruption introduced by the upstream logging system. Is this a legal user agent in an HTTP header?
Recommended answer
RFC 2616 (HTTP/1.1) says that message header contents must consist "of either *TEXT or combinations of token, separators, and quoted-string". If you look at the definitions of TEXT and the related rules, you will find that the legal characters are those whose byte values are not in the range [0, 31] and not equal to 127, that is, anything except control characters. Therefore, as far as I can tell, characters such as â are legal according to the spec.
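The byte-range rule above can be sketched as a small check. This is only an illustration, not part of the original answer: the function name is_rfc2616_text is hypothetical, and it assumes the header value has already been unfolded, so horizontal tab is accepted as white space while CR and LF are rejected along with the other control characters.

```python
def is_rfc2616_text(value: bytes) -> bool:
    """Return True if every octet is allowed by the RFC 2616 TEXT rule.

    Assumption: the value is a single unfolded header line, so HT (0x09)
    is accepted as linear white space but CR and LF are not.
    """
    for octet in value:
        if octet == 0x09:  # horizontal tab, allowed as white space
            continue
        if octet < 0x20 or octet == 0x7F:  # control characters (0-31) and DEL (127)
            return False
    return True


# A high byte such as 0xE2 ("â" in Latin-1) is not a control character.
with_high_byte = "Mozilla/5.0 (I\u00e2 CPU iPhone OS 5_0_1 like Mac OS X)".encode("latin-1")
print(is_rfc2616_text(with_high_byte))

# Hypothetical value containing a literal control character (0x03).
with_control = b"AppleWebKit/534.46 (KHTML\x03 like Gecko)"
print(is_rfc2616_text(with_control))
```

A check like this only says whether the bytes are permitted in a header value. It does not say which character encoding the client intended, so a value can pass and still be the product of upstream corruption.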