What character encoding should I use for a HTTP header?

Question

I'm using a "fun" HTML special-character (✰)(see http://html5boilerplate.com/ for more info) for a Server HTTP-header and am wondering if it is "allowed" per spec.

  • Using the Network tab in Chrome's dev tools on Windows XP Pro SP3, I see the ✰ just fine.

  • In IE8 the ✰ is not rendered correctly.

  • The w3.org HTML validator does not render it correctly (it displays "â°" instead).

Now, I'm not too keen on character encodings ... and frankly I don't really care too much about them; I just blindly use UTF-8 cus I'm told to. :-)

Are the discrepancies caused by bugs in the different parsers/browsers/engines (or whatever they are called)?

Is there a spec for this, or perhaps a list of allowed characters for HTTP header values?

Recommended answer

In short: Only ASCII is guaranteed to work. Some non-ASCII bytes are allowed for backwards compatibility, but are not supposed to be displayable.

HTTPbis gave up and specified that in the headers there is no useful encoding besides ASCII:

Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [ISO-8859-1], supporting other charsets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.
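A small Python sketch may make the quoted rule concrete. It is only an illustration: the helper name `is_ascii_header_value` is made up here, and treating "visible US-ASCII plus space and tab" as the safe set is a simplification of the field-value grammar. It also shows what a recipient that still assumes ISO-8859-1 would make of the UTF-8 bytes of ✰, which is one plausible explanation for the garbled output the question describes.

```python
STAR = "\u2730"  # the ✰ character from the question


def is_ascii_header_value(value: str) -> bool:
    """Hypothetical helper: True if value uses only visible US-ASCII, space or tab."""
    return all(ch == "\t" or " " <= ch <= "~" for ch in value)


# The three octets a UTF-8 server would put on the wire for the star.
wire_bytes = STAR.encode("utf-8")

# How a recipient that assumes ISO-8859-1 would interpret those octets:
# three unrelated characters, the middle one an invisible C1 control code.
as_latin1 = wire_bytes.decode("latin-1")

print(is_ascii_header_value("Apache"))  # True
print(is_ascii_header_value(STAR))      # False
print(wire_bytes.hex())                 # e29cb0
print([hex(ord(ch)) for ch in as_latin1])
```

A check like this could run before a header is emitted, so that non-ASCII values are rejected or replaced instead of being left to each client's guesswork.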


Previously, RFC 2616 from 1999 defined this:

Words of *TEXT MAY contain characters from character sets other than ISO-8859-1 [22] only when encoded according to the rules of RFC 2047 [14].

and RFC 2047 is the MIME encoding, so it'd be:

=?UTF-8?Q?=E2=9C=B0?=

but I don't think that many (if any) clients support it.
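For reference, the encoded-word above can be reproduced and decoded with a few lines of Python. This is a rough sketch, not a full RFC 2047 implementation: `encode_q` and `decode_words` are hypothetical helper names, `encode_q` escapes every octet as `=XX` (which the "Q" encoding permits) and ignores the 75-character limit on encoded-words. Decoding relies on the standard library's `email.header.decode_header`.

```python
from email.header import decode_header


def encode_q(text: str, charset: str = "UTF-8") -> str:
    """Hypothetical helper: wrap text in an RFC 2047 "Q" encoded-word.

    Every octet is written as =XX, which is always valid but not compact.
    """
    raw = text.encode(charset)
    body = "".join(f"={byte:02X}" for byte in raw)
    return f"=?{charset}?Q?{body}?="


def decode_words(value: str) -> str:
    """Hypothetical helper: decode any RFC 2047 encoded-words in value."""
    pieces = []
    for part, charset in decode_header(value):
        if isinstance(part, bytes):
            pieces.append(part.decode(charset or "ascii"))
        else:
            pieces.append(part)
    return "".join(pieces)


encoded = encode_q("\u2730")
print(encoded)                         # =?UTF-8?Q?=E2=9C=B0?=
print(decode_words(encoded) == "\u2730")  # True
```

The result is pure ASCII on the wire, so it stays within the safe set, but it only helps if the receiving client bothers to decode it.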
