For HTTP responses with Content-Types suggesting character data, which charset should be assumed by the client if none is specified?


Question

If no charset parameter is specified in the Content-Type header, RFC 2616 section 3.7.1 seems to imply that ISO-8859-1 should be assumed for media types of type "text":

When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP.

Data in character sets other than "ISO-8859-1" or its subsets MUST be labeled with an appropriate charset value.

However, I routinely see applications that serve JavaScript files with Content-Type values like "application/x-javascript" (i.e. no charset parameter), even when these scripts contain non-ASCII UTF-8 characters, which would be corrupted if interpreted as ISO-8859-1.
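To make the concern concrete, here is a minimal Python sketch of what a client that followed RFC 2616 section 3.7.1 to the letter might do. The function name `charset_per_rfc2616` is hypothetical and only illustrates the rule quoted above; it is not taken from any browser.

```python
from email.message import Message
from typing import Optional


def charset_per_rfc2616(content_type: str) -> Optional[str]:
    """Charset a strict RFC 2616 client would assume (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    explicit = msg.get_param("charset")
    if explicit:
        # An explicit charset parameter always wins.
        return str(explicit).lower()
    if msg.get_content_maintype() == "text":
        # RFC 2616 section 3.7.1: text/* defaults to ISO-8859-1 over HTTP.
        return "iso-8859-1"
    # The RFC default only covers text/*; application/* has no stated default.
    return None


# UTF-8 bytes misread as ISO-8859-1 turn each non-ASCII character into mojibake.
utf8_body = "café".encode("utf-8")
misread = utf8_body.decode("iso-8859-1")  # "cafÃ©"
```

Note that for "application/x-javascript" the sketch returns `None`, since the quoted rule only speaks about the "text" type.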

This does not seem to pose problems to clients. How do clients know to interpret the bytes as UTF-8? Is there a rule for other character-data subtypes that implies UTF-8 should be the default? Where is this documented?

Answer

All the major browsers I've checked (IE, Firefox and Opera) completely ignore this part of the RFC.

If you are interested in an algorithm for auto-detecting the charset from the data, see the Mozilla Firefox link below.

A small note about content types: only text types carry a character set. It's reasonable to assume that browsers handle application/x-javascript the same way they handle text/javascript (except IE6, but that's another subject).

Internet Explorer falls back to the user's default charset (probably stored in the registry), as documented:

By default, Internet Explorer uses the character set specified in the HTTP content type returned by the server to determine this translation. If this parameter is not given, Internet Explorer uses the character set specified by the meta element in the document. It uses the user's preferences if no meta element is specified.

Source: http://msdn.microsoft.com/en-us/library/ms537500%28VS.85%29.aspx
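The fallback order described in that quote (HTTP header, then meta element, then user preference) can be sketched roughly as follows. This is only an illustration of the order, not Internet Explorer's actual code; the function name, the regular expression, the 1024-byte scan window and the `windows-1252` default are all assumptions.

```python
import re
from typing import Optional

# Assumed, simplified pattern for a charset declared in a <meta> element.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def pick_charset(
    header_charset: Optional[str],
    body: bytes,
    user_default: str = "windows-1252",  # stand-in for the user's preference
) -> str:
    """Choose a charset using the header -> meta -> user preference order."""
    if header_charset:
        # 1. The charset parameter from the HTTP Content-Type header.
        return header_charset.lower()
    match = META_CHARSET.search(body[:1024])
    if match:
        # 2. The charset declared by a meta element near the top of the document.
        return match.group(1).decode("ascii").lower()
    # 3. Nothing declared anywhere: fall back to the user's preference.
    return user_default
```

A plain script file has no meta element, so under this order a JavaScript response without a charset parameter ends up with the user's default.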

Mozilla Firefox attempts to auto-detect the charset, as described here:

This paper presents three types of auto-detection methods to determine encodings of documents without explicit charset declaration.

Source: http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html

Opera uses auto-detection too, as documented:

If the transport protocol provides an encoding name, that is used. If not, Opera will look at the page for a charset declaration. If this is missing, Opera will attempt to auto-detect the encoding, using the domain name to see if the script is a CJK script, and if so which one. Opera can also auto-detect UTF-8.

Source: http://www.opera.com/docs/specs/opera9/
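As a rough idea of why UTF-8 in particular is easy to auto-detect: non-ASCII bytes that happen to form valid UTF-8 are very unlikely to be text in a legacy single-byte encoding. The sketch below is a simplified stand-in, assuming a strict decode attempt is good enough; it is not the Mozilla or Opera algorithm, and the function names and the `iso-8859-1` fallback are hypothetical.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")  # strict mode raises on any malformed sequence
    except UnicodeDecodeError:
        return False
    return True


def decode_with_fallback(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode as UTF-8 when the bytes validate, otherwise use the fallback."""
    if looks_like_utf8(data):
        return data.decode("utf-8")
    return data.decode(fallback)
```

Pure ASCII input also validates as UTF-8, which is harmless because ASCII decodes identically in both encodings.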
