Why must I specify charset attributes for my <script> tags?

Question

I have a strange situation:

  1. Main HTML page is served in UTF-16 character set (due to some requirements out-of-scope for this question)
  2. HTML page uses <script> tags to load external scripts (i.e. they have src attributes)
  3. Those external scripts are in US-ASCII/UTF-8
  4. The web server is serving the scripts with the content-type "application/javascript" with no character set hints
  5. The scripts have no byte-order-mark (BOM)

When loading the page described above, both Firefox and Chrome (current versions) throw errors saying that the first character of each script file is invalid.

Looking at the "Network" tabs of the respective dev-tools views shows the files are just fine (they render in the previewer just fine).

My conclusion was that the browsers are becoming confused as to what the encoding should be for "the whole page" or some similar foolishness.

So I tried adding a charset="UTF-8" attribute to the <script> tags and that seems to solve the problem.

But I really shouldn't have to do that, should I?

First of all, the server is telling the client what the document's type is. It's application/javascript and doesn't specify a character set. (Indeed, the RFC says that charset is only applicable to text/* MIME-types). Okay, I can understand why there might be some ambiguity there.

But the document-type is javascript, and there are some obvious rules for how to handle a javascript file whose actual charset you don't know. For example, if it's got a BOM, then use it. If there isn't any BOM, it should be really easy to tell UTF-16 from UTF-8. (Note that there doesn't seem to be any problem on these same pages with loading CSS files, which are also in the same situation as the scripts.)
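The "if it's got a BOM, then use it" rule is simple to state in code. This is only a sketch of the check: `sniffBom` is a made-up name, not a browser API, and it covers just the three BOMs relevant here.

```javascript
// Returns the encoding named by a leading byte-order mark, or null if there is none.
// Hypothetical helper for illustration; not taken from any browser's source.
function sniffBom(bytes) {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "utf-8";
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return "utf-16be";
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return "utf-16le";
  }
  return null;
}

// A UTF-8 file that starts with a BOM, followed by the bytes of "var".
const withBom = new Uint8Array([0xef, 0xbb, 0xbf, 0x76, 0x61, 0x72]);

// A plain ASCII script with no BOM, like the ones in the question.
const plainAscii = new TextEncoder().encode("var x = 1;");

console.log(sniffBom(withBom));    // "utf-8"
console.log(sniffBom(plainAscii)); // null
```

When the result is `null`, as it is for the scripts in the question, the BOM settles nothing and the encoding has to come from somewhere else.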

Lastly, the enclosing page shouldn't have to know what the encodings of its dependencies are. In fact, it might be impossible for it to know, and explicitly specifying the charset then tightly couples the page to its dependencies and vice-versa.

Is there a way to get the browser to correctly detect the character set of these dependencies without specifying the charset in the page itself?

Recommended answer

Without a BOM in the file, or an explicit charset in the <script> or Content-Type for the file, the encoding of the file is ambiguous. The browser might assume UTF-8 (and should, per RFC 4329), but if the script contains any non-ASCII characters that are not actually encoded in UTF-8, the file won't process properly.

However, HTML 5 Section 4.11 dictates that a <script>'s fallback encoding is the document's encoding if the <script> does not have a charset attribute. The fallback takes effect if there is no BOM or charset to specify the file's actual encoding.
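To make the fallback concrete, here is a simplified model of the precedence described above, run against the setup from the question. `resolveScriptEncoding` and its parameter names are invented for illustration, and little-endian UTF-16 is an assumption; this is not the spec's full algorithm.

```javascript
// Simplified model: BOM, then Content-Type charset, then the <script> charset
// attribute, then the document's encoding. All names here are hypothetical.
function resolveScriptEncoding({
  bom = null,
  contentTypeCharset = null,
  charsetAttr = null,
  documentEncoding,
}) {
  return bom ?? contentTypeCharset ?? charsetAttr ?? documentEncoding;
}

// An ASCII script, as served in the question: no BOM, no charset hint.
const scriptBytes = new TextEncoder().encode("var x = 1;");

// No hints at all, so the UTF-16 document encoding is used (byte order assumed LE).
const fallback = resolveScriptEncoding({ documentEncoding: "utf-16le" });
const misdecoded = new TextDecoder(fallback).decode(scriptBytes);

// With charset="UTF-8" on the <script> tag, the attribute wins over the document encoding.
const fixed = resolveScriptEncoding({ charsetAttr: "utf-8", documentEncoding: "utf-16le" });
const decoded = new TextDecoder(fixed).decode(scriptBytes);

console.log(misdecoded.length, misdecoded.charCodeAt(0).toString(16)); // 5 6176
console.log(decoded); // var x = 1;
```

Decoded as UTF-16, each pair of ASCII bytes collapses into one unrelated code unit, so the ten-byte script becomes five characters beginning with U+6176. That matches the "first character is invalid" errors in the question.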

So, either make sure your HTML and JS files are always using the same encoding, or else you have to be explicit about the JS file's charset, one way or the other.
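One way to be explicit without touching the page is to have the server add a charset parameter to the script's Content-Type (RFC 4329 defines an optional charset parameter for application/javascript). A minimal sketch for a Node-style request handler follows; `serveScript` and `SCRIPT_SOURCE` are hypothetical names, and your server's configuration mechanism will likely differ.

```javascript
// Hypothetical script source; the non-ASCII character makes the encoding matter.
const SCRIPT_SOURCE = 'console.log("héllo");';

// Handler in the (req, res) shape that Node's http.createServer accepts.
// It declares the encoding in the header, so the page needs no charset attribute.
function serveScript(req, res) {
  res.setHeader("Content-Type", "application/javascript; charset=utf-8");
  res.end(Buffer.from(SCRIPT_SOURCE, "utf8"));
}
```

With the header in place, the encoding travels with the script itself, which avoids coupling the page to its dependencies.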
