PHP & HTML5: UTF-8 document declaration with <meta> tag or through the header() function?


Question

    I'm trying to optimize the way my framework generates HTML5 pages. Right now, I insert a <meta charset="utf-8"/> right after the <head> tag, so that it is the first element specified (by the time the parser reaches the <title> tag and the rest of the page elements, the document is already declared as UTF-8).

    The problem is that I'm reading some books on website performance optimization, and most of them recommend specifying the encoding through a Content-Type declaration rather than inserting a <meta> block.

    The W3C documentation on character encoding detection (section 8.2.2.1) says, essentially, that the HTTP headers take priority over any explicit declaration EXCEPT if the user declared an override for the content type through the user agent.

    However, the W3C validator (which is what I use to debug my HTML output) doesn't report an error but does warn me about the absence of the <meta charset="utf-8"/> block, thus encouraging me to add it (it says this is especially recommended if the rendered page is going to be saved, which is not the case here, but still... it confuses me a bit).

    The question is... how can I ensure the pages are ALWAYS specified as encoded in UTF-8? Must I declare the HTTP header AND the <meta> tag or just the HTTP header?

    Solution

    I could not describe it better than: The Road to HTML 5: character encoding

    It's a 7-step algorithm; step 4 has 2 sub-steps, the first of which has 7 branches, one of which has 8 sub-steps, one of which actually links to a separate algorithm that itself has 7 steps... It goes on like that for a while. The gist of it is:

    • User override. - You have no influence on this.
    • An HTTP "charset" parameter in a "Content-Type" field. In PHP code that is:

      header('Content-Type: text/html;charset=UTF-8');
      

    • A Byte Order Mark before any other data in the HTML document itself. - I cannot recommend actually using that feature. If you like, just save your files accordingly, but do not expect header() calls to keep working flawlessly: a BOM in front of the opening PHP tag counts as output, and header() must be called before any output is sent. The alternative is to output the BOM manually, which in PHP is:

      echo "\xEF\xBB\xBF"; # UTF-8 BOM
      

      But even then I cannot recommend outputting a BOM, because it is a backwards-incompatible change to the output. These guidelines describe how a document is read, not what you should output.

    • A META declaration with a "charset" attribute. - Please do so, this is good practice. In HTML 5 that is:

      <meta charset="UTF-8">
      

    • A META declaration with an "http-equiv" attribute set to "Content-Type" and a value set for "charset". - Why not?! In HTML 5 that would be:

      <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
      

    • Unspecified heuristic analysis. - You have no influence on this.
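A minimal sketch of how the two declarations from the list that are under your control and worth using (the HTTP header and the <meta charset> element) might be emitted together. The function names render_utf8_page and send_utf8_page are hypothetical and not part of PHP or any framework; only header(), headers_sent() and htmlspecialchars() are built-ins.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: builds a minimal HTML5 page whose first <head>
 * child is the charset declaration, so it precedes <title>.
 */
function render_utf8_page(string $title, string $bodyHtml): string
{
    // Escape the title as UTF-8 so it matches the declared encoding.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . "<meta charset=\"UTF-8\">\n"
        . "<title>{$safeTitle}</title>\n"
        . "</head>\n<body>\n{$bodyHtml}\n</body>\n</html>\n";
}

/** Hypothetical helper: sends the HTTP header first, then the page. */
function send_utf8_page(string $title, string $bodyHtml): void
{
    // header() only has an effect before any output has been sent.
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=UTF-8');
    }
    echo render_utf8_page($title, $bodyHtml);
}
```

With this split, the header covers pages served over HTTP, while the <meta> element still declares the encoding if the page is later saved to disk and opened offline.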

    Those are the points. My recommendations are as follows:

    • Check that your web server sends the correct headers when serving the HTML.
    • Include those meta tags in your HTML as well, so that the HTML file can be saved to disk and opened later in a browser (offline, archive).
    • Do not put a BOM inside the document if you're using UTF-8.
    • Do not use UTF-16 or UTF-32; if you use Unicode, use UTF-8.
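One way to act on the BOM recommendation is to check template or output strings for a leading UTF-8 BOM before they are sent. The following is only a sketch; UTF8_BOM, has_utf8_bom and strip_utf8_bom are hypothetical names introduced for illustration.

```php
<?php
declare(strict_types=1);

// The three bytes that form the UTF-8 encoded Byte Order Mark.
const UTF8_BOM = "\xEF\xBB\xBF";

/** Hypothetical helper: true if the byte string starts with a UTF-8 BOM. */
function has_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

/** Hypothetical helper: returns the string without a leading UTF-8 BOM. */
function strip_utf8_bom(string $bytes): string
{
    // The cast keeps the return type a string on older PHP versions,
    // where substr() could return false for an empty result.
    return has_utf8_bom($bytes) ? (string) substr($bytes, 3) : $bytes;
}
```

For example, strip_utf8_bom(file_get_contents($path)) could be applied to a template before it is echoed, assuming $path points to a readable file.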

    If you are targeting systems that are totally unaware of encodings, use US-ASCII and mask everything that is not part of it as HTML entities.
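A sketch of what such masking could look like, assuming the mbstring extension is loaded and the input text is UTF-8. The function name to_ascii_html is hypothetical; mb_encode_numericentity() and htmlspecialchars() are PHP built-ins.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: turns UTF-8 text into pure US-ASCII HTML by
 * escaping markup characters and replacing every non-ASCII character
 * with a numeric character reference.
 */
function to_ascii_html(string $text): string
{
    // First escape &, <, > and quotes so the text is safe inside HTML.
    $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');

    // Conversion map: [first code point, last code point, offset, mask].
    // Everything from U+0080 up to U+10FFFF becomes &#NNN;.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($escaped, $map, 'UTF-8');
}
```

With this sketch, to_ascii_html("a & \u{00E9}") gives "a &amp; &#233;", which contains only ASCII bytes and therefore renders the same regardless of which ASCII-compatible encoding the client assumes.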

    Note: This entities suggestion applies to output sent to the browser, not to storage. Storage is your own responsibility; make sure you know which encodings are in play when you handle your data store. For example, when you write HTML into your MySQL database, never use HTML entities unless you really need them (e.g. &amp; in HTML links).
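A short illustration of why entities belong at output time rather than in storage: if a value is stored already escaped and is then escaped again when rendered, it ends up double-encoded. The helper name escape_for_html and the sample value are hypothetical.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: escape a stored value only when it is output. */
function escape_for_html(string $raw): string
{
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
}

$stored = 'Tom & Jerry';             // raw text, as it would sit in the database
$once   = escape_for_html($stored);  // 'Tom &amp; Jerry'     (correct for HTML output)
$twice  = escape_for_html($once);    // 'Tom &amp;amp; Jerry' (what stored entities lead to)
```

Keeping the raw UTF-8 text in the store and escaping exactly once, at the point of output, avoids the second case.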
