Why do HTML entity names with dec < 255 not require a semicolon?


Question

In a plain HTML document, &pound (dec 163) renders as £ without needing the ;, whereas &oelig (dec 339) only renders as œ with the semicolon. It seems that every HTML entity with a decimal value under 255 renders without the semicolon in both Firefox and Chrome.

What gives?

Solution

The reason is that, historically, the semicolon has been optional when an entity reference (or a character reference) is not immediately followed by a name character. So &pound? is OK, since ? is not a name character (i.e., a character allowed in names), but &pound4 is not, since 4 is a name character, which makes pound4 the entity name (undefined in HTML, though it might become defined some day). This rule is part of the SGML legacy in HTML, and one of the few cases where browsers actually implemented an SGML-specific feature.
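The historical rule can be illustrated with a minimal Python sketch. It is not taken from any browser or SGML parser: the helper name entity_name is hypothetical, and the set of name characters (ASCII letters, digits, ".", "-", "_" and ":") is an assumption made for illustration.

```python
import re

# Assumption: a name starts with an ASCII letter and continues with
# letters, digits, '.', '-', '_' or ':'. An optional ';' may follow.
_REFERENCE = re.compile(r"&([A-Za-z][A-Za-z0-9.\-_:]*)(;?)")


def entity_name(text):
    """Return the entity name read from the start of text under the
    historical rule, or None if text does not start with a reference."""
    match = _REFERENCE.match(text)
    return match.group(1) if match else None


print(entity_name("&pound?"))   # pound   ('?' is not a name character)
print(entity_name("&pound4"))   # pound4  ('4' extends the name)
print(entity_name("&pound;4"))  # pound   (';' ends the reference)
```

Under this reading, only the explicit semicolon keeps a following digit or letter from being absorbed into the name.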

It has, however, always been regarded as good practice to terminate entity references with a semicolon. XML, and hence XHTML, makes it formally mandatory.

This is why current browser practice allows the semicolon to be omitted as in "classic" HTML, but only for the limited set of character references denoting ISO Latin 1 characters, i.e. characters with a Unicode number less than 256 in decimal (at most FF in hexadecimal). This was the original set of entity references, and therefore such references have been widely used without a semicolon. So the practice is a compromise: browser vendors want to encourage the recommended notation, but not to invalidate a bulk of old pages, still less to make browsers fail to render them properly.
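The same compromise can be observed outside a browser. The following sketch assumes that Python's standard html module, which is documented as following the HTML5 rules for character references, is a fair stand-in for browser behaviour; the variable names are illustrative only.

```python
import html
from html.entities import html5

# A Latin 1 name is recognized without the semicolon...
print(html.unescape("&pound"))   # £
# ...but a name outside that legacy set is left untouched...
print(html.unescape("&oelig"))   # &oelig
# ...unless the semicolon is present.
print(html.unescape("&oelig;"))  # œ

# The html5 table lists semicolon-less spellings as separate keys.
# Collect them and check that each maps to code points below 256.
legacy = {name: value for name, value in html5.items() if not name.endswith(";")}
all_latin1 = all(ord(ch) < 256 for value in legacy.values() for ch in value)
print(all_latin1)
```

Note that the semicolon-less keys also include a few basic names such as amp, lt, gt and quot, whose code points are likewise below 256.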

The HTML5 drafts have taken various positions on this, but, for example, the HTML5 CR of 6 August 2013 requires the semicolon in all cases, even in HTML syntax. A missing semicolon is defined as a parse error, which means that error handling is well-defined (the entity shall be recognized), but browsers may still stop parsing at the first parse error!
