charset-utf8 and character entities
Question
I am planning to convert my windows-1252 XHTML web pages to UTF-8.
I have the following character entities in my markup:
- &#39; (apostrophe)
- &#9658; (right pointer, ►)
- &#9668; (left pointer, ◄)
If I change the charset and save the pages as UTF-8 using my editor:
- the apostrophe remains as a character entity;
- the pointers are converted to symbols in the code (presumably because UTF-8 does not support the entities?).
Questions:
- If I understand UTF-8 correctly, I don't need to use the entities and can type the characters directly into the code. In that case, is it safe for me to replace &#39; with a typed apostrophe?
- Is it correct that the editor has placed the pointer symbols directly into my code, and will these be displayed reliably in modern browsers? It seems to be OK. Presumably I can't revert to the entities anyway if I use UTF-8?
Thanks.
Recommended answer
Entities have three purposes: encoding characters that cannot be represented in the character encoding in use (not relevant with UTF-8), encoding characters that are inconvenient to type on a given keyboard, and encoding characters that are illegal when left unescaped.
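The first purpose can be illustrated with a small Python sketch. It is not part of the original answer; it assumes Python's "cp1252" codec as a stand-in for windows-1252 and uses the right pointer from the question. The variable names are made up for illustration.

```python
# Illustrative sketch: why an entity is needed in windows-1252 but not in UTF-8.
pointer = "\u25ba"  # the right pointer character, U+25BA

# UTF-8 can represent the character directly (as three bytes).
utf8_bytes = pointer.encode("utf-8")

# windows-1252 has no code point for it, so a strict encode fails.
try:
    pointer.encode("cp1252")
    cp1252_can_encode = True
except UnicodeEncodeError:
    cp1252_can_encode = False

# Falling back to a numeric character reference is the only way to
# carry the character in a windows-1252 byte stream.
cp1252_fallback = pointer.encode("cp1252", errors="xmlcharrefreplace")

print(utf8_bytes)
print(cp1252_can_encode)
print(cp1252_fallback)
```

This is presumably why the pages needed the pointer entities while they were saved as windows-1252, and why the editor can drop them once the file is UTF-8.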
The numeric entity for ► (&#9658;) should always produce ► no matter what the encoding. If it doesn't, it's a bug elsewhere.
Typing ► directly in the source is fine in UTF-8. You can do either that or use the entity, and it makes no difference.
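As a rough check of that equivalence, the following Python sketch decodes both spellings with the standard-library html.unescape function and compares the results. The sample strings are invented for illustration, and html.unescape only approximates what a browser does with character references.

```python
import html

# Two spellings of the same page fragment (sample text is hypothetical).
entity_source = "&#9658; next"
literal_source = "\u25ba next"  # the pointer typed directly

# Round-trip each through UTF-8 bytes, as if saved to and read from disk,
# then resolve character references.
decoded_entity = html.unescape(entity_source.encode("utf-8").decode("utf-8"))
decoded_literal = html.unescape(literal_source.encode("utf-8").decode("utf-8"))
same_text = decoded_entity == decoded_literal

# The entity spelling is plain ASCII, so it also survives windows-1252.
entity_via_cp1252 = html.unescape(entity_source.encode("cp1252").decode("cp1252"))

print(same_text)
print(entity_via_cp1252 == decoded_literal)
```

Because the entity form is pure ASCII, reverting to entities remains possible after the switch to UTF-8; it is simply no longer required.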
A literal ' is fine in most contexts, but not all. Both uses of it in the following are allowed:
<span title="Jon's example">This is Jon's example</span>
But if the attribute value is delimited with single quotes, the apostrophe inside it has to be encoded:
<span title='Jon&#39;s example'>This is Jon's example</span>
because otherwise it would be taken as the ' that ends the attribute value.
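A minimal Python sketch of the same rule, using the standard-library html.escape and html.parser modules. The TitleGrabber class and title_of helper are hypothetical names introduced here, and Python's parser is only a stand-in for a browser. Note that html.escape writes the apostrophe as &#x27;, the hexadecimal spelling of &#39;.

```python
import html
from html.parser import HTMLParser


class TitleGrabber(HTMLParser):
    """Collects the values of title attributes seen in start tags."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "title":
                self.titles.append(value)


def title_of(markup):
    """Return the first title attribute found in markup, or None."""
    parser = TitleGrabber()
    parser.feed(markup)
    parser.close()
    return parser.titles[0] if parser.titles else None


text = "Jon's example"

# quote=True also escapes ' and ", which is what attribute values need.
escaped = html.escape(text, quote=True)

# Safe to drop into a single-quoted attribute; the element content
# can keep the literal apostrophe.
markup = f"<span title='{escaped}'>This is {text}</span>"

print(markup)
print(title_of(markup))
```

The parser hands back the original text for the escaped attribute, whichever of the two numeric spellings is used.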