charset-utf8 and character entities


Question

I am planning to convert my windows-1252 XHTML web pages to UTF-8.

I have the following character entities in my markup:

  • &#39; - apostrophe,
  • &#9658; - right pointer (►),
  • &#9668; - left pointer (◄).

If I change the charset and save the pages as UTF-8 using my editor:

  • the apostrophe remains as a character entity;
  • the pointers are converted to symbols in the code (presumably because the entities are not supported in UTF-8?).

Questions:

  1. If I understand UTF-8 correctly, you don't need to use entities and can type the characters directly into the code. In that case, is it safe for me to replace &#39; with a typed apostrophe?

  2. Is it correct that the editor has placed the pointer symbols directly into my code, and will these be displayed reliably in modern browsers? It seems to be OK. Presumably, I can't revert to the entities anyway if I use UTF-8?

Thanks.

Recommended answer

Entities have three purposes: encoding characters that cannot be represented in the character encoding being used (not relevant with UTF-8), encoding characters that are inconvenient to type on a given keyboard, and encoding characters that are illegal when left unescaped.
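The first purpose can be checked with a short sketch. Python is used here purely for illustration, and the helper name `can_encode` is hypothetical. It asks whether a character has a byte representation in a given encoding, which is what decides whether an entity is strictly required.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text can be stored in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


RIGHT_POINTER = "\u25ba"  # U+25BA BLACK RIGHT-POINTING POINTER
LEFT_POINTER = "\u25c4"   # U+25C4 BLACK LEFT-POINTING POINTER

for encoding in ("cp1252", "utf-8"):
    # cp1252 is Python's codec name for windows-1252
    print(encoding, can_encode(RIGHT_POINTER, encoding), can_encode(LEFT_POINTER, encoding))

# The apostrophe is plain ASCII, so it fits in both encodings
print(can_encode("'", "cp1252"), can_encode("'", "utf-8"))
```

Under windows-1252 the pointers have no byte value, so an entity is the only way to write them; under UTF-8 they can be stored directly.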

&#9658; should always produce ► no matter what the encoding. If it doesn't, it's a bug elsewhere.

Typing ► directly in the source is fine in UTF-8. You can do either that or use the entity, and it makes no difference.
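As a rough illustration of why the two forms are interchangeable, the sketch below uses Python's standard `html.unescape` to decode a numeric character reference and compares the result with the literal character. The sample strings are made up for the example, and the decimal reference `&#9658;` is assumed as the entity form for ►.

```python
from html import unescape

# Markup written with a numeric character reference
entity_form = unescape("&#9658; next page")

# The same text with the character typed directly (U+25BA)
literal_form = "\u25ba next page"

# After parsing, both forms are the same sequence of characters
print(entity_form == literal_form)

# Only the bytes on disk differ: the literal needs three UTF-8 bytes,
# while the reference is seven ASCII characters in any encoding
print(len("\u25ba".encode("utf-8")), len("&#9658;".encode("ascii")))
```

Converting the literal symbols back to entities therefore remains possible in a UTF-8 file; it is just no longer necessary.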

A literal ' is fine in most contexts, but not all. The following is allowed, and either ' in it could equally be written as &#39;:

<span title="Jon's example">This is Jon's example</span>

But in a single-quoted attribute value it has to be encoded:

<span title='Jon&#x27;s example'>This is Jon's example</span>

because otherwise it would be taken as the ' that ends the attribute value.
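When attribute values are generated by a program, the escaping can be left to a library. The sketch below uses Python's standard `html.escape`, which with `quote=True` replaces ' with &#x27; (the hexadecimal equivalent of &#39;) as well as escaping ", &, < and >. The helper name `single_quoted_attr` is hypothetical.

```python
from html import escape


def single_quoted_attr(name: str, value: str) -> str:
    """Build name='value' with the value escaped so a ' cannot end it early."""
    return f"{name}='{escape(value, quote=True)}'"


title = single_quoted_attr("title", "Jon's example")

# Element content does not need the apostrophe escaped
print(f"<span {title}>This is Jon's example</span>")
```

Escaping both quote characters means the same value is safe whichever quote style surrounds the attribute.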
