XML 标头中的“编码"有什么用? [英] What use is the 'encoding' in the XML header?

查看:22
本文介绍了XML 标头中的“编码"有什么用?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

查看 XML 标头

<?xml version="1.0" encoding="UTF-16" standalone="no"?>

我说 encoding 属性是

  • 来得太晚了(除非您知道编码,否则您无法正确阅读...)
  • 冗余,因此容易出错:用Big5"替换它太容易了,但以 UTF-8 格式保存文件

或者该属性与流的内容无关?

Or is that attribute not about the content of the stream?

我在这里搞混了吗?

推荐答案

正如您提到的,您必须知道文件的编码才能读取 encoding 属性.

As you mentioned, you'd have to know the encoding of the file to read the encoding attribute.

但是,有一种启发式方法可以轻松地让您足够接近真实"编码,从而允许您读取编码属性.这是有效的,因为 <?xml 部分根据定义只能包含 ASCII 范围内的字符(无论它们是如何编码的).

However, there is a heuristic that can easily get you close enough to the "real" encoding to allow you to read the encoding attribute. This works, because the <?xml part by definition can only contain characters in the ASCII range (however they are encoded).

XML 标准甚至描述了用于找出编码的确切过程.

而且编码标签也不是多余的.例如,如果您使用 XML 规范中的算法找出使用了某些基于 ASCII(或 ASCII 兼容)的编码,您仍然需要阅读编码以找出实际使用的编码使用(有效的候选将是 ASCII、UTF-8、任何 ISO-8859-* 编码,任何 Windows-* 编码,KOI8-R 以及许多其他).对于 <?xml 部分本身,它不会对它是哪个部分产生影响,但对于文档的其余部分,它可能会产生巨大的差异.

And the encoding label isn't redundant either. For example, if you use the algorithm in the XML spec to find out that some ASCII-based (or ASCII-compatible) encoding is used you still need to read the encoding to find out which one is actually use (valid candidates would be ASCII, UTF-8, any of the ISO-8859-* encodings, any of the Windows-* encodings, KOI8-R and many, many others). For the <?xml part itself it won't make a difference which one it is, but for the rest of the document, it can make a huge difference.

关于错误标记的 XML 文件:是的,生成这些文件很容易,但是:XML 规范明确指出这些文件格式错误,因此不是正确的 XML.不正确的编码必须报告为错误(只要它们可以被检测到!).所以这是生成 XML 的人的问题.

Regarding mis-labeled XML files: yes, it's easy to produce those, however: the XML spec clearly specifies that those files are mal-formed and as such are not correct XML. Incorrect encodings must be reported as an error (as long as they can be detected!). So it's the problem of whoever is producing the XML.

这篇关于XML 标头中的“编码"有什么用?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆