Is the XML declaration node mandatory?
Question
I had a discussion with a colleague of mine about the XML declaration node (I'm talking about this => <?xml version="1.0" encoding="UTF-8"?>).
I believe that for something to be called "valid XML", it requires an XML declaration node.
My colleague states that the XML declaration node is optional, since the default encoding is UTF-8 and the version is always 1.0. This makes sense, but what does the standard say?
In short, given the following file:
<books>
<book id="1"><title>Title</title></book>
</books>
Can we say that:
- It is valid XML?
- It is a valid XML node?
- It is a valid XML document?
Thank you very much.
Recommended answer
This:
<?xml version="1.0" encoding="UTF-8"?>
is not a processing instruction - it is the XML declaration. Its purpose is to configure the XML parser correctly before it starts reading the rest of the document.
It looks like a processing instruction, but unlike a real processing instruction it will not be part of the DOM the parser creates.
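To see the difference between the declaration and a real processing instruction, here is a minimal sketch using Python's standard xml.dom.minidom. The sample document and the app-hint processing instruction are made up for illustration, and other DOM implementations may expose things slightly differently.

```python
from xml.dom import minidom

# Hypothetical sample: an XML declaration, a real processing instruction
# (the "app-hint" target is invented), and an empty root element.
SAMPLE = b'<?xml version="1.0" encoding="UTF-8"?><?app-hint mode="demo"?><books/>'


def top_level_node_names(xml_bytes):
    """Return the node names of the document's direct children."""
    doc = minidom.parseString(xml_bytes)
    return [child.nodeName for child in doc.childNodes]


names = top_level_node_names(SAMPLE)
# The processing instruction and the root element appear as nodes;
# the XML declaration does not.
print(names)
```

With minidom, the processing instruction shows up as a child of the document, while the declaration is consumed by the parser and never becomes a node.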
It is not necessary for "valid" XML. "Valid" means "represents a well-defined document type, as described in a DTD or a schema". Without a schema or DTD the word "valid" has no meaning.
Many people misuse "valid" when they really mean "well-formed". A well-formed XML document is one that obeys the basic syntax rules of XML.
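As a rough illustration of "well-formed", the sketch below asks Python's standard xml.etree.ElementTree parser whether a string can be parsed at all. The helper name is_well_formed and both sample strings are assumptions made for this example. Note that this parser does not validate against a DTD or schema, so it says nothing about validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Neither sample has an XML declaration.
GOOD = '<books><book id="1"><title>Title</title></book></books>'
# Same content, but the root element is closed with the wrong tag name.
BAD = '<books><book id="1"><title>Title</title></book></book>'

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

Checking validity would additionally require a DTD or schema and a validating parser, which the standard library module used here does not provide.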
There is no XML declaration necessary for a document to be well-formed, either, since there are defaults for both version and encoding (1.0 and UTF-8/UTF-16, respectively). If a Unicode BOM (Byte Order Mark) is present in the file, it determines the encoding. If there is no BOM and no XML declaration, UTF-8 is assumed.
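The encoding rules can be tried out with a small sketch, again using Python's standard xml.etree.ElementTree (which relies on the Expat parser). The sample content and the helper names title_of and parses are assumptions for this example, and other parsers may report errors differently.

```python
import xml.etree.ElementTree as ET

# Sample body with one non-ASCII character (e with acute accent) in the title.
BODY = '<books><book id="1"><title>Caf\u00e9</title></book></books>'


def title_of(xml_bytes):
    """Parse raw bytes and return the text of the first book title."""
    return ET.fromstring(xml_bytes).findtext("book/title")


def parses(xml_bytes):
    """Return True if the bytes parse as XML, False on a parse error."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


# UTF-8 bytes with and without a declaration.
with_decl = ('<?xml version="1.0" encoding="UTF-8"?>' + BODY).encode("utf-8")
no_decl = BODY.encode("utf-8")
# Python's "utf-16" codec prepends a BOM, which lets the parser detect UTF-16.
utf16_bom = BODY.encode("utf-16")
# Latin-1 bytes: the declaration names the encoding explicitly.
latin1_decl = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY).encode("iso-8859-1")
# Latin-1 bytes with no declaration and no BOM: read as UTF-8, so the
# accented character becomes an invalid byte sequence.
latin1_no_decl = BODY.encode("iso-8859-1")

for data in (with_decl, no_decl, utf16_bom, latin1_decl):
    print(title_of(data))
print(parses(latin1_no_decl))
```

The declaration only becomes necessary in the last kind of case, where the bytes are in an encoding that cannot be detected from a BOM and is not UTF-8.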
Here is a canonical thread on how encoding declaration and detection work in XML files: "How default is the default encoding (UTF-8) in the XML Declaration?"
To your questions:
- Is it valid XML?
  This cannot be answered without a DTD or a schema. It is well-formed, though.
- Is it a valid XML node?
  A node is a concept that is related to an in-memory representation of a document (a DOM). This snippet can be parsed into a node, since it is well-formed.
- Is it a valid XML document?
  See #1.
You are confusing a few XML concepts here (not to worry, this confusion is common and stems partly from the fact that the concepts overlap and the names are misused rather often).
- It all starts with structured data consisting of names, values and attributes that is organized as a tree.
- XML means, most basically, a syntax to represent this structured data in textual form (it's a "Markup Language"). It is what you get when you serialize the tree into a string of characters and it can be used to de-serialize a string of characters into a tree again.
- Document usually refers to a string of characters that represent a serialized tree. It can be stored in a file, sent over the network or created in-memory.
- The rules of serialization and de-serialization are very strictly defined. A document (a "string of characters") that can successfully be de-serialized into a tree is said to be well-formed.
- The semantics of such a tree (allowed elements, element count and order, namespaces, any number of complex rules, really) can be defined in what is called a DTD or a schema. If a tree obeys a certain set of well-defined semantics, it is said to be valid.
- The term Document Object Model (DOM) refers to the standardized in-memory representation of structured data. It's the name of a well-defined API to access this tree with standardized methods.
- A node is the basic data structure of a Document Object Model.
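The relationship between document, tree and node can be sketched as a round trip. This example uses Python's standard xml.etree.ElementTree, which offers its own lightweight tree API rather than the W3C DOM, so treat it as an illustration of the concepts and not of the DOM API itself. The sample document is made up.

```python
import xml.etree.ElementTree as ET

# The "document": a string of characters representing a serialized tree.
DOCUMENT = '<books><book id="1"><title>Title</title></book></books>'

# De-serialize: string of characters -> in-memory tree.
root = ET.fromstring(DOCUMENT)

# A node (an "element" in ElementTree terms) inside that tree.
book = root.find("book")
print(book.tag, book.get("id"))

# Serialize: tree -> string of characters.
serialized = ET.tostring(root, encoding="unicode")

# The serialized string can be de-serialized into an equivalent tree again.
reparsed = ET.fromstring(serialized)
print(reparsed.findtext("book/title"))
```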