Is the XML declaration node mandatory?


Question

I had a discussion with a colleague of mine about the XML declaration node (I'm talking about this: <?xml version="1.0" encoding="UTF-8"?>).

I believe that for something to be called "valid XML", it requires an XML declaration node.

My colleague states that the XML declaration node is optional, since the default encoding is UTF-8 and the version is always 1.0. This makes sense, but what does the standard say?

In short, given the following file:

<books>
  <book id="1"><title>Title</title></book>
</books>

Can we say that:

  1. It is valid XML?
  2. It is a valid XML node?
  3. It is a valid XML document?

Thank you very much.

Recommended answer

This:

<?xml version="1.0" encoding="UTF-8"?>

is not a processing instruction - it is the XML declaration. Its purpose is to configure the XML parser correctly before it starts reading the rest of the document.

It looks like a processing instruction, but unlike a real processing instruction it will not be part of the DOM the parser creates.
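As a rough illustration of that difference, here is a minimal sketch that assumes Python's standard xml.dom.minidom as the DOM implementation. The "demo" processing instruction and the helper name top_level_node_types are made up for this example; other DOM implementations may expose things slightly differently.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical sample: an XML declaration, then a real processing
# instruction (target "demo"), then the root element.
SAMPLE = b'<?xml version="1.0" encoding="UTF-8"?><?demo some data?><books/>'


def top_level_node_types(xml_bytes):
    """Return the DOM node types of the document's direct children."""
    doc = parseString(xml_bytes)
    return [child.nodeType for child in doc.childNodes]


types = top_level_node_types(SAMPLE)
# The processing instruction and the root element show up as nodes;
# the XML declaration does not.
print(types == [Node.PROCESSING_INSTRUCTION_NODE, Node.ELEMENT_NODE])
```

In this sketch the declaration only configures the parser, while the processing instruction survives as a child node of the document.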

It is not necessary for "valid" XML. "Valid" means "represents a well-defined document type, as described in a DTD or a schema". Without a schema or DTD the word "valid" has no meaning.

Many people misuse "valid" when they really mean "well-formed". A well-formed XML document is one that obeys the basic syntax rules of XML.
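A well-formedness check can be sketched as follows, assuming Python's standard xml.etree.ElementTree, which is a non-validating parser. The helper name is_well_formed is hypothetical. Note that this only tests syntax; it says nothing about validity, which would need a DTD or schema and a validating parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested and closed tags, no XML declaration.
print(is_well_formed("<books><book id='1'/></books>"))
# The <book> element is never closed, so parsing fails.
print(is_well_formed("<books><book id='1'></books>"))
```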

An XML declaration is not necessary for a document to be well-formed either, since there are defaults for both version and encoding (1.0 and UTF-8/UTF-16, respectively). If a Unicode BOM (Byte Order Mark) is present in the file, it determines the encoding. If there is no BOM and no XML declaration, UTF-8 is assumed.
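The encoding defaults can be tried out with a small sketch, assuming Python's standard xml.etree.ElementTree (backed by Expat). The sample text is made up; it contains a non-ASCII character so that a wrong encoding guess would be visible.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with one non-ASCII character and no XML declaration.
TEXT = "<title>Caf\u00e9</title>"

# No BOM and no declaration: the bytes are read as UTF-8.
from_utf8 = ET.fromstring(TEXT.encode("utf-8"))

# Python's "utf-16" codec prepends a BOM, which tells the parser
# which encoding to use.
from_utf16 = ET.fromstring(TEXT.encode("utf-16"))

print(from_utf8.text == from_utf16.text == "Caf\u00e9")
```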

Here is a canonical thread on how encoding declaration and detection work in XML files: "How default is the default encoding (UTF-8) in the XML Declaration?"

To your questions:

  1. Is it valid XML?
    This cannot be answered without a DTD or a schema. It is well-formed, though.
  2. Is it a valid XML node?
    A node is a concept that is related to an in-memory representation of a document (a DOM). This snippet can be parsed into a node, since it is well-formed.
  3. Is it a valid XML document?
    See #1.

You are confusing a few XML concepts here (not to worry, this confusion is common and stems partly from the fact that the concepts overlap and the names are misused rather often).

  • It all starts with structured data consisting of names, values and attributes that is organized as a tree.
  • XML means, most basically, a syntax to represent this structured data in textual form (it's a "Markup Language"). It is what you get when you serialize the tree into a string of characters, and it can be used to de-serialize a string of characters into a tree again.
  • Document usually refers to a string of characters that represents a serialized tree. It can be stored in a file, sent over the network or created in memory.
  • The rules of serialization and de-serialization are very strictly defined. A document (a "string of characters") that can successfully be de-serialized into a tree is said to be well-formed.
  • The semantics of such a tree (allowed elements, element count and order, namespaces, any number of complex rules, really) can be defined in what is called a DTD or a schema. If a tree obeys a certain set of well-defined semantics, it is said to be valid.
  • The term Document Object Model (DOM) refers to the standardized in-memory representation of structured data. It's the name of a well-defined API to access this tree with standardized methods.
  • A node is the basic data structure of a Document Object Model.
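The relationship between tree, document and node can be sketched as a round trip. This assumes Python's standard xml.etree.ElementTree, which is a lightweight tree API rather than the W3C DOM, so it only approximates the terms above; the element names are taken from the question's sample file.

```python
import xml.etree.ElementTree as ET

# Build the tree in memory: each Element is one node of the tree.
books = ET.Element("books")
book = ET.SubElement(books, "book", {"id": "1"})
title = ET.SubElement(book, "title")
title.text = "Title"

# Serialize the tree into a document (a string of characters).
document = ET.tostring(books, encoding="unicode")
print(document)

# De-serialize the document into a tree again.
parsed = ET.fromstring(document)
print(parsed.find("book/title").text)
```

The serialized string carries no XML declaration here, and it still parses back into an equivalent tree.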
