Printing the content of an XML file using XML DOM
Problem description
I have a simple XML document:
<?xml version="1.0"?>
<cellphones>
  <telefon>
    <model>Easy DB</model>
    <proizvodjac>Alcatel</proizvodjac>
    <cena>25</cena>
  </telefon>
  <telefon>
    <model>3310</model>
    <proizvodjac>Nokia</proizvodjac>
    <cena>30</cena>
  </telefon>
  <telefon>
    <model>GF768</model>
    <proizvodjac>Ericsson</proizvodjac>
    <cena>15</cena>
  </telefon>
  <telefon>
    <model>Skeleton</model>
    <proizvodjac>Panasonic</proizvodjac>
    <cena>45</cena>
  </telefon>
  <telefon>
    <model>Earl</model>
    <proizvodjac>Sharp</proizvodjac>
    <cena>60</cena>
  </telefon>
</cellphones>
I need to print the content of this file using XML DOM, and it needs to be structured like this:
"model: Easy DB
proizvodjac: Alcatel
cena: 25"
for each node inside the XML.
IT HAS TO BE DONE using XML DOM. That's the problem. I can do it the usual, simple way. But this one bothers me because I can't seem to find any solution on the internet.
This is as far as I can go, but I need to access inside nodes (child nodes) and to get node values. I also want to get rid of some weird string "#text" that comes up out of the blue.
<?php
// create the DOMDocument object
$xmlDoc = new DOMDocument();
// load the XML file into the document object
$xmlDoc->load("poruke.xml");
// assign the root element to a variable
$x = $xmlDoc->documentElement;
// loop over the child nodes and print information about each one
foreach ($x->childNodes AS $item){
    print $item->nodeName . " = " . $item->nodeValue . "<br />";
}
?>
Thanks.
Recommended answer
Explanation for the weird #text strings
The weird #text strings don't come out of the blue but are actual text nodes. When you load a formatted XML document with DOM, any whitespace, e.g. indenting, linebreaks and node values, will be part of the DOM as DOMText instances by default, e.g.
<cellphones>
<telefon>
<model>Easy DB…
E T E T E T
where E is a DOMElement and T is a DOMText.
To get around that, load the document like this:
$dom = new DOMDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->load('file.xml');
Then your document will be structured as follows:
<cellphones><telefon><model>Easy DB…
E           E        E      T
Note that individual nodes representing the value of a DOMElement will still be DOMText instances, but the nodes that control the formatting are gone. More on that later.
Proof
You can test this easily with this code:
$dom = new DOMDocument;
$dom->preserveWhiteSpace = TRUE; // change to FALSE to see the difference
$dom->load('file.xml');
foreach ($dom->getElementsByTagName('telefon') as $telefon) {
    foreach ($telefon->childNodes as $node) {
        printf(
            "Name: %s - Type: %s - Value: %s\n",
            $node->nodeName,
            $node->nodeType,
            urlencode($node->nodeValue)
        );
    }
}
This code runs through all the telefon elements in your given XML and prints out the node name, type and urlencoded node value of their child nodes. When you preserve the whitespace, you will get something like
Name: #text - Type: 3 - Value: %0A++++
Name: model - Type: 1 - Value: Easy+DB
Name: #text - Type: 3 - Value: %0A++++
Name: proizvodjac - Type: 1 - Value: Alcatel
Name: #text - Type: 3 - Value: %0A++++
Name: cena - Type: 1 - Value: 25
Name: #text - Type: 3 - Value: %0A++
…
The reason I urlencoded the value is to show that there are in fact DOMText nodes containing the indenting and the linebreaks in your DOMDocument. %0A is a linebreak, while each + is a space.
When you compare this with your XML, you will see there is a line break after each opening <telefon> tag, followed by four spaces until the <model> element starts. Likewise, there is only a newline and two spaces between the closing </cena> and the closing </telefon>.
The given type for these nodes is 3, which, according to the list of predefined constants, is XML_TEXT_NODE, i.e. a DOMText node. Lacking a proper element name, these nodes have the name #text.
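Since the node type tells elements and text nodes apart, one way to skip the #text nodes is to check nodeType while iterating. The following is only a minimal sketch under some assumptions: the inline $xml sample stands in for the file from the question, and elementChildren() is a helper name made up for this illustration, not part of the DOM extension.

```php
<?php
// Hypothetical inline sample; the question loads the XML from a file instead.
$xml = <<<XML
<cellphones>
  <telefon>
    <model>Easy DB</model>
    <proizvodjac>Alcatel</proizvodjac>
    <cena>25</cena>
  </telefon>
</cellphones>
XML;

// elementChildren() is an illustrative helper, not a built-in DOM method.
function elementChildren(DOMNode $parent): array
{
    $elements = [];
    foreach ($parent->childNodes as $child) {
        // Keep only type 1 nodes; the whitespace #text nodes are type 3.
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $elements[] = $child;
        }
    }
    return $elements;
}

$dom = new DOMDocument;
$dom->loadXML($xml); // preserveWhiteSpace is left at its default (TRUE)

$telefon = $dom->getElementsByTagName('telefon')->item(0);
foreach (elementChildren($telefon) as $node) {
    printf("%s: %s\n", $node->nodeName, $node->nodeValue);
}
```

With whitespace preserved, the telefon element in this sample has seven child nodes, but only the three element nodes are printed.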
Disregarding whitespace
Now, when you disable preservation of whitespace, the above will output:
Name: model - Type: 1 - Value: Easy+DB
Name: proizvodjac - Type: 1 - Value: Alcatel
Name: cena - Type: 1 - Value: 25
Name: model - Type: 1 - Value: 3310
…
As you can see, there are no more #text nodes, but only type 1 nodes, which means XML_ELEMENT_NODE, i.e. DOMElement.
DOMElements contain DOMText nodes
In the beginning I said that the values of DOMElements are DOMText instances too. But in the output above, they are nowhere to be seen. That's because we are accessing the nodeValue property, which returns the value of the DOMText as a string. We can easily prove that the value is a DOMText, though:
$dom = new DOMDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->loadXML($xml); // $xml is assumed to hold the XML document as a string
foreach ($dom->getElementsByTagName('telefon') as $telefon) {
    $node = $telefon->firstChild->firstChild; // 1st child of model
    printf(
        "Name: %s - Type: %s - Value: %s\n",
        $node->nodeName,
        $node->nodeType,
        urlencode($node->nodeValue)
    );
}
will output
Name: #text - Type: 3 - Value: Easy+DB
Name: #text - Type: 3 - Value: 3310
Name: #text - Type: 3 - Value: GF768
Name: #text - Type: 3 - Value: Skeleton
Name: #text - Type: 3 - Value: Earl
And this proves that a DOMElement contains its value as a DOMText and that nodeValue is just returning the content of the DOMText directly.
More on nodeValue
In fact, nodeValue is smart enough to concatenate the contents of any DOMText children, including those nested inside child elements:
$dom = new DOMDocument;
$dom->loadXML('<root><p>Hello <em>World</em>!!!</p></root>');
$node = $dom->documentElement->firstChild; // p
printf(
    "Name: %s - Type: %s - Value: %s\n",
    $node->nodeName,
    $node->nodeType,
    $node->nodeValue
);
will output
Name: p - Type: 1 - Value: Hello World!!!
although these are really
DOMText "Hello"
DOMElement em with DOMText "World"
DOMText "!!!"
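To see that breakdown for yourself, you can iterate over the child nodes of the p element and print each one separately. This is just a small sketch reusing the same XML string; the $p and $parts variable names are chosen for illustration only.

```php
<?php
$dom = new DOMDocument;
$dom->loadXML('<root><p>Hello <em>World</em>!!!</p></root>');
$p = $dom->documentElement->firstChild; // p

$parts = [];
foreach ($p->childNodes as $child) {
    // Quote each value so the trailing space after "Hello" stays visible.
    $parts[] = sprintf(
        '%s (type %d): "%s"',
        $child->nodeName,
        $child->nodeType,
        $child->nodeValue
    );
}
echo implode("\n", $parts), "\n";
// #text (type 3): "Hello "
// em (type 1): "World"
// #text (type 3): "!!!"
```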
Printing content of a XML file using XML DOM
To finally answer your question, look at the first test code. Everything you need is in there. And of course by now you have been given fine other answers too.
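Putting the pieces together, a possible sketch for the requested "name: value" output could look like the following. It rests on a few assumptions: the inline $xml sample is a shortened stand-in for the file from the question, formatPhones() is a function name invented for this example, and the blank line between entries is a formatting choice rather than something the question specifies.

```php
<?php
// Hypothetical sample with two of the entries; the question uses a file.
$xml = <<<XML
<cellphones>
  <telefon>
    <model>Easy DB</model>
    <proizvodjac>Alcatel</proizvodjac>
    <cena>25</cena>
  </telefon>
  <telefon>
    <model>3310</model>
    <proizvodjac>Nokia</proizvodjac>
    <cena>30</cena>
  </telefon>
</cellphones>
XML;

// formatPhones() is an illustrative helper, not part of the DOM extension.
function formatPhones(DOMDocument $dom): string
{
    $blocks = [];
    foreach ($dom->getElementsByTagName('telefon') as $telefon) {
        $lines = [];
        foreach ($telefon->childNodes as $node) {
            // Extra guard: skip anything that is not an element node.
            if ($node->nodeType !== XML_ELEMENT_NODE) {
                continue;
            }
            $lines[] = $node->nodeName . ': ' . $node->nodeValue;
        }
        $blocks[] = implode("\n", $lines);
    }
    // Separate the entries with a blank line.
    return implode("\n\n", $blocks);
}

$dom = new DOMDocument;
$dom->preserveWhiteSpace = FALSE; // drop the formatting-only text nodes
$dom->loadXML($xml);
echo formatPhones($dom), "\n";
```

For HTML output, as in the question's own code, the "\n" separators could be swapped for "<br />". To read from the file instead of a string, call load() with the file name in place of loadXML().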