Printing the content of an XML file using XML DOM


Question

I have a simple XML document:

<?xml version="1.0"?>
<cellphones>
  <telefon>
    <model>Easy DB</model>
    <proizvodjac>Alcatel</proizvodjac>
    <cena>25</cena>
  </telefon>
  <telefon>
    <model>3310</model>
    <proizvodjac>Nokia</proizvodjac>
    <cena>30</cena>
  </telefon>
  <telefon>
    <model>GF768</model>
    <proizvodjac>Ericsson</proizvodjac>
    <cena>15</cena>
  </telefon>
  <telefon>
    <model>Skeleton</model>
    <proizvodjac>Panasonic</proizvodjac>
    <cena>45</cena>
  </telefon>
  <telefon>
    <model>Earl</model>
    <proizvodjac>Sharp</proizvodjac>
    <cena>60</cena>
  </telefon>
</cellphones>

I need to print the content of this file using XML DOM, and the output needs to be structured like this:

"model: Easy DB
proizvodjac: Alcatel
cena: 25"

for each node inside the XML.

It has to be done using XML DOM. That's the problem. I can do it the usual, simple way, but this one bothers me because I can't seem to find any solution on the internet.

This is as far as I can get, but I need to access the inner nodes (child nodes) and get their values. I also want to get rid of a weird string "#text" that comes up out of the blue.

<?php
    // create the DOMDocument object
    $xmlDoc = new DOMDocument();

    // load the XML file into the document object
    $xmlDoc->load("poruke.xml");

    // assign the root element to a variable
    $x = $xmlDoc->documentElement;

    // loop over the child nodes and print information about each one
    foreach ($x->childNodes AS $item){
        print $item->nodeName . " = " . $item->nodeValue . "<br />";
    }
?>

Thanks

Recommended answer

Explanation for the weird #text strings

The weird #text strings don't come out of the blue; they are actual text nodes. When you load a formatted XML document with DOM, any whitespace (e.g. indentation and line breaks) as well as the node values will by default be part of the DOM as DOMText instances, e.g.

<cellphones>
	<telefon>
		<model>Easy DB…
E           T   E        T     E      T      

where E is a DOMElement and T is a DOMText.

To get around that, load the document like this:

$dom = new DOMDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->load('file.xml');

Then your document will be structured as follows:

<cellphones><telefon><model>Easy DB…
E           E        E      T

Note that the individual nodes representing the value of a DOMElement will still be DOMText instances, but the nodes that only hold formatting whitespace are gone. More on that later.

Proof

You can test this easily with this code:

$dom = new DOMDocument;
$dom->preserveWhiteSpace = TRUE; // change to FALSE to see the difference
$dom->load('file.xml');
foreach ($dom->getElementsByTagName('telefon') as $telefon) {
    foreach($telefon->childNodes as $node) {
        printf(
            "Name: %s - Type: %s - Value: %s\n",
            $node->nodeName,
            $node->nodeType,
            urlencode($node->nodeValue)
        );
    }
}

This code runs through all the telefon elements in the given XML and prints the node name, the node type and the urlencoded node value of each of their child nodes. When you preserve whitespace, you will get something like:

Name: #text - Type: 3 - Value: %0A++++
Name: model - Type: 1 - Value: Easy+DB
Name: #text - Type: 3 - Value: %0A++++
Name: proizvodjac - Type: 1 - Value: Alcatel
Name: #text - Type: 3 - Value: %0A++++
Name: cena - Type: 1 - Value: 25
Name: #text - Type: 3 - Value: %0A++
…

The reason I urlencoded the value is to show that there are in fact DOMText nodes containing the indentation and the line breaks in your DOMDocument. %0A is a line feed, while each + is a space.

When you compare this with your XML, you will see that there is a line break after each opening <telefon> tag, followed by four spaces until the <model> element starts. Likewise, there is only a line break and two spaces between the closing </cena> tag and the closing </telefon> tag.

The type given for these nodes is 3, which, according to the list of predefined constants, is XML_TEXT_NODE, i.e. a DOMText node. Lacking a proper element name, these nodes have the name #text.

Disregarding whitespace

Now, when you disable preservation of whitespace, the code above will output:

Name: model - Type: 1 - Value: Easy+DB
Name: proizvodjac - Type: 1 - Value: Alcatel
Name: cena - Type: 1 - Value: 25
Name: model - Type: 1 - Value: 3310
…

As you can see, there are no more #text nodes, only type 1 nodes, which means XML_ELEMENT_NODE, i.e. DOMElement.
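
If changing how the document is loaded is not an option, the node type can serve as a filter instead. The following is a minimal sketch and not part of the original answer: it keeps whitespace preservation on and skips every child that is not an XML_ELEMENT_NODE. The inline $xml sample and the $lines variable are assumptions made for illustration.

```php
<?php
// Hypothetical inline sample; the answer loads the XML from a file instead.
$xml = <<<XML
<cellphones>
  <telefon>
    <model>Easy DB</model>
    <proizvodjac>Alcatel</proizvodjac>
    <cena>25</cena>
  </telefon>
</cellphones>
XML;

$dom = new DOMDocument;
$dom->preserveWhiteSpace = true; // keep the whitespace-only text nodes on purpose
$dom->loadXML($xml);

$lines = [];
foreach ($dom->getElementsByTagName('telefon') as $telefon) {
    foreach ($telefon->childNodes as $node) {
        // Skip anything that is not an element (type 1), such as #text nodes (type 3).
        if ($node->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }
        $lines[] = $node->nodeName . ': ' . $node->nodeValue;
    }
}

echo implode("\n", $lines), "\n";
```

This trades a line of loader configuration for an explicit check inside the loop, which may be preferable when the whitespace has to stay in the document for other reasons.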

DOMElements contain DOMText nodes

In the beginning I said that the values of DOMElements are DOMText instances too. But in the output above, they are nowhere to be seen. That's because we are accessing the nodeValue property, which returns the value of the DOMText as a string. We can easily prove that the value is a DOMText, though:

$dom = new DOMDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->loadXML($xml); // $xml holds the XML document from the question as a string
foreach ($dom->getElementsByTagName('telefon') as $telefon) {
    $node = $telefon->firstChild->firstChild; // 1st child of model
    printf(
        "Name: %s - Type: %s - Value: %s\n",
        $node->nodeName,
        $node->nodeType,
        urlencode($node->nodeValue)
    );
}

will output

Name: #text - Type: 3 - Value: Easy+DB
Name: #text - Type: 3 - Value: 3310
Name: #text - Type: 3 - Value: GF768
Name: #text - Type: 3 - Value: Skeleton
Name: #text - Type: 3 - Value: Earl

And this proves that a DOMElement contains its value as a DOMText and that nodeValue just returns the content of the DOMText directly.

More on nodeValue

In fact, nodeValue is smart enough to concatenate the contents of all DOMText nodes below an element, including those nested in child elements:

$dom = new DOMDocument;
$dom->loadXML('<root><p>Hello <em>World</em>!!!</p></root>');
$node = $dom->documentElement->firstChild; // p
printf(
    "Name: %s - Type: %s - Value: %s\n",
    $node->nodeName,
    $node->nodeType,
    $node->nodeValue
);

will output

Name: p - Type: 1 - Value: Hello World!!!

even though the content of p really consists of three separate nodes:

DOMText "Hello"
DOMElement em with DOMText "World"
DOMText "!!!"
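
To see those three children individually, one can iterate over the childNodes of p instead of reading its nodeValue. This is a small sketch for illustration; the $p and $children variable names are assumptions and do not appear in the original answer.

```php
<?php
$dom = new DOMDocument;
$dom->loadXML('<root><p>Hello <em>World</em>!!!</p></root>');
$p = $dom->documentElement->firstChild; // the <p> element

$children = [];
foreach ($p->childNodes as $child) {
    // Record name, numeric node type and value of each direct child of <p>.
    $children[] = sprintf(
        '%s (type %d): "%s"',
        $child->nodeName,
        $child->nodeType,
        $child->nodeValue
    );
}

echo implode("\n", $children), "\n";
```

The two text nodes are reported as #text with type 3, while em is an element of type 1 whose own nodeValue comes from its DOMText child.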

Printing the content of an XML file using XML DOM

To finally answer your question, look at the first test code. Everything you need is in there. And of course, by now you have been given other fine answers too.
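
Putting the pieces together, a minimal sketch of the requested output could look like the following. It is an adaptation of the first test code rather than the answer's own listing: the formatPhones helper name and the shortened inline sample are assumptions, and the real document would be read with load() from a file.

```php
<?php
// Hypothetical helper: returns "name: value" lines for every <telefon>,
// with an empty line between two phones.
function formatPhones(DOMDocument $dom): string
{
    $blocks = [];
    foreach ($dom->getElementsByTagName('telefon') as $telefon) {
        $lines = [];
        foreach ($telefon->childNodes as $node) {
            $lines[] = $node->nodeName . ': ' . $node->nodeValue;
        }
        $blocks[] = implode("\n", $lines);
    }
    return implode("\n\n", $blocks);
}

// Shortened sample of the document from the question.
$xml = <<<XML
<cellphones>
  <telefon>
    <model>Easy DB</model>
    <proizvodjac>Alcatel</proizvodjac>
    <cena>25</cena>
  </telefon>
  <telefon>
    <model>3310</model>
    <proizvodjac>Nokia</proizvodjac>
    <cena>30</cena>
  </telefon>
</cellphones>
XML;

$dom = new DOMDocument;
$dom->preserveWhiteSpace = false; // must be set before loading to drop whitespace-only text nodes
$dom->loadXML($xml);

echo formatPhones($dom), "\n";
```

When the output is meant for a browser, as in the question's code, the newline separators could be swapped for "<br />".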
