Printing content of a XML file using XML DOM


Problem description

I have a simple XML document:

<?xml version="1.0"?>
<cellphones>
  <telefon>
    <model>Easy DB</model>
    <proizvodjac>Alcatel</proizvodjac>
    <cena>25</cena>
  </telefon>
  <telefon>
    <model>3310</model>
    <proizvodjac>Nokia</proizvodjac>
    <cena>30</cena>
  </telefon>
  <telefon>
    <model>GF768</model>
    <proizvodjac>Ericsson</proizvodjac>
    <cena>15</cena>
  </telefon>
  <telefon>
    <model>Skeleton</model>
    <proizvodjac>Panasonic</proizvodjac>
    <cena>45</cena>
  </telefon>
  <telefon>
    <model>Earl</model>
    <proizvodjac>Sharp</proizvodjac>
    <cena>60</cena>
  </telefon>
</cellphones>

I need to print the content of this file using XML DOM, and it needs to be structured like this:

"model: Easy DB
proizvodjac: Alcatel
cena: 25"

for each node inside the XML.

IT HAS TO BE DONE using XML DOM. That's the problem. I can do it the usual, simple way. But this one bothers me because I can't seem to find any solution on the internet.

This is as far as I can go, but I need to access inside nodes (child nodes) and to get node values. I also want to get rid of some weird string "#text" that comes up out of the blue.

<?php
    // create the DOMDocument object
    $xmlDoc = new DOMDocument();

    // load the XML file into the document object
    $xmlDoc->load("poruke.xml");

    // assign the root element to a variable
    $x = $xmlDoc->documentElement;

    // loop over the child nodes and print information about each one
    foreach ($x->childNodes AS $item){
        print $item->nodeName . " = " . $item->nodeValue . "<br />";
    }
?>

Thanks

Solution

Explanation for weird #text strings

The weird #text strings don't come out of the blue but are actual Text Nodes. When you load a formatted XML document with DOM, any whitespace (indenting and linebreaks) as well as the node values will be part of the DOM as DOMText instances by default, e.g.

<cellphones>\n\t<telefon>\n\t\t<model>Easy DB…
E           T   E        T     E      T      

where E is a DOMElement and T is a DOMText.

To get around that, load the document like this:

$dom = new DOMDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->load('file.xml');

Then your document will be structured as follows

<cellphones><telefon><model>Easy DB…
E           E        E      T

Note that individual nodes representing the value of a DOMElement will still be DOMText instances, but the nodes that control the formatting are gone. More on that later.

Proof

You can test this easily with this code:

$dom = new DOMDocument;
$dom->preserveWhiteSpace = TRUE; // change to FALSE to see the difference
$dom->load('file.xml');
foreach ($dom->getElementsByTagName('telefon') as $telefon) {
    foreach($telefon->childNodes as $node) {
        printf(
            "Name: %s - Type: %s - Value: %s\n",
            $node->nodeName,
            $node->nodeType,
            urlencode($node->nodeValue)
        );
    }
}

This code runs through all the telefon elements in your given XML and prints out the node name, type and urlencoded node value of their child nodes. When you preserve the whitespace, you will get something like

Name: #text - Type: 3 - Value: %0A++++
Name: model - Type: 1 - Value: Easy+DB
Name: #text - Type: 3 - Value: %0A++++
Name: proizvodjac - Type: 1 - Value: Alcatel
Name: #text - Type: 3 - Value: %0A++++
Name: cena - Type: 1 - Value: 25
Name: #text - Type: 3 - Value: %0A++
…

The reason I urlencoded the value is to show that there are in fact DOMText nodes containing the indenting and the linebreaks in your DOMDocument. %0A is a linebreak, while each + is a space.

When you compare this with your XML, you will see there is a line break after each opening <telefon> tag, followed by four spaces until the <model> element starts. Likewise, there is only a newline and two spaces between the closing </cena> and the closing </telefon>.

The given type for these nodes is 3, which, according to the list of predefined constants, is XML_TEXT_NODE, i.e. a DOMText node. Lacking a proper element name, these nodes have a name of #text.
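If you would rather leave preserveWhiteSpace at its default, the same type information can be used to skip the text nodes while looping. The following is only a minimal sketch of that idea; the inline XML string and the $names variable are made up for illustration and are not part of the question's code.

```php
<?php
// Sketch: keep whitespace in the DOM, but skip text nodes by their type.
$dom = new DOMDocument;
$dom->loadXML("<telefon>\n  <model>Easy DB</model>\n  <cena>25</cena>\n</telefon>");

$names = [];
foreach ($dom->documentElement->childNodes as $node) {
    if ($node->nodeType === XML_TEXT_NODE) {
        continue; // a "#text" node holding only indentation and linebreaks
    }
    $names[] = $node->nodeName;
}

echo implode(', ', $names), "\n"; // model, cena
```

Checking for XML_ELEMENT_NODE instead would also skip comments and other non-element nodes.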

Disregarding Whitespace

Now, when you disable preservation of whitespace, the above will output:

Name: model - Type: 1 - Value: Easy+DB
Name: proizvodjac - Type: 1 - Value: Alcatel
Name: cena - Type: 1 - Value: 25
Name: model - Type: 1 - Value: 3310
…

As you can see, there are no more #text nodes, only type 1 nodes, which means XML_ELEMENT_NODE, i.e. DOMElement.

DOMElements contain DOMText nodes

In the beginning I said that the values of DOMElements are DOMText instances too. But in the output above, they are nowhere to be seen. That's because we are accessing the nodeValue property, which returns the value of the DOMText as a string. We can easily prove that the value is a DOMText, though:

$dom = new DOMDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->loadXML($xml); // $xml holds the XML document from the question as a string
foreach ($dom->getElementsByTagName('telefon') as $telefon) {
    $node = $telefon->firstChild->firstChild; // 1st child of model
    printf(
        "Name: %s - Type: %s - Value: %s\n",
        $node->nodeName,
        $node->nodeType,
        urlencode($node->nodeValue)
    );
}

will output

Name: #text - Type: 3 - Value: Easy+DB
Name: #text - Type: 3 - Value: 3310
Name: #text - Type: 3 - Value: GF768
Name: #text - Type: 3 - Value: Skeleton
Name: #text - Type: 3 - Value: Earl

And this proves a DOMElement contains its value as a DOMText and nodeValue is just returning the content of the DOMText directly.

More on nodeValue

In fact, nodeValue is smart enough to concatenate the contents of any DOMText children:

$dom = new DOMDocument;
$dom->loadXML('<root><p>Hello <em>World</em>!!!</p></root>');
$node = $dom->documentElement->firstChild; // p
printf(
    "Name: %s - Type: %s - Value: %s\n",
    $node->nodeName,
    $node->nodeType,
    $node->nodeValue
);

will output

Name: p - Type: 1 - Value: Hello World!!!

although these are really the combined values of

DOMText "Hello "
DOMElement em with DOMText "World"
DOMText "!!!"
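To see those three pieces for yourself, you can walk the child nodes of the p element instead of reading its nodeValue. This is just an illustrative sketch; the $parts array and its output format are my own choice.

```php
<?php
// Sketch: list the direct children of <p> to show what nodeValue concatenates.
$dom = new DOMDocument;
$dom->loadXML('<root><p>Hello <em>World</em>!!!</p></root>');
$p = $dom->documentElement->firstChild; // p

$parts = [];
foreach ($p->childNodes as $child) {
    // name, numeric node type, and the child's own nodeValue
    $parts[] = sprintf('%s (%d): "%s"', $child->nodeName, $child->nodeType, $child->nodeValue);
}

echo implode("\n", $parts), "\n";
// #text (3): "Hello "
// em (1): "World"
// #text (3): "!!!"
```

Note that the em child is itself an element whose nodeValue comes from its own DOMText child.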

Printing content of a XML file using XML DOM

To finally answer your question, look at the first test code. Everything you need is in there. And of course by now you have been given fine other answers too.
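For completeness, here is one way the first test code could be adapted to produce the "name: value" layout asked for in the question. Treat it as a sketch under a few assumptions: formatPhones() is a hypothetical helper name, the XML is passed in as a string (shortened here to two entries), and each phone is separated by a blank line.

```php
<?php
// Sketch: print "name: value" for every child element of each <telefon>.
// formatPhones() is a hypothetical helper, not part of the DOM extension.
function formatPhones(string $xml): string
{
    $dom = new DOMDocument;
    $dom->preserveWhiteSpace = false; // drop whitespace-only text nodes on load
    $dom->loadXML($xml);

    $blocks = [];
    foreach ($dom->getElementsByTagName('telefon') as $telefon) {
        $lines = [];
        foreach ($telefon->childNodes as $node) {
            // Only elements carry a name/value pair worth printing.
            if ($node->nodeType === XML_ELEMENT_NODE) {
                $lines[] = $node->nodeName . ': ' . $node->nodeValue;
            }
        }
        $blocks[] = implode("\n", $lines);
    }
    return implode("\n\n", $blocks);
}

$xml = <<<XML
<?xml version="1.0"?>
<cellphones>
  <telefon>
    <model>Easy DB</model>
    <proizvodjac>Alcatel</proizvodjac>
    <cena>25</cena>
  </telefon>
  <telefon>
    <model>3310</model>
    <proizvodjac>Nokia</proizvodjac>
    <cena>30</cena>
  </telefon>
</cellphones>
XML;

echo formatPhones($xml), "\n";
```

To read from a file instead, replace loadXML($xml) with load('file.xml'). For output in a browser, swap the "\n" separators for "<br />" as in the question's code.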
