PHP XML Expat parser: how to read only part of the XML document?

Question

I have an XML document with the following structure:

<posts>
<user id="1222334">
  <post>
    <message>hello</message>
    <client>client</client>
    <time>time</time>
  </post>
  <post>
    <message>hello client how can I help?</message>
    <client>operator</client>
    <time>time</time>
  </post>
</user>
<user id="2333343">
  <post>
    <message>good morning</message>
    <client>client</client>
    <time>time</time>
  </post>
  <post>
    <message>good morning how can I help?</message>
    <client>operator</client>
    <time>time</time>
  </post>
</user>
</posts>

I am able to create the parser and print out the whole document. The problem is that I want to print only the (user) node with a specific attribute (id), together with its children.

My PHP code is:

if (!empty($_GET['id'])) {
    $id = $_GET['id'];
    $parser = xml_parser_create();

    function start($parser, $element_name, $element_attrs)
    {
        switch ($element_name) {
            case "USER":
                echo "-- User --<br>";
                break;
            case "CLIENT":
                echo "Name: ";
                break;
            case "MESSAGE":
                echo "Message: ";
                break;
            case "TIME":
                echo "Time: ";
                break;
            case "POST":
                echo "--Post<br> ";
        }
    }

    function stop($parser, $element_name) { echo "<br>"; }
    function char($parser, $data) { echo $data; }

    xml_set_element_handler($parser, "start", "stop");
    xml_set_character_data_handler($parser, "char");

    $file = "test.xml";
    $fp = fopen($file, "r");
    while ($data = fread($fp, filesize($file))) {
        xml_parse($parser, $data, feof($fp)) or
            die(sprintf("XML Error: %s at line %d",
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)));
    }
    xml_parser_free($parser);
}

Using the following condition in the start() function selects the right node, but it has no effect on the reading process:

    if(($element_name == "USER") && $element_attrs["ID"] && ($element_attrs["ID"] == "$id"))

Any help would be appreciated.

UPDATE: XMLReader works, but it stops working when I add an if statement:

foreach ($filteredUsers as $user) {
echo "<table border='1'>";
foreach ($user->getChildElements('post') as $index => $post) {

    if( $post->getChildElements('client') == "operator" ){
    printf("<tr><td class='blue'>%s</td><td class='grey'>%s</td></tr>", $post->getChildElements('message'), $post->getChildElements('time'));
    }else{
    printf("<tr><td class='green'>%s</td><td class='grey'>%s</td></tr>", $post->getChildElements('message'), $post->getChildElements('time'));

    }
}
echo "</table>";
}

Recommended answer

As suggested in a comment earlier, you can alternatively use XMLReader (see its page in the PHP manual).

The XMLReader extension is an XML Pull parser. The reader acts as a cursor going forward on the document stream and stopping at each node on the way.

It is a class (with the same name: XMLReader) which can open a file. You use read() to move to the next node, or next() to move on while skipping the current node's subtree. You then check whether the current position is at an element and whether that element has the name you are looking for. If so, you can process it, for example by reading the outer XML of the element with XMLReader::readOuterXml().
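As a rough sketch of the loop just described, using only the built-in XMLReader class: the helper name findUsersById() is made up for this illustration, and the XML is passed as a string to keep the example self-contained (for a file you would call open() instead of XML()).

```php
<?php
// Hypothetical helper: returns the outer XML of every <user> element whose
// id attribute is one of $ids (IDs are compared as strings).
function findUsersById(string $xml, array $ids): array
{
    $reader = new XMLReader();
    $reader->XML($xml);                 // for a file: $reader->open($path)

    $found = [];
    $more = $reader->read();            // move to the first node
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'user') {
            $id = $reader->getAttribute('id');
            if (in_array($id, $ids, true)) {
                $found[$id] = $reader->readOuterXml();
            }
            $more = $reader->next();    // skip the rest of this <user> subtree
        } else {
            $more = $reader->read();    // descend or advance to the next node
        }
    }
    $reader->close();

    return $found;
}
```

Calling next() after each `<user>` is what keeps non-matching users cheap: their children are never visited one by one.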

Compared with the callbacks in the Expat parser, this is a little burdensome. To gain more flexibility with XMLReader, I normally write my own iterators that work on the XMLReader object and provide the steps I need.

They allow iterating over the concrete elements directly with foreach. Here is an example:

require('xmlreader-iterators.php'); // https://gist.github.com/hakre/5147685

$xmlFile = '../data/posts.xml';

$ids = array(3, 8);

$reader = new XMLReader();
$reader->open($xmlFile);

/* @var $users XMLReaderNode[] - iterate over all <user> elements */
$users = new XMLElementIterator($reader, 'user');

/* @var $filteredUsers XMLReaderNode[] - iterate over elements with id="3" or id="8" */
$filteredUsers = new XMLAttributeFilter($users, 'id', $ids);

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    echo $user->readOuterXml(), "\n";
}

I have created an XML file that contains some more posts like the ones in your question, with the id attribute numbered from one upwards:

$xmlFile = '../data/posts.xml';

Then I created an array with the two ID values of the users I am interested in:

$ids = array(3, 8);

It will be used in the filter condition later. Then the XMLReader is created and opens the XML file:

$reader = new XMLReader();
$reader->open($xmlFile);

The next step creates an iterator over all <user> elements of that reader:

$users = new XMLElementIterator($reader, 'user');

These are then filtered by the id attribute values stored in the array earlier:

$filteredUsers = new XMLAttributeFilter($users, 'id', $ids);

With all conditions formulated, what remains is to iterate with foreach:

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    echo $user->readOuterXml(), "\n";
}

This outputs the XML of the users with the IDs 3 and 8:

---------------
User with ID 3:
<user id="3">
        <post>
            <message>message</message>
            <client>client</client>
            <time>time</time>
        </post>
    </user>
---------------
User with ID 8:
<user id="8">
        <post>
            <message>message 8.1</message>
            <client>client</client>
            <time>time</time>
        </post>
        <post>
            <message>message 8.2</message>
            <client>client</client>
            <time>time</time>
        </post>
        <post>
            <message>message 8.3</message>
            <client>client</client>
            <time>time</time>
        </post>
    </user>
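To give a feel for what such iterators do, here is a minimal sketch built on PHP generators. It is not the code from the gist: elements() and withAttribute() are hypothetical names, and the sketch assumes the caller does not move the reader while handling a yielded element.

```php
<?php
// Hypothetical stand-in for an element iterator: yields the reader each time
// it is positioned on an element named $name, then skips that element's subtree.
function elements(XMLReader $reader, string $name): Generator
{
    $more = $reader->read();
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $name) {
            yield $reader;
            $more = $reader->next();
        } else {
            $more = $reader->read();
        }
    }
}

// Hypothetical stand-in for an attribute filter: passes on only the elements
// whose attribute $attr has one of the given (string) values.
function withAttribute(iterable $nodes, string $attr, array $values): Generator
{
    foreach ($nodes as $reader) {
        if (in_array($reader->getAttribute($attr), $values, true)) {
            yield $reader;
        }
    }
}

$reader = new XMLReader();
$reader->XML('<posts><user id="3"/><user id="5"/><user id="8"/></posts>');

foreach (withAttribute(elements($reader, 'user'), 'id', ['3', '8']) as $user) {
    echo $user->getAttribute('id'), "\n"; // prints 3, then 8
}
```

Because both functions are lazy, the document is still read strictly forward and only once, which is the property that makes the approach suitable for large files.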

The XMLReaderNode, which is part of the XMLReader iterators, also provides a SimpleXMLElement in case you want to easily read values inside the <user> element.

The following example shows how to get the count of <post> elements inside the <user> element:

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    echo $user->readOuterXml(), "\n";
    echo "Number of posts: ", $user->asSimpleXML()->post->count(), "\n";
}

This would then display Number of posts: 1 for the user ID 3 and Number of posts: 3 for the user ID 8.
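Without the iterator library, roughly the same count can be obtained by handing the outer XML of each `<user>` to SimpleXML. This is only a sketch under that assumption; postCounts() is a made-up helper name, and the XML is passed as a string for brevity.

```php
<?php
// Hypothetical helper: maps each user id to its number of <post> children.
// Only one <user> subtree at a time is loaded into a SimpleXMLElement.
function postCounts(string $xml): array
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $counts = [];
    $more = $reader->read();
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'user') {
            $user = simplexml_load_string($reader->readOuterXml());
            $counts[(string) $user['id']] = $user->post->count();
            $more = $reader->next();   // continue after this <user>
        } else {
            $more = $reader->read();
        }
    }
    $reader->close();

    return $counts;
}
```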

However, if that outer XML is large, you do not want to load it in one go; instead, you want to continue iterating inside that element:

// rewind
$reader->open($xmlFile);

foreach ($filteredUsers as $user) {
    printf("---------------\nUser with ID %d:\n", $user->getAttribute('id'));
    $count = 0; // reset per user so that a user without posts reports 0
    foreach ($user->getChildElements('post') as $post) {
        printf(" * #%d: %s\n", ++$count, $post->getChildElements('message'));
    }
    echo "Number of posts: ", $count, "\n";
}

Which produces the following output:

---------------
User with ID 3:
 * #1: message 3
Number of posts: 1
---------------
User with ID 8:
 * #1: message 8.1
 * #2: message 8.2
 * #3: message 8.3
Number of posts: 3
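The same inner traversal can be approximated with plain XMLReader by watching the node depth, so that no subtree is ever loaded as a whole. A sketch, with hypothetical helper names readMessages() and messagesByUser():

```php
<?php
// Assumes $reader is positioned on a <user> start tag. Reads forward until
// that element is closed and collects the text of every <message> inside it.
function readMessages(XMLReader $reader): array
{
    $messages = [];
    $userDepth = $reader->depth;
    // Nodes inside <user> are deeper than <user> itself; the loop ends at
    // </user> (or at the next sibling if the element was empty).
    while ($reader->read() && $reader->depth > $userDepth) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'message') {
            $messages[] = $reader->readString();
        }
    }
    return $messages;
}

// Finds the first <user> with the given id and returns its messages.
function messagesByUser(string $xml, string $id): array
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $messages = [];
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT
            && $reader->name === 'user'
            && $reader->getAttribute('id') === $id) {
            $messages = readMessages($reader);
            break;
        }
    }
    $reader->close();

    return $messages;
}
```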

This example shows that, depending on how large the nested children are, you can either traverse further with the iterators available via getChildElements(), or use a common XML parser such as SimpleXML or even DOMDocument on a subset of the XML.
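For completeness, the Expat callbacks from the question can also be restricted to one user. The condition in start() had no effect there because nothing remembered its result; the handlers need shared state. The following is only a sketch of that idea: printUser() is a hypothetical function, it collects the output in a string instead of echoing, and it relies on Expat's default case folding (upper-case element and attribute names), as the question's code does.

```php
<?php
// Hypothetical helper: returns the labelled fields of the <user> whose id
// attribute equals $wantedId, ignoring everything outside that element.
function printUser(string $xml, string $wantedId): string
{
    $labels = ['MESSAGE' => 'Message: ', 'CLIENT' => 'Name: ', 'TIME' => 'Time: '];
    $inside = false;  // true while between the wanted <user ...> and its </user>
    $field  = null;   // name of the field currently being read, if any
    $out    = '';

    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$inside, &$field, &$out, $labels, $wantedId) {
            if ($name === 'USER') {
                $inside = isset($attrs['ID']) && $attrs['ID'] === $wantedId;
            } elseif ($inside && isset($labels[$name])) {
                $field = $name;
                $out  .= $labels[$name];
            }
        },
        function ($parser, $name) use (&$inside, &$field, &$out) {
            if ($name === $field) {
                $out  .= "\n";
                $field = null;
            }
            if ($name === 'USER') {
                $inside = false;
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$field, &$out) {
            if ($field !== null) {   // skip whitespace between elements
                $out .= $data;
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    // The parser is released automatically when it goes out of scope.

    return $out;
}
```

Note that Expat still parses the whole document; the flag only suppresses output, whereas XMLReader::next() can actually skip unwanted subtrees.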
