XML end element is read twice using XMLReader with PHP

Question

I want to read an XML file using XMLReader, but the END_ELEMENT case runs twice for each element during parsing.

<publications>
  <article id="Xu86oazdn">
    <title>Learning</title>
    <authors>
      <author>
        <firstname>Michel</firstname>
        <lastname>Browsky</lastname>
      </author>
    </authors>
  </article>
</publications>

This is the piece of code that parses the author entries:

<?php
$xml = new XMLReader();
$xml->open("php://stdin");
$author = null;

while($xml->read()) {

  switch($xml->nodeType) {
    case XMLReader::ELEMENT:
      switch($xml->name) {
        case 'author':
          echo("+" . $xml->name);
          break;
    }

    case XMLReader::END_ELEMENT:
      switch($xml->name) {
        case 'author':
          echo("-" . $xml->name);
          break;
      }
    }
  }
?>

But strangely, the END_ELEMENT case runs twice for each </author>, as shown by the echo messages:

+author
-author
-author

If I replace the echo message with a call to $xml->readOuterXML(), the first END_ELEMENT gives the following:

<author>
  <firstname>Michel</firstname>
  <lastname>Browsky</lastname>
</author>

The second gives the following:

<author/>

What is wrong with my code? Did I use END_ELEMENT in the wrong way? What is the right way to detect the end element?

Recommended answer

Add a break statement at the end of the first case of the switch on nodeType:

<?php
$xml = new XMLReader();
$xml->open("php://stdin");

while ($xml->read()) {
  switch ($xml->nodeType) {
    case XMLReader::ELEMENT:
      switch ($xml->name) {
        case 'author':
          echo("+" . $xml->name);
          break;
      }

      // THIS LINE IS MISSING
      break;

    case XMLReader::END_ELEMENT:
      switch ($xml->name) {
        case 'author':
          echo("-" . $xml->name);
          break;
      }
  }
}
?>
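
To see why the missing break doubles the output, here is a minimal sketch that does not involve XMLReader at all. The function name withoutBreak and the 'start'/'end' labels are made up for this illustration; they stand in for the ELEMENT and END_ELEMENT cases.

```php
<?php
// Hypothetical helper, not part of the original code: it collects the markers
// produced by a switch whose first case has no break.
function withoutBreak(string $type): string {
  $out = '';
  switch ($type) {
    case 'start':
      $out .= '+';
      // No break here, so execution continues into the next case.
    case 'end':
      $out .= '-';
      break;
  }
  return $out;
}

echo withoutBreak('start'), "\n"; // prints "+-"
echo withoutBreak('end'), "\n";   // prints "-"
```

A start tag therefore produces both markers and the end tag produces one more, which matches the +author, -author, -author output in the question.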

Add another break at the end of the END_ELEMENT case as well, if only for symmetry.

    case XMLReader::END_ELEMENT:
      switch($xml->name) {
        case 'author':
          echo("-" . $xml->name);
          break;
      }

      break;
  }

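The problem arose from the coding style: with nested switch statements, the inner break only leaves the inner switch, so it is easy to miss that the outer ELEMENT case has no break of its own and falls through into END_ELEMENT. Simplify the code. For example: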

$xml = new XMLReader();
$xml->open("php://stdin");

while($xml->read()) {    
  switch($xml->nodeType) {
    case XMLReader::ELEMENT: {
      startElement( $xml->name );
      break;
    }

    case XMLReader::END_ELEMENT: {
      endElement( $xml->name );
      break;
    }
  }
}
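
The answer does not define startElement() or endElement(), so the following is only a sketch of how the simplified loop could be run end to end. The handler bodies, the trace() wrapper, and the inline XML string are assumptions added for illustration; the loop itself has the same shape as above.

```php
<?php
// Hypothetical handlers: the answer names startElement() and endElement()
// but does not define them, so these bodies are assumptions.
function startElement(string $name): string {
  return $name === 'author' ? "+$name\n" : '';
}

function endElement(string $name): string {
  return $name === 'author' ? "-$name\n" : '';
}

// Hypothetical wrapper: walks the document and collects what the handlers return.
function trace(string $source): string {
  $xml = new XMLReader();
  $xml->XML($source); // parse from a string instead of php://stdin
  $out = '';
  while ($xml->read()) {
    switch ($xml->nodeType) {
      case XMLReader::ELEMENT:
        $out .= startElement($xml->name);
        break;
      case XMLReader::END_ELEMENT:
        $out .= endElement($xml->name);
        break;
    }
  }
  $xml->close();
  return $out;
}

// Prints "+author" and then "-author", each exactly once.
echo trace('<authors><author><firstname>Michel</firstname></author></authors>');
```

Because each case holds one call and one break, a missing break is much easier to spot than in the nested version.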

You can simplify further. PHP has an XML marshalling package, but you could also abstract the code into classes. Instances of those classes would then be able to read themselves from, or write themselves to, an XML file. For example:

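$xml = new XMLReader();
$xml->open("php://stdin");

while($xml->read()) {
  // Match only the start tag; the name alone also matches </author>.
  if( $xml->nodeType == XMLReader::ELEMENT && $xml->name == 'author' ) {
    $author = new Author();
    $author->marshall( $xml );
  }
}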

This couples the details of how the object is stored with the object itself. Any time you change the Author object, you know you must also change how it marshalls itself. You could abstract and extend these concepts even further using appropriate design patterns, XML schemas, and so forth.
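
The answer leaves the Author class undefined, so here is one way it might look. This is a sketch under assumptions: the class name and the marshall() method come from the answer, while the two properties and the parsing logic are invented for illustration and presume the reader is positioned on an <author> start tag.

```php
<?php
// Hypothetical Author class: only the class name and marshall() appear in the
// answer; the fields and the parsing logic below are assumptions.
class Author {
  public $firstname = '';
  public $lastname = '';

  // Expects $xml to be positioned on an <author> start tag.
  public function marshall(XMLReader $xml) {
    if ($xml->isEmptyElement) {
      return; // <author/> has no children and no end tag to wait for
    }
    while ($xml->read()) {
      if ($xml->nodeType == XMLReader::END_ELEMENT && $xml->name == 'author') {
        return; // reached </author>, this object is complete
      }
      if ($xml->nodeType == XMLReader::ELEMENT) {
        switch ($xml->name) {
          case 'firstname':
            $this->firstname = $xml->readString();
            break;
          case 'lastname':
            $this->lastname = $xml->readString();
            break;
        }
      }
    }
  }
}

$xml = new XMLReader();
$xml->XML('<authors><author><firstname>Michel</firstname><lastname>Browsky</lastname></author></authors>');

while ($xml->read()) {
  if ($xml->nodeType == XMLReader::ELEMENT && $xml->name == 'author') {
    $author = new Author();
    $author->marshall($xml);
    echo $author->firstname . ' ' . $author->lastname . "\n"; // prints "Michel Browsky"
  }
}
```

The object consumes its own end tag, so the calling loop never has to handle END_ELEMENT for author at all.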

Thus your final code might resemble:

$xml = new XMLReader();
$xml->open( "php://stdin" );
$publications = new Publications();
$publications->marshall( $xml );

The Publications object is responsible for reading the XML document and instantiating the appropriate classes whenever their associated XML tags appear:

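while($xml->read()) {
  // Create an Article only when an <article> start tag appears.
  if( $xml->nodeType == XMLReader::ELEMENT && $xml->name == 'article' ) {
    $article = new Article();
    $article->marshall( $xml );
    $this->add( $article );
  }
}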

Use a PHP marshalling framework to save yourself time and effort. Consider XML_Serializer.
