How to "refresh" DOMDocument instances of LibXML2?

Question

Using PHP to illustrate: either there is a bug in the normalizeDocument() method, or a "refresh" method is missing, because DOM consistency is lost after changes (even attribute-only changes). As a result, any algorithm "with DOM changes" implemented on LibXML2 sometimes works and sometimes does not, which makes it unpredictable (?).

Refreshing with $doc->LoadXML($doc->saveXML()); is a workaround, and it costs performance in a DOM-based workflow. A sub-question: do I need to refresh the DOM after every change?

  $XML = '
  <html>
    <h1>Hello</h1>
    <ol>
        <li>test (no id)</li>
        <li xml:id="i2">test i2</li>
    </ol>
  </html>
  ';
  $doc = new DOMDocument;
  $doc->LoadXML($XML);
  doSomeChange($doc);    // here DOM is modified
  print $doc->saveXML(); // show new DOM state

  $doc->normalizeDocument(); // NOT REFRESHING!?!
  var_dump($doc->getElementById('i2'));  //NULL!??! is a BUG!
  //CAN_NOT_doMORESomeChange($doc);

  $doc->LoadXML($doc->saveXML());        // only way to refresh?
  print $doc->getElementById('i2')->tagName;  //OK, is there

  // illustrating attribute modification:
  function doSomeChange(&$dom) {
    $max = 0;
    $xp  = new DOMXpath($dom);
    foreach(iterator_to_array($xp->query('/html/* | //li')) as $e) {
        $max++;
        $e->setAttribute('xml:id',"i$max");
    }
    print "\ncmpDOM='".($xp->document === $dom)."'\n"; // after @ThomasWeinert
  }

So, the input is $XML and the output is

  <html>
            <h1 xml:id="i1">Hello</h1>
            <ol xml:id="i2">
                <li xml:id="i3">test (no id)</li>
                <li xml:id="i4">test i2</li>
            </ol>
        </html>
  NULL
  ol

The NULL is the bug (see the code comments).

PS: if I change the input line <li xml:id="i2">test i2</li> to <li>test i2</li>, the algorithm works as expected (!), so the behavior is unpredictable.

Related questions: Is reusing a DOMXpath with a DomDocument stable? PHP DomDocument, reusing an XSLTProcessor: is it stable/secure?

Recommended answer

Changes are applied to the DOM the moment you make them. In your example this creates a state in which two elements have the same xml:id (the ol receives i2 while the second li still carries it), and this seems to corrupt the ID index. Remove the xml:id attributes before setting the new ones and it works:

  $XML = '
  <html>
    <h1>Hello</h1>
    <ol>
        <li>test (no id)</li>
        <li xml:id="i2">test i2</li>
    </ol>
  </html>
  ';
  $doc = new DOMDocument;
  $doc->loadXML($XML);
  var_dump($doc->getElementById('i2'), $doc->getElementById('i2')->tagName);
  /*
    object(DOMElement)#2 (0) { }
    string(2) "li"
  */

  doSomeChange($doc);    // here DOM is modified

  var_dump($doc->getElementById('i2'), $doc->getElementById('i2')->tagName);
  /*
    object(DOMElement)#6 (0) { }
    string(2) "ol"
  */

  print $doc->saveXML(); // show new DOM state
  /*
  <?xml version="1.0"?>
  <html>
    <h1 xml:id="i1">Hello</h1>
    <ol xml:id="i2">
      <li xml:id="i3">test (no id)</li>
      <li xml:id="i4">test i2</li>
    </ol>
  </html>
  */

  // illustrating xml:id attribute modification:
  function doSomeChange($dom) {
    $xp  = new DOMXpath($dom);
    foreach($xp->evaluate('//*') as $e) {
      $e->removeAttribute('xml:id');
    }
    $max = 0;
    foreach($xp->evaluate('/html/*|//li') as $e) {
      $max++;
      $e->setAttribute('xml:id',"i$max");
    }
  }

Your specific DOM modification is what breaks the getElementById() calls.
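If clearing the old IDs first is not practical, one possible alternative is to look the element up with XPath, which walks the tree as it currently stands rather than consulting the ID index. The sketch below is only an illustration under that assumption: findByXmlId() is a hypothetical helper name, not part of the DOM extension or of the answer above, and the renumbering loop is copied from the question without the removeAttribute() step.

```php
<?php
// Hypothetical helper (not part of the original answer): find an element by
// its xml:id by scanning the tree with XPath instead of using the ID index.
function findByXmlId(DOMDocument $doc, string $id): ?DOMElement
{
    $xpath = new DOMXPath($doc);
    // Select every element that carries an xml:id attribute, in document order.
    foreach ($xpath->query('//*[@xml:id]') as $node) {
        // Compare in PHP so $id never has to be quoted inside the XPath string.
        if ($node->getAttribute('xml:id') === $id) {
            return $node;
        }
    }
    return null;
}

$doc = new DOMDocument();
$doc->loadXML(
    '<html><h1>Hello</h1><ol><li>test (no id)</li>'
    . '<li xml:id="i2">test i2</li></ol></html>'
);

// Same renumbering as in the question, without removing the old xml:id first.
$xp  = new DOMXPath($doc);
$max = 0;
foreach (iterator_to_array($xp->query('/html/* | //li')) as $e) {
    $max++;
    $e->setAttribute('xml:id', "i$max");
}

// Look up i2 from the current tree state.
$found = findByXmlId($doc, 'i2');
echo $found === null ? 'not found' : $found->tagName, "\n";
```

This trades the constant-time index lookup for a scan of the document, so it is probably best kept for cases where IDs are being rewritten in place.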

On the "stability" question: the connection between DOMXpath and DOMDocument is not completely "stable". If you call one of the load*() methods on the DOMDocument, the connection is lost. You can verify that the DOMXpath uses the correct DOMDocument by comparing its document property:

  var_dump($xpath->document === $doc);

This does not happen in your case, because you always create a new DOMXpath instance inside the function. But it means you should avoid reloading the document, because doing so breaks any xpath instances already created for it.
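Where a reload cannot be avoided, the document comparison described above could be wrapped in a small guard that hands back a usable DOMXpath. This is a minimal sketch, assuming a hypothetical helper named xpathFor(); it reuses the given instance only while its document property is still the same DOMDocument, and otherwise builds a new one.

```php
<?php
// Hypothetical helper (an assumption for illustration): return a DOMXPath
// that is bound to $doc, reusing $xpath only if it still points at $doc.
function xpathFor(DOMDocument $doc, ?DOMXPath $xpath = null): DOMXPath
{
    if ($xpath !== null && $xpath->document === $doc) {
        return $xpath;          // still connected, safe to reuse
    }
    return new DOMXPath($doc);  // missing or stale, create a fresh instance
}

$doc = new DOMDocument();
$doc->loadXML('<html><h1>Hello</h1></html>');

$xpath = xpathFor($doc);

// The "refresh" workaround from the question: reload the serialized document.
$doc->loadXML($doc->saveXML());

// Re-validate before querying; a stale instance is replaced here.
$xpath = xpathFor($doc, $xpath);
echo $xpath->evaluate('string(/html/h1)'), "\n";
```

The guard only protects code that asks for the xpath again after a reload; any DOMXpath reference kept elsewhere remains bound to whatever document it was created for.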
