Why does this DOM-replaceNode function sometimes crash?


Problem description

The first function (below) works fine in a loop over many nodes of the same DOMDocument, but it sometimes crashes (no error message, but the server stops).

When we use the second function (replace_innerXML_secure) in the same node loop, it never crashes. Why? What is wrong with the first?

  • The first uses $e->nodeValue='' to delete all childNodes (is that OK?);
  • The second preserves one (arbitrary) childNode and uses removeChild to delete the others... a strange workaround to avoid full deletion when some tag was there.

The "equivalent" functions #1 and #2:

// 1. What is wrong with THIS function??
function replace_innerXML(DOMNode $e,$innerXML='') {
    if ($e && ($innerXML>'' || $e->nodeValue>'')) {
        $e->nodeValue='';   
        if ($innerXML>'') {
            $tmp = $this->dom->createDocumentFragment();
            $tmp->appendXML($innerXML);
            $e->appendChild( $tmp );
        }
        return true;
    }
    return false;
}

// 2. Here a workaround... slower but... NOT crashes (!), WHY??
function replace_innerXML_secure(DOMNode $e,$innerXML='') {
    if ($e) {
        $tmp = $e->ownerDocument->createDocumentFragment();
        $tmp->appendXML($innerXML);
        $once=null;         
        foreach(iterator_to_array($e->childNodes) as $e2)
            if (!$once && $e2->nodeType===1) $once=$e2;
            else $e->removeChild($e2);
        if ($once)
            $once->parentNode->replaceChild( $tmp, $once );
        else {
            $e->nodeValue='';
            $e->appendChild( $tmp );
        }
        return true;
    }
    return false;
}
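
For comparison, here is a minimal sketch of a third variant that clears the element with explicit removeChild() calls instead of assigning to nodeValue, and that parses the fragment before touching the node. The name replace_innerXML_explicit is hypothetical (it is not part of the original code), $e is assumed to be an element attached to a DOMDocument, and whether this avoids the crash on the affected PHP/libxml2 versions is untested.

```php
<?php
// Hypothetical helper, not from the original post: empties $e with
// removeChild() rather than nodeValue='', then appends the parsed fragment.
function replace_innerXML_explicit(DOMNode $e, $innerXML = '') {
    $frag = null;
    if ($innerXML !== '') {
        $frag = $e->ownerDocument->createDocumentFragment();
        // appendXML() returns false on malformed XML; leave $e untouched then.
        if (!@$frag->appendXML($innerXML)) {
            return false;
        }
    }
    // Remove every existing child, one at a time.
    while ($e->firstChild) {
        $e->removeChild($e->firstChild);
    }
    // Only append when the fragment really holds nodes.
    if ($frag !== null && $frag->hasChildNodes()) {
        $e->appendChild($frag);
    }
    return true;
}

// Usage on a tiny in-memory document.
$dom = new DOMDocument();
$dom->loadXML('<root><p>old <b>text</b></p></root>');
$p = $dom->getElementsByTagName('p')->item(0);
$ok = replace_innerXML_explicit($p, '<new x="1">text</new>');
$result = $dom->saveXML($dom->documentElement);
echo $result, "\n"; // <root><p><new x="1">text</new></p></root>
```

Parsing the fragment first means a malformed $innerXML leaves the target node unchanged instead of emptied.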


NOTES

EDIT2, at @Prix's request: an example.

The loop is very complex, but it can be simulated as:

   // use this with ANY (and a lot of) BIG HTML files from web... 
   // I have ~1 error/100 samples  
   $dom = new DOMDocument();
   $dom->load($file); // any XML, or loadHTMLfile() 

   $plst = array();  // you can take off the rand()
   foreach ($dom->getElementsByTagName('*') as $node) if (1 || rand(1,3)==1) {
      $plst[] = $node->getNodePath();
   }
   rsort($plst); // from leaves to root
   foreach ($plst as $p) {
      $xp = new DOMXpath($dom); // refresh for each $p
      $node = $xp->query($p);
      if ($node->length && $node=$node->item(0))
          // USING HERE the function#1 or #2:
          replace_innerXML($node,'<new x="1">text</new>');
   }
   $dom->normalizeDocument();

Here is some sample XML for $dom, but you can use any $dom->loadHTML($file) to test (!).

  <?xml version="1.0" encoding="utf-8"?>
  
  <article dtd-version="3.0" article-type="research-article" xml:lang="en">
    <front><journal-meta>
        <journal-title-group><journal-title>text text text</journal-title>
        <abbrev-journal-title abbrev-type="acronym">aaaa</abbrev-journal-title>
        <abbrev-journal-title abbrev-type="publisher">aaabbb aaa</abbrev-journal-title>
        </journal-title-group>
        <etc>....</etc>
        <history><date date-type="received"><label>Received</label> 9 July 2014</date>
            <date date-type="accepted"><label>Accepted</label> 25 July 2014</date>
        </history>
    </journal-meta></front>
    <body>
        <p>Nonnnononn onononono  nonono</p>
        <fn><p><label>XXXXX yyyyy</label>: xxxx@aaa.com</p></fn>
  
        <p>Nonnnononn onononono  nonono nonono </p>
    </body>
  </article>

EDIT1 (versions and logs)

Versions:

  • libxml2: 2.8.0+dfsg1-7+wheezy1
  • php5: 5.4.4-14+deb7u14
  • apache2: 2.2.22-13+deb7u3

Logs: where? I only know of /var/log/apache2/error.log, but there is no error there (only the usual "File does not exist" entry for a png, within an otherwise successful HTTP request).

... on this machine, running again today, no major error was reported after the HTTP server crashed, only "File does not exist: /var/www/favicon.ico" before the crash... But I was also running it on an Ubuntu machine, where I did find (!) a report matching the date and time of a crash:

 [Wed Oct 15 20:16:16.840578 2014] [core:notice] [pid 1770] AH00051: child pid 14873 exit signal Segmentation fault (11), possible coredump in /etc/apache2
 [Wed Oct 15 20:16:16.840684 2014] [core:notice] [pid 1770] AH00051: child pid 14879 exit signal Segmentation fault (11), possible coredump in /etc/apache2
 *** Error in `/usr/sbin/apache2': corrupted double-linked list: 0x00007f457b81af70 ***
 [Wed Oct 15 20:16:56.886473 2014] [core:notice] [pid 1770] AH00051: child pid 14844 exit signal Aborted (6), possible coredump in /etc/apache2
 [Wed Oct 15 20:16:57.887638 2014] [core:notice] [pid 1770] AH00051: child pid 14894 exit signal Segmentation fault (11), possible coredump in /etc/apache2

Yes, a big crash, with no clue about why. (I remember that the "standard coredump problem" in LibXML2 is deleting or writing to nodes that do not exist.)

Recommended answer

While I didn't find anything odd about the code (I tested it on my machine with a few XML files and found no problems), I suspect that something uses it in a way that leads to infinite recursion.

Functions that recurse too deeply are known to cause PHP to segfault. [1, 2] Either that, or a serious PHP/libxml2 bug.
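
If runaway recursion is the suspect, one way to test the hypothesis is to put an explicit depth limit on any recursive code that walks the tree, so that it fails with a catchable exception rather than exhausting the stack. The sketch below is only an illustration under that assumption: count_nodes_guarded and $maxDepth are hypothetical names, and nothing in the question confirms that the real loop recurses this way.

```php
<?php
// Hypothetical helper: counts the nodes under $node, refusing to descend
// deeper than $maxDepth levels instead of recursing without bound.
function count_nodes_guarded(DOMNode $node, $maxDepth = 1000, $depth = 0) {
    if ($depth > $maxDepth) {
        throw new RuntimeException("Nesting deeper than $maxDepth levels");
    }
    $count = 1; // the node itself
    for ($c = $node->firstChild; $c !== null; $c = $c->nextSibling) {
        $count += count_nodes_guarded($c, $maxDepth, $depth + 1);
    }
    return $count;
}

$dom = new DOMDocument();
$dom->loadXML('<a><b><c/></b><d/></a>');
$total = count_nodes_guarded($dom->documentElement); // a, b, c and d

// With a limit of 1, reaching <c/> (depth 2) raises the exception.
$tooDeep = false;
try {
    count_nodes_guarded($dom->documentElement, 1);
} catch (RuntimeException $e) {
    $tooDeep = true;
}
```

An exception raised this way shows up in the PHP error log, unlike a segfault in the Apache child process.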

Perhaps the problem lies elsewhere?
