Traverse DOM find id backwards


Question

I can't figure out how to solve this.

<div>
  <p id="p1"> Price is  <span>$ 25</span></p>
  <p id='p2'> But this price is $ <span id="s1">50,23</span> </p>
  <p id='p3'> This one :  $ 14540.12 dollar</p>
</div>

What I'm trying to do is find each element that has a price in it, along with the shortest path to it. This is what I have so far:

$elements = $dom->getElementsByTagName('*');

foreach($elements as $child)
{
   if (preg_match("/.$regex./",$child->nodeValue)){
      echo $child->getNodePath(). "<br />";

   }
}

This results in:

/html
/html/body
/html/body/div
/html/body/div/p[1]
/html/body/div/p[1]/span
/html/body/div/p[2]
/html/body/div/p[2]/span
/html/body/div/p[3]

These are the paths to the elements I want, so that's OK for this test HTML. But in real web pages these paths get very long and are error-prone. What I'd like to do is find the closest element with an ID attribute and refer to that.

So once I have found an element that matches the $regex, I need to travel up the DOM, find the first element with an ID attribute, and create a new, shorter path from it. In the HTML example above, there are 3 prices matching the $regex. The prices are in:

//p[@id="p1"]/span
//span[@id="s1"]
//p[@id="p3"]

So that is what I'd like my function to return. That means I also need to get rid of all the other paths, because they don't contain $regex.

Any help?

Recommended answer

You could use XPath to follow the ancestor path up to the first node that has an @id attribute, and then cut off the part of the path above it. I did not clean up the code, but something like this:

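// snip: $dom, $elements and $regex are set up as in the question
$xpath = new DOMXPath($dom);
foreach ($elements as $child) {
    // Collect only this element's own text nodes, so an ancestor does not
    // match merely because one of its descendants contains a price.
    $textValue = '';
    foreach ($xpath->query('text()', $child) as $text) {
        $textValue .= $text->nodeValue;
    }
    // Note: the dots around $regex are regex wildcards, kept from the question.
    if (preg_match("/.$regex./", $textValue)) {
        $path = $child->getNodePath();
        // Nearest element carrying an id attribute, starting at the element itself.
        $id = $xpath->query('ancestor-or-self::*[@id][1]', $child)->item(0);
        if ($id) {
            // Replace the long prefix up to that element with an id-based one.
            $idpath = $id->getNodePath();
            $path = '//' . $id->nodeName . '[@id="' . $id->getAttribute('id') . '"]' . substr($path, strlen($idpath));
        }
        echo $path . "\n";
    }
}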

This prints something like:

/html
/html/body
/html/body/div
//p[@id="p1"]
//p[@id="p1"]/span
//p[@id="p2"]
//span[@id="s1"]
//p[@id="p3"]
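
To return only the paths of elements that hold a price themselves, the same idea can be wrapped in a function. The following is a minimal, self-contained sketch rather than the answerer's code: the function name shortPricePaths and the price pattern /\d+(?:[.,]\d+)?/ are assumptions made for illustration, since the original $regex is not shown.

```php
<?php
// Hypothetical helper (name is an assumption): returns id-anchored paths for
// every element whose own text nodes match $pattern.
function shortPricePaths(string $html, string $pattern): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $paths = [];

    foreach ($dom->getElementsByTagName('*') as $element) {
        // Join only the direct text children; descendant text is ignored,
        // which keeps /html, /html/body and similar ancestors out of the result.
        $ownText = '';
        foreach ($xpath->query('text()', $element) as $textNode) {
            $ownText .= $textNode->nodeValue;
        }
        if (!preg_match($pattern, $ownText)) {
            continue;
        }

        $path = $element->getNodePath();
        // ancestor-or-self is a reverse axis, so [1] selects the nearest match.
        $anchor = $xpath->query('ancestor-or-self::*[@id][1]', $element)->item(0);
        if ($anchor !== null) {
            $prefix = '//' . $anchor->nodeName . '[@id="' . $anchor->getAttribute('id') . '"]';
            // Keep only the part of the path below the anchor element.
            $path = $prefix . substr($path, strlen($anchor->getNodePath()));
        }
        $paths[] = $path;
    }

    return $paths;
}

// The sample markup from the question (nowdoc, so "$" is not interpolated).
$html = <<<'HTML'
<div>
  <p id="p1"> Price is  <span>$ 25</span></p>
  <p id='p2'> But this price is $ <span id="s1">50,23</span> </p>
  <p id='p3'> This one :  $ 14540.12 dollar</p>
</div>
HTML;

// Assumed price pattern: digits with an optional decimal part.
print_r(shortPricePaths($html, '/\d+(?:[.,]\d+)?/'));
```

With the sample HTML and this assumed pattern, the result should be the three paths asked for in the question: //p[@id="p1"]/span, //span[@id="s1"] and //p[@id="p3"]. Elements without any id on themselves or their ancestors keep their full getNodePath() value.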
