Get content of tag if term exists after it using DOMDocument


Question

Given this $html:

$html = '<p>random</p>
<a href="">Test 1</a> (target1)
<br>
<a href="">Test 2</a>  (target1)
<br>
<a href="">Test 3</a> (skip)
// etc
';

and these terms in $array:

$array = array(
    '(target1)',
    '(target2)'
);

How can I skim through $html using DOMDocument to find all terms in $array and grab the content of the <a> tag that precedes each one?

So that I get the following result:

$results = array(
    array(
        'text' => 'Test 1',
        'needle' => 'target1'
    ),
    array(
        'text' => 'Test 2',
        'needle' => 'target1'
    )
);



What I've tried so far

With the following approach, I have managed to grab the content of all <a> tags in $html:

$doc = new DOMDocument();
$doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
$xpath = new DOMXPath($doc);

$elements = $xpath->query('//a'); 
$el_array = array();
if ($elements->length > 0) {
    foreach($elements as $n) {
        $node = trim(strip_tags($n->nodeValue));
        if (!empty($node)) {
            $el_array[] = $node;
        }
    }
    if (!empty($el_array) && is_array($el_array)) {
        print_r($el_array);
    }
}

But I have not found a way to grab the target terms so that I can check whether there is a match.

Recommended answer

You can build a dynamic XPath query with contains() and the following-sibling axis.

The XPath expression will be:

//a/following-sibling::text()[contains(., '(target1)') or contains(., '(target2)')]

For example:

$array = array(
    '(target1)',
    '(target2)'
);

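// Build "contains(., '(target1)') or contains(., '(target2)')" from the terms.
$contains = implode(" or ", array_map(function($x) {
    return "contains(., '$x')";
}, $array));

$doc = new DOMDocument();
$doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
$xpath = new DOMXPath($doc);
// Select the text nodes that follow an <a> and contain one of the terms.
$elements = $xpath->query("//a/following-sibling::text()[$contains]");
$results = [];

foreach ($elements as $element) {
    // Pair the text of the node just before the match with the matched text.
    $results[] = [$element->previousSibling->nodeValue, trim($element->nodeValue)];
}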

print_r($results);

Result:

Array
(
    [0] => Array
        (
            [0] => Test 1
            [1] => (target1)
        )

    [1] => Array
        (
            [0] => Test 2
            [1] => (target1)
        )

)
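
The rows above are positional pairs that keep the parentheses, while the question asked for 'text' and 'needle' keys. A minimal sketch of one way to get that shape follows. The function name findAnchorsBeforeTerms is hypothetical, stripping the parentheses from the needle is an assumption based on the desired output in the question, and the XPath used here is a stricter variant (not the expression from the answer) that only accepts a text node whose immediately preceding sibling is an <a>:

```php
<?php
// Hypothetical helper (not part of the original answer): returns rows shaped
// like the $results array shown in the question.
function findAnchorsBeforeTerms(string $html, array $terms): array
{
    $doc = new DOMDocument();
    $doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
    $xpath = new DOMXPath($doc);

    $results = [];
    // Text nodes whose nearest preceding sibling node is an <a> element.
    $textNodes = $xpath->query('//text()[preceding-sibling::node()[1][self::a]]');
    foreach ($textNodes as $textNode) {
        foreach ($terms as $term) {
            if (strpos($textNode->nodeValue, $term) !== false) {
                $results[] = [
                    'text'   => trim($textNode->previousSibling->nodeValue),
                    // Assumption: the needle is the term without its parentheses.
                    'needle' => trim($term, '()'),
                ];
                break; // record each text node once, at the first matching term
            }
        }
    }
    return $results;
}

$html = '<p>random</p>
<a href="">Test 1</a> (target1)
<br>
<a href="">Test 2</a>  (target1)
<br>
<a href="">Test 3</a> (skip)
';

print_r(findAnchorsBeforeTerms($html, ['(target1)', '(target2)']));
```

Matching the terms in PHP rather than inside the XPath string also avoids having to quote terms that contain an apostrophe, at the cost of visiting every text node that directly follows an <a>.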
