Get content of tag if term exists after it using DOMDocument
Problem description
Given this $html:
$html = '<p>random</p>
<a href="">Test 1</a> (target1)
<br>
<a href="">Test 2</a> (target1)
<br>
<a href="">Test 3</a> (skip)
// etc
';
And these terms in $array:
$array = array(
    '(target1)',
    '(target2)'
);
How can I skim through $html using DOMDocument to find all terms in $array and grab the content of the <a> tag that precedes each one?
So that I get the following result:
$results = array(
    array(
        'text' => 'Test 1',
        'needle' => 'target1'
    ),
    array(
        'text' => 'Test 2',
        'needle' => 'target1'
    )
);
What I've tried so far
With the following approach, I have managed to grab the content of all <a> tags in $html:
$doc = new DOMDocument();
$doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
$xpath = new DOMXPath($doc);
$elements = $xpath->query('//a');
$el_array = array();
if ($elements->length > 0) {
    foreach ($elements as $n) {
        $node = trim(strip_tags($n->nodeValue));
        if (!empty($node)) {
            $el_array[] = $node;
        }
    }
    if (!empty($el_array) && is_array($el_array)) {
        print_r($el_array);
    }
}
But I have not found a way to grab the target terms that follow each link, so I cannot check whether there is a match.
Recommended answer
You can build a dynamic XPath query with contains() and the following-sibling axis.
The XPath expression will be:
//a/following-sibling::text()[contains(., '(target1)') or contains(., '(target2)')]
For example:
$array = array(
    '(target1)',
    '(target2)'
);
$contains = implode(" or ", array_map(function($x) {
    return "contains(., '$x')";
}, $array));
$doc = new DOMDocument();
$doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
$xpath = new DOMXPath($doc);
$elements = $xpath->query("//a/following-sibling::text()[$contains]");
$results = [];
foreach ($elements as $element) {
    $results[] = [$element->previousSibling->nodeValue, trim($element->nodeValue)];
}
print_r($results);
Result:
Array
(
    [0] => Array
        (
            [0] => Test 1
            [1] => (target1)
        )

    [1] => Array
        (
            [0] => Test 2
            [1] => (target1)
        )

)
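The answer's output is a list of plain pairs, while the question asked for rows with 'text' and 'needle' keys. The sketch below is one possible way to produce that shape. The function name findLinkTextBeforeTerms is hypothetical and not part of the original answer. It also narrows the query to the node directly after each <a>, so that previousSibling is always the link itself. It assumes that no term contains a single quote, because XPath 1.0 string literals cannot escape one.

```php
<?php
// Hypothetical helper (not from the original answer): returns rows shaped
// like the $results array requested in the question.
function findLinkTextBeforeTerms(string $html, array $terms): array
{
    if (empty($terms)) {
        return []; // an empty predicate "[]" would be invalid XPath
    }

    // Build "contains(., '(target1)') or contains(., '(target2)')".
    // Assumption: terms never contain a single quote.
    $contains = implode(' or ', array_map(function ($term) {
        return "contains(., '$term')";
    }, $terms));

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // node()[1] is the sibling directly after the <a>; it must be a text
    // node that contains one of the terms.
    $nodes = $xpath->query(
        "//a/following-sibling::node()[1][self::text()][$contains]"
    );

    $results = [];
    foreach ($nodes as $node) {
        foreach ($terms as $term) {
            if (strpos($node->nodeValue, $term) !== false) {
                $results[] = [
                    'text'   => trim($node->previousSibling->nodeValue),
                    'needle' => trim($term, '()'), // drop the parentheses
                ];
                break; // record each text node once
            }
        }
    }
    return $results;
}

$html = '<p>random</p>
<a href="">Test 1</a> (target1)
<br>
<a href="">Test 2</a> (target1)
<br>
<a href="">Test 3</a> (skip)
';

print_r(findLinkTextBeforeTerms($html, ['(target1)', '(target2)']));
```

For the sample $html this should give two rows, one for Test 1 and one for Test 2, each with needle target1. If a text node contains several terms, only the first one listed in $terms is recorded, which is a design choice of this sketch rather than something the original answer specifies.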