Simple HTML DOM Parser - Skip certain element

Views: 26 Published: 2021/9/24 18:51:37 Tags: php web-scraping

Problem description

I am using the Simple HTML DOM Parser, and I want to completely ignore the contents of the "nested" element and get the contents of the "pre" element that follows it.

<div id=parent>

<div class="nested">
<pre>Text that I want ignored</pre>
</div>

<pre>
This is the text I want to access
</pre>
</div>

I don't have control of the HTML source, and the owner has recently added the "nested" element. Previously, I accessed the content I needed like this:

$page_contents = file_get_html($url);    
$div_content = $page_contents->find('div[id=parent]pre', 0)->innertext;

But obviously the new nested element has broken my method.

I can't seem to find any official documentation regarding this kind of scenario.

Solution

Not tested, but try this:

$div_content = $page_contents->find('div[id=parent][class!=nested]pre', 0)->innertext;

or

$div_content = $page_contents->find('div[id=parent class!=nested]pre', 0)->innertext;

Or maybe even just this. I think this is really the one, but again, I have not tested it:

$div_content = $page_contents->find('div[class!=nested]pre', 1)->innertext;

I still don't know if this will work, but try this:

$div_content = $page_contents->find('div[class!=nested pre]', 0)->innertext;

or

$div_content = $page_contents->find('div[class!=nested pre]', 0)->plaintext;
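
Since none of the selectors above were tested, here is a small sketch of a different route that does not depend on Simple HTML DOM's selector syntax at all. It swaps in PHP's built-in DOMDocument and DOMXPath classes (not the library from the question) and selects only a "pre" that is a direct child of the parent div, which skips the one inside "nested". The markup is copied from the question; on the real page you would load the fetched HTML instead, and the assumption that the wanted "pre" is a direct child of div#parent comes only from that sample.

```php
<?php
// Sample markup copied from the question; replace with the fetched page.
$html = <<<HTML
<div id="parent">

<div class="nested">
<pre>Text that I want ignored</pre>
</div>

<pre>
This is the text I want to access
</pre>
</div>
HTML;

// Collect libxml warnings internally instead of printing them,
// since real-world pages are often not perfectly valid HTML.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The single slash before "pre" matches direct children only,
// so the <pre> nested inside div.nested is not selected.
$nodes = $xpath->query('//div[@id="parent"]/pre');

// Guard against an empty result in case the page layout changes again.
$div_content = $nodes->length > 0
    ? trim($nodes->item(0)->textContent)
    : null;

echo $div_content, PHP_EOL;
```

If you would rather stay with Simple HTML DOM, the same idea might be expressed by fetching the parent with find('div[id=parent]', 0) and looping over its children() until you reach a node whose tag is "pre". That variant is untested here as well.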
