Simple HTML DOM Parser - Skip certain element


Problem description


I am using the Simple HTML DOM Parser, and I want to completely ignore the contents of the "nested" element and get the contents of the "pre" element that follows it.

<div id=parent>

<div class="nested">
<pre>Text that I want ignored</pre>
</div>

<pre>
This is the text I want to access
</pre>
</div>

I don't have control of the HTML source, and the owner has recently added the "nested" element. Previously, I accessed the content I needed like this:

$page_contents = file_get_html($url);
$div_content = $page_contents->find('div[id=parent] pre', 0)->innertext;

But the new nested element has broken this: the first "pre" matched is now the one inside the "nested" element.

I can't find any official documentation covering this kind of scenario.

Solution

Not tested, but try this:

$div_content = $page_contents->find('div[id=parent][class!=nested]pre', 0)->innertext;

or

$div_content = $page_contents->find('div[id=parent class!=nested]pre', 0)->innertext;

Or maybe even just this. I think this is the one that will really work, but again, I have not tested it:

$div_content = $page_contents->find('div[class!=nested]pre', 1)->innertext;

I still don't know whether this will work, but try this:

$div_content = $page_contents->find('div[class!=nested pre]', 0)->innertext;

or

$div_content = $page_contents->find('div[class!=nested pre]', 0)->plaintext;
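
Since none of the selectors above were tested, another option is to avoid the attribute filter altogether and walk only the direct children of the parent div, so a "pre" inside the "nested" div is never considered. The following is a minimal sketch, not a confirmed fix: it assumes `simple_html_dom.php` is on the include path, `first_direct_pre` is a hypothetical helper name, and the inline `$html` string stands in for the page that the question loads with `file_get_html($url)`.

```php
<?php
// Assumption: the Simple HTML DOM Parser library file is available here.
require_once 'simple_html_dom.php';

/**
 * Hypothetical helper: return the inner text of the first <pre> that is a
 * direct child of <div id="parent">, skipping any <pre> nested deeper.
 * Returns null if the markup cannot be parsed or nothing matches.
 */
function first_direct_pre($html)
{
    $dom = str_get_html($html);
    if (!$dom) {
        return null; // parsing failed (e.g. empty input)
    }

    $parent = $dom->find('div[id=parent]', 0);
    if (!$parent) {
        return null; // no <div id="parent"> in the document
    }

    // children() returns only the immediate child elements of the node,
    // so the <pre> inside <div class="nested"> is never visited.
    foreach ($parent->children() as $child) {
        if ($child->tag === 'pre') {
            return trim($child->innertext);
        }
    }

    return null;
}

// Sample markup taken from the question.
$html = '<div id="parent">'
      . '<div class="nested"><pre>Text that I want ignored</pre></div>'
      . '<pre>This is the text I want to access</pre>'
      . '</div>';

echo first_direct_pre($html), "\n";
```

With a live page, the same loop could be applied to the object returned by `file_get_html($url)` instead of `str_get_html($html)`. Checking each lookup for a null or false result before chaining avoids a fatal error if the owner changes the markup again.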
