How to remove all elements on text level with Jsoup?


Question

I'm working on a project where I'm only interested in the page layout, not the text. I'm having trouble getting rid of every text-level element. For example:

<div>
    <ul>
        <li>some menu item</li>
        <li>some menu item</li>
        <li>some menu item</li>
    </ul>
</div>
<div>
    <h3>Title of some text</h3>
    <p></p>
    <p>some text</p>
    <ul>
        <li>some other text</li>
        <li>some other text</li>
        <li>some other text</li>
    </ul>
</div>

I want to remove the text-level ul, li, p and h3 elements, but keep the div elements and the list of menu items, since those are part of the page layout. How do I do this with Jsoup?

I've been trying to do this with document.select() followed by .remove(), but the select function isn't designed for such non-standard queries.

The end result I want is:

<div>
    <ul>
        <li>some menu item</li>
        <li>some menu item</li>
        <li>some menu item</li>
    </ul>
</div>
<div>

</div>

As you can see, the list is removed when the ul tag sits at the same level as tags that contain text. That ul is part of the page's text and has nothing to do with the layout. The ul with the menu items matters, because it shows that there is a menu there with 3 different items.

Recommended answer

You can select and remove all p, li and ul elements with standard selectors:

doc.select("p").remove();
doc.select("ul").remove();
doc.select("li").remove();
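Note that the three calls above remove every matching element, including the menu list the question wants to keep. One possible way to tell the two lists apart is sketched below. It rests on an assumption that is not part of the original answer: a container counts as a text block when it has a heading or paragraph as a direct child, and only then are its headings, paragraphs and lists removed. The class name LayoutOnly, the method stripTextLevel and both selector constants are hypothetical names chosen for illustration; real pages may need a different heuristic.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class; the name and the heuristic are illustrative assumptions.
class LayoutOnly {
    // Assumption: a direct-child heading or paragraph marks its parent as a text block.
    static final String TEXT_MARKERS = "h1, h2, h3, h4, h5, h6, p";
    // Elements removed from a text block: the markers plus any sibling lists.
    static final String TEXT_ELEMENTS = TEXT_MARKERS + ", ul, ol";

    static Document stripTextLevel(String html) {
        Document doc = Jsoup.parseBodyFragment(html);
        // select("*") returns a snapshot, so removing nodes inside the loop is safe.
        for (Element container : doc.select("*")) {
            // Elements.is(query) is true when at least one direct child matches.
            if (!container.children().is(TEXT_MARKERS)) {
                continue; // e.g. the div that only wraps the menu list
            }
            // children() returns a copy of the child list, so it can be iterated while removing.
            for (Element child : container.children()) {
                if (child.is(TEXT_ELEMENTS)) {
                    child.remove();
                }
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        String html = "<div><ul><li>some menu item</li></ul></div>"
                + "<div><h3>Title of some text</h3><p>some text</p>"
                + "<ul><li>some other text</li></ul></div>";
        // Prints the remaining body markup: the menu div and an emptied second div.
        System.out.println(stripTextLevel(html).body().html());
    }
}
```

With the sample markup from the question, the first div has no heading or paragraph child, so its menu list is left alone, while the second div is emptied. A menu that sits directly next to a paragraph would also be removed under this heuristic, so it is worth checking against the actual pages.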
