How to get contents between two tags in jsoup/javascript


Question

<p><strong>Chapter One</strong></p><p>A piece of computer code</p>    
<table>
 <tr>
 <th>Firstname</th>
 <th>Lastname</th> 
 <th>Age</th>
 </tr>
<tr>
 <td>Jill</td>
 <td>Smith</td>
 <td>50</td>
</tr>
</table>
<p><strong>Chapter Two</strong></p><p>Java in 10 minutes</p>

How can I get the contents between those two "strong" elements, so that Chapter One ends up with "A piece of computer code" and the table? nextSibling() on the "strong" retrieves only one node; how do I collect all elements until I reach the next "strong"? Thanks

Recommended answer

Is this format going to be consistent? If so, you can simply query nextSibling twice, starting from the strong element's parent (p). Note that nextSibling() also returns whitespace text nodes, so nextElementSibling() is the safer call when only elements matter.
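For the fixed-layout case, a minimal sketch might look like the following. It assumes jsoup is on the classpath and that every chapter heading is followed by exactly one paragraph and one table; the class name FixedLayoutExample and the method firstChapterBody are hypothetical names chosen for illustration, and the sample markup is the question's snippet shortened to one table row.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.Arrays;
import java.util.List;

public class FixedLayoutExample {
    // Sample markup based on the question (table shortened to a single row).
    static final String HTML =
        "<p><strong>Chapter One</strong></p><p>A piece of computer code</p>\n"
      + "<table><tr><td>Jill</td><td>Smith</td><td>50</td></tr></table>\n"
      + "<p><strong>Chapter Two</strong></p><p>Java in 10 minutes</p>";

    // Returns the two elements that follow the first chapter heading.
    // Assumption: the layout is always heading, paragraph, table.
    static List<Element> firstChapterBody(String html) {
        Document doc = Jsoup.parse(html);
        // The strong element lives inside a p, so step up to that p first.
        Element heading = doc.selectFirst("p > strong").parent();
        // nextElementSibling() skips the whitespace text nodes between tags.
        Element paragraph = heading.nextElementSibling();
        Element table = paragraph.nextElementSibling();
        return Arrays.asList(paragraph, table);
    }

    public static void main(String[] args) {
        for (Element el : firstChapterBody(HTML)) {
            System.out.println(el.tagName() + ": " + el.text());
        }
    }
}
```

This only holds while the layout never changes; any extra or missing element after a heading would shift the result.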

If it's going to vary, you might need to check manually when to stop iterating through the siblings, for example by testing whether the sibling contains a strong element.

It all depends on the full context.

Here's an example with basic loops. You may want to add more checks or better queries for a different situation.

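import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.util.ArrayList;
import java.util.List;

Document doc = Jsoup.connect(url).get(); // url: address of the page to parse
List<Elements> data = new ArrayList<>();
Elements chapters = doc.select("p > strong");
for (Element chapter : chapters) {
    if (!chapter.ownText().toLowerCase().contains("chapter"))
        continue; // this strong element isn't actually a chapter heading
    List<Element> siblings = new ArrayList<>();
    // The strong element is wrapped in a p, so walk the siblings of that p.
    Element next = chapter.parent().nextElementSibling();
    while (next != null) {
        // The heading text lives in the nested strong, not in the p's own text.
        Element heading = next.selectFirst("strong");
        if (heading != null && heading.text().toLowerCase().contains("chapter"))
            break; // we've reached the next chapter heading
        siblings.add(next);
        next = next.nextElementSibling();
    }
    data.add(new Elements(siblings)); // one entry per chapter, in document order
}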
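
To check the loop against the markup from the question without fetching a page, the same idea can be sketched as a self-contained program that parses an inline string. This is only one possible arrangement: it assumes jsoup is on the classpath, and the names ChapterSplitter, isChapterHeading and splitByChapter are hypothetical. It keys each chapter's elements by the heading text, which is a design choice for the sketch rather than part of the original answer.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.LinkedHashMap;
import java.util.Map;

public class ChapterSplitter {
    // Markup copied from the question.
    static final String HTML =
        "<p><strong>Chapter One</strong></p><p>A piece of computer code</p>\n"
      + "<table><tr><th>Firstname</th><th>Lastname</th><th>Age</th></tr>\n"
      + "<tr><td>Jill</td><td>Smith</td><td>50</td></tr></table>\n"
      + "<p><strong>Chapter Two</strong></p><p>Java in 10 minutes</p>";

    // True when el is a p with a direct strong child whose text mentions "chapter".
    static boolean isChapterHeading(Element el) {
        if (!el.tagName().equals("p")) return false;
        for (Element child : el.children()) {
            if (child.tagName().equals("strong")
                    && child.text().toLowerCase().contains("chapter")) {
                return true;
            }
        }
        return false;
    }

    // Maps each heading text to the elements between it and the next heading.
    static Map<String, Elements> splitByChapter(String html) {
        Map<String, Elements> chapters = new LinkedHashMap<>();
        for (Element p : Jsoup.parse(html).select("p")) {
            if (!isChapterHeading(p)) continue;
            Elements body = new Elements();
            // Collect following siblings until the next heading or the end.
            for (Element next = p.nextElementSibling();
                 next != null && !isChapterHeading(next);
                 next = next.nextElementSibling()) {
                body.add(next);
            }
            chapters.put(p.text(), body);
        }
        return chapters;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Elements> e : splitByChapter(HTML).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size() + " element(s)");
        }
    }
}
```

The heading test looks at the nested strong rather than the paragraph's own text, because the chapter title is not a direct text child of the p.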
