How to extract separate text nodes with Jsoup?
Question
I have an element like this:
<td> TextA <br/> TextB </td>
How can I extract TextA and TextB separately?
Recommended answer
There are several ways. Which one fits best depends on the document itself and on whether the given HTML markup is consistent. In this particular example you could get the td's child nodes with Element#childNodes() and then test each node individually to see whether it is a TextNode.
For example:
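Element td = getItSomehow(); // placeholder: obtain the <td> element in whatever way suits your document
for (Node child : td.childNodes()) {
    // childNodes() returns elements and text nodes alike, so keep only the text nodes
    if (child instanceof TextNode) {
        System.out.println(((TextNode) child).text());
    }
}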
This results in:
TextA
TextB
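For a version that can be pasted and run as-is, here is a minimal sketch. It assumes Jsoup is on the classpath; the class name TextNodeExtractor and the method extractTexts are made up for illustration and are not part of Jsoup. The <td> is wrapped in a <table> because an HTML parser will typically drop a bare <td> that appears outside table context, and each text is trimmed because TextNode#text() can keep the surrounding spaces.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class TextNodeExtractor {

    // Hypothetical helper: returns the trimmed text of every direct
    // text-node child of the first <td> found in the given HTML.
    static List<String> extractTexts(String html) {
        List<String> texts = new ArrayList<>();
        Document doc = Jsoup.parse(html);
        Element td = doc.select("td").first();
        if (td == null) {
            return texts; // no <td> in the document
        }
        for (Node child : td.childNodes()) {
            // Skip elements such as <br> and whitespace-only text nodes.
            if (child instanceof TextNode && !((TextNode) child).isBlank()) {
                texts.add(((TextNode) child).text().trim());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        String html = "<table><tr><td> TextA <br/> TextB </td></tr></table>";
        for (String text : extractTexts(html)) {
            System.out.println(text);
        }
    }
}
```

Run against the markup from the question, this should print TextA and TextB on separate lines. Collecting the values into a list instead of printing them directly makes it easier to address each piece by index afterwards.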
I think it would be nice if Jsoup offered an Element#textNodes() method, or something similar, to get the child text nodes, just as Element#children() does for the child elements (which would have returned the <br /> element in your example).
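As far as I can tell, later Jsoup releases do include an Element#textNodes() method that returns an element's direct text-node children. Assuming your Jsoup version has it, the loop shrinks to the sketch below; the class name TextNodesSketch and the helper directTexts are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

public class TextNodesSketch {

    // Hypothetical helper: trimmed text of each non-blank direct text node.
    // Assumes Element#textNodes() is available in the Jsoup version in use.
    static List<String> directTexts(Element element) {
        List<String> texts = new ArrayList<>();
        for (TextNode node : element.textNodes()) {
            if (!node.isBlank()) {
                texts.add(node.text().trim());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        // The <td> is wrapped in a <table> so the HTML parser keeps it.
        String html = "<table><tr><td> TextA <br/> TextB </td></tr></table>";
        Element td = Jsoup.parse(html).select("td").first();
        for (String text : directTexts(td)) {
            System.out.println(text);
        }
    }
}
```

This avoids the instanceof check and the cast, since the returned list only contains TextNode instances. If the method is missing in your version, the childNodes() loop remains the fallback.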