Extract inner element without looping
Problem description
Extracting the href value from the following sample HTML code is straightforward if I loop through all matching elements and break immediately after the first one:
<li class="parts partname parts_first">
  <div id="dpdn10" uri="/public/page/part1" class="partype partstate">
    <div class="ptctainer">
      <div class="ptitle">
        <p class="ptypead">
          <span class="rtext"><a href="http://www.example.com/page/ptname.html?dv=rfirst" class="mnLabel">First</a></span>
          <span class="ndx">
            <a href="#" dndx="dpdn10" class="xpnd _t" style="opacity:1">Details: </a>
          </span>
        </p>
      </div>
    </div>
    <div id="dpdn10_content" class="xpns">
      <div class="ptctainer">
        <div class="ptitle">
          <p class="ptypead">
            <span class="rtext"><a href="http://www.example.com/page/ptname.html?dv=rfirst" class="mnLabel">First</a></span>
            <span class="ndx"><a href="#" class="xpnd">Details: </a></span>
          </p>
        </div>
      </div>
    </div>
  </div>
</li>
I can certainly do that when I can assume the href value is identical for both instances of the link, as in the example above.
However, this approach fails if they are not identical and I want to extract a specific one (either the first or the second).
This brings me to looking for a mechanism in Jsoup that allows "nested selection". Until now I have only been familiar with single-level selection, as in:
Elements links = doc.select("a[href]"); // a with href
Elements pngs = doc.select("img[src$=.png]"); // img with src ending .png
Element masthead = doc.select("div.masthead").first(); // div with class=masthead
But I can't find documentation or an example for multi-level selection, e.g.
Element link= doc.select("div.xpns.div.ptctainer.div.ptitle.p.ptypead.span.rtext");
The above is for illustration and not real syntax, of course. I don't know if something like this is possible (yet) in Jsoup.
Does such "nested selection" exist in Jsoup?
Can't you just 'chain' the selection functions? Like:
Elements links = doc.select("div.xpns").select("div.ptctainer").select("div.ptitle").select("p.ptypead").select("span.rtext"); // select() returns Elements, not a single Element
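For what it's worth, the nesting can probably also be written into a single selector string, since Jsoup's selector syntax supports the CSS descendant combinator (a space) and the child combinator (`>`). Below is a minimal sketch under some assumptions: the class name `NestedSelect`, the helper methods `hrefInsideXpns` and `nthHref`, and the trimmed-down markup are made up for illustration, and the second href is changed to `dv=rsecond` so that the two links can be told apart.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class NestedSelect {
    // Trimmed-down version of the question's markup. The second href uses
    // "rsecond" instead of "rfirst" (an assumption, for illustration only).
    static final String HTML =
        "<ul><li class=\"parts partname parts_first\">"
      + "<div id=\"dpdn10\" class=\"partype partstate\">"
      + "<div class=\"ptctainer\"><div class=\"ptitle\"><p class=\"ptypead\">"
      + "<span class=\"rtext\"><a href=\"http://www.example.com/page/ptname.html?dv=rfirst\" class=\"mnLabel\">First</a></span>"
      + "</p></div></div>"
      + "<div id=\"dpdn10_content\" class=\"xpns\">"
      + "<div class=\"ptctainer\"><div class=\"ptitle\"><p class=\"ptypead\">"
      + "<span class=\"rtext\"><a href=\"http://www.example.com/page/ptname.html?dv=rsecond\" class=\"mnLabel\">First</a></span>"
      + "</p></div></div>"
      + "</div></div></li></ul>";

    // Hypothetical helper: href of the link nested inside div.xpns, or "" if none.
    // Spaces mean "descendant of"; ">" means "direct child of".
    static String hrefInsideXpns(String html) {
        Document doc = Jsoup.parse(html);
        Element link = doc.select(
            "div.xpns div.ptctainer div.ptitle p.ptypead span.rtext > a[href]").first();
        return link == null ? "" : link.attr("href");
    }

    // Hypothetical helper: href of the n-th (0-based) span.rtext link
    // in document order, or "" if there are fewer than n + 1 matches.
    static String nthHref(String html, int n) {
        Elements links = Jsoup.parse(html).select("span.rtext > a[href]");
        return n < links.size() ? links.get(n).attr("href") : "";
    }

    public static void main(String[] args) {
        System.out.println(hrefInsideXpns(HTML)); // the link under div.xpns
        System.out.println(nthHref(HTML, 0));     // the first link in the document
    }
}
```

The intermediate steps in the long selector are optional; something as short as `div.xpns span.rtext > a[href]` should match the same link here. Scoping by an ancestor such as `div.xpns` tends to be less brittle than picking by index, because it does not depend on how many matching links appear earlier in the page.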