Extract inner element without looping

Problem description

Extracting the href value from the following sample HTML code is straightforward if I loop through all matching <a> elements and break immediately after the first one:

  <li class="parts partname parts_first">
    <div id="dpdn10" uri="/public/page/part1" class="partype partstate">
      <div class="ptctainer">
        <div class="ptitle">
          <p class="ptypead">
            <span class="rtext"><a href="http://www.example.com/page/ptname.html?dv=rfirst" class="mnLabel">First</a></span>
            <span class="ndx">
              <a href="#" dndx="dpdn10" class="xpnd _t" style="opacity:1">Details: </a>
            </span>
          </p>
        </div>
      </div>

      <div id="dpdn10_content" class="xpns">
        <div class="ptctainer">
          <div class="ptitle">
            <p class="ptypead">
              <span class="rtext"><a href="http://www.example.com/page/ptname.html?dv=rfirst" class="mnLabel">First</a></span>
              <span class="ndx"><a href="#" class="xpnd">Details: </a></span>
            </p>
          </div>
        </div>    
      </div>
    </div>
  </li>

I can certainly do that when I can assume the href value is identical for both instances of the link, as in the example above.

However, this approach fails if they are not identical and I want to extract a specific one (either the first or the second).

This leads me to look for a mechanism in Jsoup that allows "nested selection". Until now I have only been familiar with single-level selection, as in:

Elements links = doc.select("a[href]"); // a with href
Elements pngs = doc.select("img[src$=.png]");  // img with src ending .png
Element masthead = doc.select("div.masthead").first();  // div with class=masthead

But I can't find documentation or an example for multi-level selection, e.g.

Element link= doc.select("div.xpns.div.ptctainer.div.ptitle.p.ptypead.span.rtext");

The above is for illustration and not real syntax, of course. I don't know if something like this is possible (yet) in Jsoup.

Does such "nested selection" exist in Jsoup?

Solution

Can't you just 'chain' the selection functions? Like:

Element link = doc.select("div.xpns").select("div.ptctainer").select("div.ptitle").select("p.ptypead").select("span.rtext").first();  // select() returns Elements; first() picks the first match
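
As a rough sketch of how this could look end to end, the class below compares chained select() calls with a single selector that uses the CSS descendant combinator (a space), which Jsoup's selector syntax also accepts. The class name NestedSelect, the helper method names, and the trimmed HTML are hypothetical, and the second href is changed to dv=rsecond purely so the two links can be told apart; none of that comes from the original question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class NestedSelect {
    // Trimmed copy of the sample markup. Assumption: the second href is
    // altered (dv=rsecond) so that the two links are distinguishable.
    static final String HTML =
        "<ul><li class=\"parts partname parts_first\">"
      + "<div id=\"dpdn10\" class=\"partype partstate\">"
      + "<div class=\"ptctainer\"><div class=\"ptitle\"><p class=\"ptypead\">"
      + "<span class=\"rtext\"><a href=\"http://www.example.com/page/ptname.html?dv=rfirst\""
      + " class=\"mnLabel\">First</a></span>"
      + "</p></div></div>"
      + "<div id=\"dpdn10_content\" class=\"xpns\">"
      + "<div class=\"ptctainer\"><div class=\"ptitle\"><p class=\"ptypead\">"
      + "<span class=\"rtext\"><a href=\"http://www.example.com/page/ptname.html?dv=rsecond\""
      + " class=\"mnLabel\">First</a></span>"
      + "</p></div></div></div>"
      + "</div></li></ul>";

    // Chained form: each select() searches inside the previous result set.
    static String hrefByChaining(Document doc) {
        Elements spans = doc.select("div.xpns")
                            .select("div.ptctainer")
                            .select("div.ptitle")
                            .select("p.ptypead")
                            .select("span.rtext");
        Element link = spans.select("a[href]").first();
        return link == null ? null : link.attr("href");
    }

    // Single-selector form: spaces mean "descendant of".
    static String hrefBySelector(Document doc) {
        Element link = doc.select(
            "div.xpns div.ptctainer div.ptitle p.ptypead span.rtext a[href]").first();
        return link == null ? null : link.attr("href");
    }

    // Pick the n-th matching link (0-based, document order) without a loop.
    static String hrefByIndex(Document doc, int index) {
        Elements links = doc.select("span.rtext > a[href]");
        return (index >= 0 && index < links.size()) ? links.get(index).attr("href") : null;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(hrefByChaining(doc));
        System.out.println(hrefBySelector(doc));
        System.out.println(hrefByIndex(doc, 0));
    }
}
```

With this markup, both the chained form and the single-selector form should resolve to the link inside div.xpns, while hrefByIndex reaches either link by its position in document order.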
