Parse a table from HTML using jsoup
Problem description
I've got another problem with scraping HTML text. Here's a sample of the markup I'm trying to extract from:
<table class="scripture">
<tbody>
<tr>
<td class="verse" valign="top">
<a name="2:1"></a><a class="vers" href="javascript:getParallel('LUK', 2, 1);" title="Klik om grondtekst en SV te zien"> 1 </a>
</td>
<td class="content">
<span class="main">En het geschiedde in die dagen dat er een gebod uitging van keizer Augustus dat heel de wereld ingeschreven moest worden.</span>
</td>
</tr>
</tbody>
</table>
<table class="scripture">
<tbody>
<tr>
<td class="verse" valign="top">
<a name="2:2"></a><a class="vers" href="javascript:getParallel('LUK', 2, 2);" title="Klik om grondtekst en SV te zien"> 2 </a>
</td>
<td class="content">
<span class="main">Deze eerste inschrijving vond plaats toen Cyrenius over Syrië stadhouder was.</span>
</td>
</tr>
</tbody>
</table>
This is similar to my problem in this link but I want to get the verse text and the Scripture content. How do I achieve this?
So far this is what I've tried:
Element table = doc.select("table[class=scripture]").first();
Log.e("BB", "passage1: " + table.ownText());
But it doesn't display anything. Any help would be appreciated. Thanks.
Solution
Assuming that you want the content of the span inside the table that contains verse 2:2, you can do it with:
String verse = "2:2";
// Select the span of class main inside the table of class scripture
// that contains a td of class verse holding an anchor whose name attribute equals verse
Element p = doc.select(
    String.format("table.scripture:has(td.verse a[name=%s]) span.main", verse)
).first();
// first() returns null when nothing matches, so guard before calling text()
if (p != null) {
    System.out.println(p.text());
}
Output:
Deze eerste inschrijving vond plaats toen Cyrenius over Syrië stadhouder was.
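The question also asks for the verse reference together with the content, for every table rather than a single verse. The following is a minimal sketch of one way to do that, assuming each table.scripture holds exactly one anchor with a name attribute and one span.main, as in the sample markup. The class name ScriptureTables, the method extractVerses, and the shortened inline HTML are illustrative and not part of the original answer. It also touches on why the attempt in the question printed nothing: ownText() returns only the text that sits directly inside the element, and the table's text all lives in nested td and span elements.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ScriptureTables {

    // Maps each verse reference (the name attribute of the anchor, e.g. "2:2")
    // to the text of the span.main found in the same table.
    static Map<String, String> extractVerses(Document doc) {
        Map<String, String> verses = new LinkedHashMap<>();
        for (Element table : doc.select("table.scripture")) {
            Element anchor = table.select("td.verse a[name]").first();
            Element content = table.select("td.content span.main").first();
            if (anchor == null || content == null) {
                continue; // skip tables that do not match the assumed layout
            }
            verses.put(anchor.attr("name"), content.text());
        }
        return verses;
    }

    public static void main(String[] args) {
        // Shortened version of the markup from the question (href and title omitted).
        String html =
              "<table class=\"scripture\"><tbody><tr>"
            + "<td class=\"verse\" valign=\"top\"><a name=\"2:1\"></a><a class=\"vers\">1</a></td>"
            + "<td class=\"content\"><span class=\"main\">En het geschiedde in die dagen dat er een "
            + "gebod uitging van keizer Augustus dat heel de wereld ingeschreven moest worden.</span></td>"
            + "</tr></tbody></table>"
            + "<table class=\"scripture\"><tbody><tr>"
            + "<td class=\"verse\" valign=\"top\"><a name=\"2:2\"></a><a class=\"vers\">2</a></td>"
            + "<td class=\"content\"><span class=\"main\">Deze eerste inschrijving vond plaats toen "
            + "Cyrenius over Syrië stadhouder was.</span></td>"
            + "</tr></tbody></table>";

        Document doc = Jsoup.parse(html);

        // The table element has no text of its own, so ownText() is empty here;
        // text() would include the text of all descendants instead.
        Element firstTable = doc.select("table.scripture").first();
        System.out.println("ownText is empty: " + firstTable.ownText().isEmpty());

        for (Map.Entry<String, String> entry : extractVerses(doc).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

A LinkedHashMap is used so the verses come back in document order. If the visible verse number is wanted rather than the anchor name, the text of a.vers could be read in the same loop, keeping in mind that the original markup pads it with non-breaking spaces.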