Jsoup - extracting data from an <a> tag, inside a <td> tag

Question

I want to extract data from a website using Jsoup. The data is in a table.

HTML code:

<table><tr><td><a href="......">Pop.Density</a></td>
           <td>123</td></tr></table>

I want to print:

zip code...(taken from a text file): 123

I get the following exception:

Exception in thread "main" java.lang.NullPointerException

Any help would be appreciated. Thank you!

Here is my code:

String s = br.readLine();
String str="http://www.bestplaces.net/people/zip-code/illinois/"+s;

org.jsoup.Connection conn = Jsoup.connect(str);
conn.timeout(1800000); 
Document doc = conn.get();

for (Element table : doc.select("table"))
{
    for (Element row : table.select("tr"))
    {
        Elements tds = row.select("td");
        if (tds.size() > 1)
        {
            Element link = tds.get(0).select("a").first();
            String linkText = link.text();

            if (link.text().contains("Pop.Density"))
                System.out.println(s+","+tds.get(1).text());
        }
    }
}

UPDATE: If I modify the last if():

if (tds.get(0).select("a").text().contains("Pop.Density"))

I do not get any exception, but there is no output either.

Recommended answer

Assuming the shared HTML is not the actual HTML being used, I think it is throwing the exception when the first TD does not contain an <a> tag. I think you need to update

  if (tds.size() > 1)

to

  if (tds.size() > 1 && tds.get(0).select("a") != null
                     && tds.get(0).select("a").first() != null)

If this is not the case, sharing the line number where the NullPointerException originates would help in finding a solution.
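
To make the suggested null check concrete, here is a minimal sketch that runs against an inline HTML string instead of the live site. The class name PopDensityExample, the helper findValue, and the sample HTML (including a first row without an <a> tag) are assumptions made for illustration; they are not taken from the question. It relies on Elements.first() returning null when nothing matched, which is the likely source of the NullPointerException described above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class PopDensityExample {

    // Hypothetical helper: returns the text of the second cell in the first row
    // whose first cell holds a link containing `label`, or null if none is found.
    static String findValue(Document doc, String label) {
        for (Element row : doc.select("table tr")) {
            Elements tds = row.select("td");
            if (tds.size() < 2) {
                continue; // skip header rows and single-cell rows
            }
            // first() returns null when the cell has no <a>, so check before use
            Element link = tds.get(0).select("a").first();
            if (link != null && link.text().contains(label)) {
                return tds.get(1).text();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed sample HTML: the first row has no <a> in its first cell
        String html = "<table>"
                + "<tr><td>Population</td><td>456</td></tr>"
                + "<tr><td><a href=\"#\">Pop.Density</a></td><td>123</td></tr>"
                + "</table>";
        Document doc = Jsoup.parse(html);
        System.out.println("Pop.Density: " + findValue(doc, "Pop.Density"));
    }
}
```

With the sample HTML above, the row without a link is skipped rather than causing an exception, and the value from the matching row is printed. If the real page still produces no output, printing each row's first-cell text is a simple way to see what label the site actually uses.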
