"Exception in thread "main" java.lang.NullPointerException" error when running web scraper program


Question

I'm fairly new to web scraping and have limited knowledge of Java.

Every time I run this code, I get the error:

Exception in thread "main" java.lang.NullPointerException
    at sws.SWS.scrapeTopic(SWS.java:38)
    at sws.SWS.main(SWS.java:26)
Java Result: 1
BUILD SUCCESSFUL (total time: 0 seconds)

My code is:

import java.io.*;
import java.net.*;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SWS
{
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args)
    {
        scrapeTopic("wiki/Python");
    }

    public static void scrapeTopic(String url)
    {
        String html = getUrl("http://www.wikipedia.org/" + url);

        Document doc = Jsoup.parse(html);
        String contentText = doc.select("#mw-content-text > p").first().text();
        System.out.println(contentText);
    }

    public static String getUrl(String Url)
    {
        URL urlObj = null;

        try
        {
            urlObj = new URL(Url);
        }
        catch(MalformedURLException e)
        {
            System.out.println("The url was malformed");
            return "";
        }

        URLConnection urlCon = null;
        BufferedReader in = null;
        String outputText = "";

        try
        {
            urlCon = urlObj.openConnection();
            in = new BufferedReader(new InputStreamReader(urlCon.getInputStream()));
            String line = "";

            while ((line = in.readLine()) != null)
            {
                outputText += line;
            }

            in.close();
        }
        catch(IOException e)
        {
            System.out.println("There was a problem connecting to the url");
            return "";
        }

        return outputText;
    }
}

I've been staring at my screen for some time now and am in need of help!

Thanks.

Recommended answer

In the following line:

 String contentText = doc.select("#mw-content-text > p").first().text();

If doc.select("#mw-content-text > p") finds no element matching the query, it returns an empty Elements collection. Calling first() on an empty collection returns null, so the chained text() call throws the NullPointerException.

Check the jsoup documentation for Element.select and Elements.first().
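One way to guard against this is to check the result of first() before calling text() on it. The following is a minimal sketch, assuming jsoup is on the classpath. The class name SafeFirstParagraph and the helper firstParagraphText are hypothetical names made up for illustration, and the HTML strings are invented test input rather than real Wikipedia markup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class, not part of the original program.
final class SafeFirstParagraph {

    // Returns the text of the first matching paragraph,
    // or null when the selector matches nothing.
    static String firstParagraphText(String html) {
        Document doc = Jsoup.parse(html);

        // select() returns an empty Elements collection when nothing matches,
        // and first() on an empty collection returns null.
        Element first = doc.select("#mw-content-text > p").first();
        if (first == null) {
            return null;
        }
        return first.text();
    }

    public static void main(String[] args) {
        // Invented sample input that contains a matching paragraph.
        String withMatch = "<div id=\"mw-content-text\"><p>Hello</p></div>";
        String text = firstParagraphText(withMatch);
        System.out.println(text != null ? text : "No matching paragraph found");

        // An empty page body, such as the "" that getUrl returns on failure.
        String missing = firstParagraphText("");
        System.out.println(missing != null ? missing : "No matching paragraph found");
    }
}
```

In the question's code, getUrl returns an empty string when the URL is malformed or the connection fails, so an empty document is one way for the selector to match nothing. Printing the length of the fetched HTML before parsing it can help confirm whether the page was actually downloaded.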
