How to read a text from a web page with Java?

Views: 128  Published: 2018/12/6 14:12:52  java

Problem description

I want to read the text from a web page. I don't want to get the web page's HTML code. I found this code:

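    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;

    try {
        // Create a URL for the desired page
        URL url = new URL("http://www.uefa.com/uefa/aboutuefa/organisation/congress/news/newsid=1772321.html#uefa+moving+with+tide+history");

        // Read all the text returned by the server
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        String str;
        while ((str = in.readLine()) != null) {
            // str is one line of text; readLine() strips the newline character(s).
            // Calling readLine() a second time here would skip every other line.
            System.out.println(str);
        }
        in.close();
    } catch (MalformedURLException e) {
        e.printStackTrace(); // the URL string is not well formed
    } catch (IOException e) {
        e.printStackTrace(); // connecting to or reading from the server failed
    }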

But this code gives me the HTML code of the web page. I want to get the whole text inside this page. How can I do this with Java?

Recommended answer

You might want to have a look at jsoup for this:

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
    Document doc = Jsoup.parse(html);   // build a DOM from the HTML string
    String text = doc.body().text();    // "An example link."

This example is an extract from one on their site.
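
jsoup can also fetch a page itself through `Jsoup.connect(url).get()`, which returns the same kind of `Document`. Where adding a third-party library is not an option, the JDK's Swing HTML parser can do a rougher version of the same job. The following is only a sketch of that alternative, not part of the answer above: the class name `HtmlTextExtractor` and the method `extractText` are made up for illustration, and joining text fragments with spaces is an assumption that will split a word whose letters are partly wrapped in inline tags.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class HtmlTextExtractor {

    // Hypothetical helper: collects the text fragments the parser reports
    // and joins them with single spaces.
    public static String extractText(Reader html) throws IOException {
        StringBuilder text = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Called once per run of character data between tags
                text.append(data).append(' ');
            }
        };
        // true = ignore any charset declared inside the document
        new ParserDelegator().parse(html, callback, true);
        // Collapse runs of whitespace left over from joining the fragments
        return text.toString().replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) throws IOException {
        String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
        System.out.println(extractText(new StringReader(html)));
    }
}
```

To apply this to a live page, the `InputStreamReader` from the question's `url.openStream()` call could be passed to `extractText` in place of the `StringReader`. The Swing parser targets an old HTML dialect, so jsoup is likely to cope better with modern pages.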
