How to extract paragraph text from HTML using Jsoup?

Views: 152 | Published: 2020/4/24 9:55:35 | Tag: jsoup


Question

import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JavaApplication14 {

    public static void main(String[] args) {
        try {
            Document doc = Jsoup.connect("tanmoy_mahathir.makes.org/thimble/146").get();
            String html = "<html><head></head>" + "<body><p>Parsed HTML into a doc."
                    + "</p></body></html>";
            Elements paragraphs = doc.select("p");
            for (Element p : paragraphs)
                System.out.println(p.text());
        } catch (IOException ex) {
            Logger.getLogger(JavaApplication14.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

}

Can anyone help me with the Jsoup code? How can I parse just the portion containing the paragraphs, so that it prints only:

Hello ,World!
Nothing is impossible

Recommended answer

For this small bit of HTML, you just need to do:

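// Build the sample HTML; a single + joins the two string literals.
String html = "<html><head></head>" + "<body><p>Parsed HTML into a doc."
        + "</p></body></html>";
Document doc = Jsoup.parse(html);
// Select every <p> element and print its text content without the tags.
Elements paragraphs = doc.select("p");
for (Element p : paragraphs)
    System.out.println(p.text());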

Since your link contains pretty much the same HTML, you could also replace the definition of doc with the following (note the explicit https:// scheme, which the URL in the question is missing):

Document doc = Jsoup.connect("https://tanmoy_mahathir.makes.org/thimble/146").get();

Update

Here is the full code, which compiles and runs fine for me.

import java.io.IOException;
import java.util.logging.*;
import org.jsoup.*;
import org.jsoup.nodes.*;
import org.jsoup.select.*;

public class JavaApplication14 {

  public static void main(String[] args)  {
    try {
      String url = "https://tanmoy_mahathir.makes.org/thimble/146";
      Document doc = Jsoup.connect(url).get();
      Elements paragraphs = doc.select("p");
      for(Element p : paragraphs)
        System.out.println(p.text());
    } 
    catch (IOException ex) {
      Logger.getLogger(JavaApplication14.class.getName())
            .log(Level.SEVERE, null, ex);
    }
  }
}
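
To try the same selection without a network connection, here is a minimal sketch that parses an in-memory string instead of fetching the page. The class name ParagraphTextDemo, the helper paragraphTexts, and the sample markup are all assumptions made for illustration; the markup is only modeled on the output the question asks for and is not the real content of the linked page.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class ParagraphTextDemo {

    // Hypothetical helper: returns the text of every <p> element in document order.
    static List<String> paragraphTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element p : doc.select("p")) {
            // text() drops the tags (including nested ones) and normalizes whitespace
            texts.add(p.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Assumed sample markup, not the actual page from the question.
        String html = "<html><head><title>Demo</title></head><body>"
                + "<h1>Heading that should be skipped</h1>"
                + "<p>Hello ,World!</p>"
                + "<p>Nothing is <b>impossible</b></p>"
                + "</body></html>";
        for (String text : paragraphTexts(html)) {
            System.out.println(text);
        }
    }
}
```

With this sample input the heading is skipped, because only p elements match the selector, and the nested b tag is flattened into the surrounding paragraph text.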
