How to extract the text of a paragraph from HTML using Jsoup?
Question
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JavaApplication14 {
    public static void main(String[] args) {
        try {
            Document doc = Jsoup.connect("tanmoy_mahathir.makes.org/thimble/146").get();
            String html = "<html><head></head>" + "<body><p>Parsed HTML into a doc."
                    + "</p></body></html>";
            Elements paragraphs = doc.select("p");
            for (Element p : paragraphs)
                System.out.println(p.text());
        } catch (IOException ex) {
            Logger.getLogger(JavaApplication14.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}
Can anyone help me with the Jsoup code to parse just the part of the page that contains the paragraphs, so that it prints only:
Hello ,World!
Nothing is impossible
Recommended answer
For this small piece of HTML, you only need to do the following:
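String html = "<html><head></head>" + "<body><p>Parsed HTML into a doc."
        + "</p></body></html>";
// Parse the HTML string directly instead of fetching it over the network
Document doc = Jsoup.parse(html);
// Select every <p> element and print its text content
Elements paragraphs = doc.select("p");
for (Element p : paragraphs)
    System.out.println(p.text());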
Since your link appears to contain much the same HTML, you could also replace the definition of doc with:
Document doc = Jsoup.connect("https://tanmoy_mahathir.makes.org/thimble/146").get();
Update
Here is the full code, which compiles and runs fine for me.
import java.io.IOException;
import java.util.logging.*;
import org.jsoup.*;
import org.jsoup.nodes.*;
import org.jsoup.select.*;

public class JavaApplication14 {
    public static void main(String[] args) {
        try {
            String url = "https://tanmoy_mahathir.makes.org/thimble/146";
            Document doc = Jsoup.connect(url).get();
            Elements paragraphs = doc.select("p");
            for (Element p : paragraphs)
                System.out.println(p.text());
        }
        catch (IOException ex) {
            Logger.getLogger(JavaApplication14.class.getName())
                    .log(Level.SEVERE, null, ex);
        }
    }
}
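As a further sketch (not part of the original answer), the same select-and-print logic can be wrapped in a small helper and tried offline on an HTML string. The class name ParagraphText, the helper paragraphTexts, and the sample markup are all made up for illustration; the markup is only a guess built from the output the question asks for. The selector "body p" limits the match to paragraphs inside the body, and text() returns the combined text of each paragraph, including text inside nested tags such as <b>.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class ParagraphText {
    // Hypothetical helper: returns the text of every <p> found inside <body>.
    static List<String> paragraphTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element p : doc.select("body p")) {
            texts.add(p.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Sample markup assumed for illustration; the real page may differ.
        String html = "<html><head><title>Demo</title></head><body>"
                + "<h1>Heading</h1>"
                + "<p>Hello ,World!</p>"
                + "<p>Nothing is <b>impossible</b></p>"
                + "</body></html>";
        // Only the two paragraphs are printed; the title and heading are skipped.
        for (String text : paragraphTexts(html)) {
            System.out.println(text);
        }
    }
}
```

To run the same helper against the live page, the string returned by Jsoup.connect(url).get().html() could be passed in, although selecting on the fetched Document directly, as in the answer above, avoids parsing twice.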