Parsing HTML content using Jsoup

Views: 98 | Published: 2018/6/19 15:15:13 | Tags: java, android, html, parsing, jsoup


Question

This is my HTML source:

    <li>
        <a href="/info/some1">Item 1<br>
            <span class="deets">111</span>
        </a>
    </li>

    <li>
        <a href="/info/some2">Item 2<br>
            <span class="deets">222</span>
        </a>
    </li>

    <li>
        <a href="/info/some3">Item 3<br>
            <span class="deets">333</span>
        </a>
    </li>

This is my Java program that fetches the content and filters out the HTML tags:

    // Needs java.net.URL, java.net.HttpURLConnection, java.io.* and android.widget.TextView
    // Look up the view first so it is also in scope inside the catch block
    TextView tv = (TextView) findViewById(R.id.textView1);
    try {
        URL myurl = new URL("http://www.somewebsite.com");
        HttpURLConnection con = (HttpURLConnection) myurl.openConnection();

        InputStream result = con.getInputStream();
        BufferedReader reader = new BufferedReader(new InputStreamReader(result));
        StringBuilder sb = new StringBuilder();

        // Append every line, separated by the platform line separator
        for (String line; (line = reader.readLine()) != null;) {
            sb.append(line).append(System.getProperty("line.separator"));
        }
        reader.close();

        // Strip everything that looks like an HTML tag
        String final_result = sb.toString().replaceAll("\\<.*?\\>", "");
        tv.setText(final_result);
    } catch (Exception e) {
        e.printStackTrace();
        tv.setText("not working");
    }
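
To make the behavior of the tag-stripping regex concrete, here is a minimal, self-contained sketch that applies the same replaceAll call to a shortened copy of the sample markup. The class name RegexStripDemo and the helper stripTags are illustrative only and are not part of the original program.

```java
class RegexStripDemo {
    // Same tag-stripping regex as in the program above
    static String stripTags(String html) {
        return html.replaceAll("\\<.*?\\>", "");
    }

    public static void main(String[] args) {
        // Shortened copy of the sample markup, one list item per line
        String html =
            "<li><a href=\"/info/some1\">Item 1<br><span class=\"deets\">111</span></a></li>\n" +
            "<li><a href=\"/info/some2\">Item 2<br><span class=\"deets\">222</span></a></li>\n";
        System.out.println(stripTags(html));
    }
}
```

This prints "Item 1111" and "Item 2222" on separate lines. The text of every item survives, and the link text and the span text are glued together, so the regex alone cannot isolate a single item.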



Is there an easier way to parse the HTML content in Java using Jsoup instead of a regex?
Is there a way to get only the required content? Here I just want the text "Item 2 - 222" from:

    <li>
        <a href="/info/some2">Item 2<br>
            <span class="deets">222</span>
        </a>
    </li>


Solution
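Try this for easy parsing with Jsoup:

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    // To parse an HTML page fetched from a URL (get() throws IOException)
    Document doc = Jsoup.connect("http://www.website.com").get();
    // To parse HTML that is already in a string
    Document doc1 = Jsoup.parse("<html><head><title>First parse</title></head>"
            + "<body><p>Parsed HTML into a doc.</p></body></html>");

    // Text of the whole body, with all tags removed
    String content = doc.body().text();

    // To get specific elements such as links; select() returns Elements, not Element
    Elements links = doc.select("a[href]");
    for (Element e : links) {
        System.out.println("link: " + e.attr("abs:href"));
    }

To learn more, see the Jsoup documentation.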
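
The snippet above does not show how to pull out only "Item 2 - 222". The following is a minimal sketch of one way to do it with a CSS selector, assuming jsoup is on the classpath. The class name SelectItemDemo and the helper describeItem are hypothetical names chosen for illustration. The sample markup is inlined as a string and wrapped in a ul element here, which is also an assumption.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class SelectItemDemo {
    // Returns "<link text> - <deets text>" for the link with the given href,
    // or null when no such link exists in the markup
    static String describeItem(String html, String href) {
        Document doc = Jsoup.parse(html);
        // Attribute selector: the <a> whose href equals the given value
        Element link = doc.select("a[href=" + href + "]").first();
        if (link == null) {
            return null;
        }
        // ownText() is the link's direct text only, without the nested span
        String title = link.ownText().trim();
        // Text of the span with class "deets" inside this link
        String deets = link.select("span.deets").text();
        return title + " - " + deets;
    }

    public static void main(String[] args) {
        String html = "<ul>"
            + "<li><a href=\"/info/some1\">Item 1<br><span class=\"deets\">111</span></a></li>"
            + "<li><a href=\"/info/some2\">Item 2<br><span class=\"deets\">222</span></a></li>"
            + "<li><a href=\"/info/some3\">Item 3<br><span class=\"deets\">333</span></a></li>"
            + "</ul>";
        System.out.println(describeItem(html, "/info/some2"));
    }
}
```

Run against the sample markup, this should print "Item 2 - 222". For a live page, the same selectors can be applied to the Document returned by Jsoup.connect(...).get() instead of Jsoup.parse(...).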
