How to get a table from an HTML page using Java

Views: 196   Posted: 2016/6/1 19:45:52   Tags: java arrays table html-parsing jsoup

Question

I am working on a project in which I fetch financial statements from the internet and use them in a Java application to automatically create ratios and charts.

The site I am using requires a login and password to reach the tables.
The tag I need is TBODY, but there are 2 other TBODY elements in the HTML.

How can I use Java to write my table to a text file that I can then use in my application? What would be the best way to go about this, and what should I read up on?

Solution

If this were my project, I'd look into using an HTML parser, something like jsoup (although others are available). The jsoup site has a tutorial, and after playing with it a while, you'll likely find it pretty easy to use.
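Since the question mentions a page with three TBODY elements, here is a minimal sketch of how one of them might be singled out with jsoup. It assumes the wanted TBODY can be identified by its position in the document; the class name TbodyPicker, the method rowsOfTbody, and the inline sample HTML are made up for illustration, and the sketch does not deal with the login step at all.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Hypothetical helper, not part of jsoup itself.
class TbodyPicker {

    // Returns the cell text of every row inside the tbody found at the
    // given zero-based position (document order).
    static List<List<String>> rowsOfTbody(String html, int tbodyIndex) {
        Document doc = Jsoup.parse(html);
        Elements bodies = doc.select("tbody");
        // Throws IndexOutOfBoundsException if the page has fewer tbody elements.
        Element target = bodies.get(tbodyIndex);

        List<List<String>> rows = new ArrayList<>();
        for (Element row : target.select("tr")) {
            List<String> cells = new ArrayList<>();
            for (Element cell : row.select("td")) {
                cells.add(cell.text());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up page with three tables; only the third one holds the data.
        String html =
              "<table><tbody><tr><td>menu</td></tr></tbody></table>"
            + "<table><tbody><tr><td>footer</td></tr></tbody></table>"
            + "<table><tbody><tr><td>0000001</td><td>Customer1</td>"
            + "<td>100.00</td></tr></tbody></table>";

        System.out.println(rowsOfTbody(html, 2)); // [[0000001, Customer1, 100.00]]
    }
}
```

If the real table carries an id or class attribute, a selector based on that attribute would probably be more robust than a position, which breaks as soon as the page layout changes.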

For example, given an HTML table such as the one on the page fetched in the code below, jsoup could parse it like so:

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class TableEg {
   public static void main(String[] args) {
      // Address of the page that contains the example table
      String url = "http://publib.boulder.ibm.com/infocenter/iadthelp/v7r1/topic/" +
            "com.ibm.etools.iseries.toolbox.doc/htmtblex.htm";
      try {
         // Fetch the page and parse it into a DOM
         Document doc = Jsoup.connect(url).get();
         Elements tableElements = doc.select("table");

         // Header cells: every <th> in a <thead> row
         Elements tableHeaderEles = tableElements.select("thead tr th");
         System.out.println("headers");
         for (Element header : tableHeaderEles) {
            System.out.println(header.text());
         }
         System.out.println();

         // Rows, using a selector intended to skip those inside <thead>
         Elements tableRowElements = tableElements.select(":not(thead) tr");

         for (Element row : tableRowElements) {
            System.out.println("row");
            // Print the text of each data cell in this row
            Elements rowItems = row.select("td");
            for (Element item : rowItems) {
               System.out.println(item.text());
            }
            System.out.println();
         }

      } catch (IOException e) {
         // Network or HTTP failure while fetching the page
         e.printStackTrace();
      }
   }
}

Resulting in the following output:

headers
ACCOUNT
NAME
BALANCE

row
0000001
Customer1
100.00

row
0000002
Customer2
200.00

row
0000003
Customer3
550.00
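
The code above prints to the console, while the question asks for a text file. A minimal sketch of that last step, using only the JDK, might look like the following. It assumes the cell texts have been collected into a list of rows instead of being printed, and that no cell contains a tab character; the class name TableToTextFile, its methods, and the file name table.txt are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for storing table rows as tab-separated text.
class TableToTextFile {

    // Turns each row into one tab-separated line.
    static List<String> toLines(List<List<String>> rows) {
        List<String> lines = new ArrayList<>();
        for (List<String> row : rows) {
            lines.add(String.join("\t", row));
        }
        return lines;
    }

    // Writes the rows to the file, replacing any existing content.
    static void writeTable(Path file, List<List<String>> rows) throws IOException {
        Files.write(file, toLines(rows), StandardCharsets.UTF_8);
    }

    // Reads the file back into rows by splitting each line on tabs.
    static List<List<String>> readTable(Path file) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            // Limit -1 keeps empty trailing cells instead of dropping them.
            rows.add(Arrays.asList(line.split("\t", -1)));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Sample rows copied from the output above; in practice they would
        // be gathered inside the jsoup loops.
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("ACCOUNT", "NAME", "BALANCE"),
            Arrays.asList("0000001", "Customer1", "100.00"),
            Arrays.asList("0000002", "Customer2", "200.00"),
            Arrays.asList("0000003", "Customer3", "550.00"));

        Path file = Paths.get("table.txt"); // hypothetical file name
        writeTable(file, rows);
        System.out.println(readTable(file));
    }
}
```

Tab-separated text is only one possible layout. It is easy to split again when the application later reads the numbers back in to compute ratios, but a CSV library may be a better fit if cells can contain tabs or line breaks.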
