How to convert an HTML <table> to a 2D array

Question


Let's say I copy a complete HTML table (where each and every tr and td has extra attributes) into a String. How can I take all the contents (what is between the tags) and create a 2D array that is organized like the original table?

For example, for this table:

<table border="1">
    <tr align= "center">
        <td align="char">TD1</td>
        <td>td1</td>
        <td align="char">TD1</td>
        <td>td1</td>
    </tr>
    <tr>
        <td>TD2</td>
        <td>tD2</td>
        <td class="bold>Td2</td>
        <td>td2</td>
    </tr>
</table>

I want a 2D array with the same layout: one row per tr and one element per td.

PS: I know I can use regex, but it would be extremely complicated. I want a tool like JSoup that can do all the work automatically without much code writing.

Recommended answer

This is how it could be done using JSoup (seriously, don't use regex for HTML).

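import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// html is the String holding the copied table markup
Document doc = Jsoup.parse(html);
Elements tables = doc.select("table");
for (Element table : tables) {
    // Note: select("tr") also matches rows of any nested tables
    Elements trs = table.select("tr");
    String[][] trtd = new String[trs.size()][];
    for (int i = 0; i < trs.size(); i++) {
        // Only <td> cells are collected; header cells (<th>) are skipped
        Elements tds = trs.get(i).select("td");
        trtd[i] = new String[tds.size()];
        for (int j = 0; j < tds.size(); j++) {
            // text() returns the cell's text with all tags stripped
            trtd[i][j] = tds.get(j).text();
        }
    }
    // trtd now contains the desired array for this table
}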
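
Because each row of trtd is sized from that row's own number of td cells, the result can be a jagged array when rows have different cell counts. If a rectangular array is needed, something like the sketch below could pad the short rows. It uses only the standard library; the class name TableArrays, the method toRectangular, and the filler parameter are hypothetical names chosen for illustration and are not part of JSoup.

```java
import java.util.Arrays;

final class TableArrays {
    private TableArrays() {}

    /** Returns a copy of rows in which every row is padded to the longest row's length. */
    static String[][] toRectangular(String[][] rows, String filler) {
        // Find the widest row
        int width = 0;
        for (String[] row : rows) {
            width = Math.max(width, row.length);
        }
        String[][] result = new String[rows.length][width];
        for (int i = 0; i < rows.length; i++) {
            // Pre-fill with the filler, then copy the real cells over it
            Arrays.fill(result[i], filler);
            System.arraycopy(rows[i], 0, result[i], 0, rows[i].length);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for an array produced by the loop above (second row is shorter)
        String[][] trtd = {
            {"TD1", "td1", "TD1", "td1"},
            {"TD2", "tD2"}
        };
        // prints [[TD1, td1, TD1, td1], [TD2, tD2, -, -]]
        System.out.println(Arrays.deepToString(toRectangular(trtd, "-")));
    }
}
```

Arrays.deepToString is also a quick way to inspect the array while debugging. Note that padding does not account for colspan or rowspan; handling those would need extra logic.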

Also, the class attribute value is not closed properly in your example:

<td class="bold>Td2</td>

It should be:

<td class="bold">Td2</td>
