Convert HTML table with a header to JSON - Python

Views: 63 Published: 2021/2/13 20:19:50 Tags: python html json

Problem description

Suppose I have the following HTML table:

<table>
  <tr>
    <th>Name</th>
    <th>Age</th>
    <th>License</th>
    <th>Amount</th>
  </tr>
  <tr>
    <td>John</td>
    <td>28</td>
    <td>Y</td>
    <td>12.30</td>
  </tr>
  <tr>
    <td>Kevin</td>
    <td>25</td>
    <td>Y</td>
    <td>22.30</td>
  </tr>
  <tr>
    <td>Smith</td>
    <td>38</td>
    <td>Y</td>
    <td>52.20</td>
  </tr>
  <tr>
    <td>Stewart</td>
    <td>21</td>
    <td>N</td>
    <td>3.80</td>
  </tr>
</table>

I'd like to convert this table to JSON, potentially in the following format:

data= [
  { 
    Name: 'John',         
    Age: 28,
    License: 'Y',
    Amount: 12.30
  },
  { 
    Name: 'Kevin',         
    Age: 25,
    License: 'Y',
    Amount: 22.30
  },
  { 
    Name: 'Smith',         
    Age: 38,
    License: 'Y',
    Amount: 52.20
  },
  { 
    Name: 'Stewart',         
    Age: 21,
    License: 'N',
    Amount: 3.80
  }
];
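
Read as Python, the target is a list of dicts keyed by the header cells, with Age as an integer and Amount as a float. A minimal sketch of that mapping, assuming the cell text has already been pulled out of the HTML into lists of strings; the `rows` literal and the helper names `convert` and `rows_to_records` are illustrative and not part of the original question:

```python
import json

# Cell text as it might come out of the table, header row first.
# (Hypothetical literal data standing in for the parsed HTML.)
rows = [
    ["Name", "Age", "License", "Amount"],
    ["John", "28", "Y", "12.30"],
    ["Kevin", "25", "Y", "22.30"],
]


def convert(text):
    """Turn numeric-looking cell text into int or float; leave the rest as str."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def rows_to_records(rows):
    """Use the first row as keys and build one dict per remaining row."""
    header, *body = rows
    return [dict(zip(header, map(convert, row))) for row in body]


records = rows_to_records(rows)
print(json.dumps(records, indent=2))
```

Note that `json.dumps` writes 12.30 as 12.3, since a float carries no trailing zero; leaving the cells as strings would preserve the original text.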

I've seen another example that sort of does the above, which I found here. However, there are a couple of things that I can't get working given that answer. Those are:

It's limited to two rows in the table. If I add another row, I get an error:

print(json.dumps(OrderedDict(table_data)))
ValueError: too many values to unpack (expected 2)

It doesn't take the table's header row into account.

This is my code so far:

html_data = """
<table>
  <tr>
    <th>Name</th>
    <th>Age</th>
    <th>License</th>
    <th>Amount</th>
  </tr>
  <tr>
    <td>John</td>
    <td>28</td>
    <td>Y</td>
    <td>12.30</td>
  </tr>
  <tr>
    <td>Kevin</td>
    <td>25</td>
    <td>Y</td>
    <td>22.30</td>
  </tr>
  <tr>
    <td>Smith</td>
    <td>38</td>
    <td>Y</td>
    <td>52.20</td>
  </tr>
  <tr>
    <td>Stewart</td>
    <td>21</td>
    <td>N</td>
    <td>3.80</td>
  </tr>
</table>
"""

from bs4 import BeautifulSoup
from collections import OrderedDict
import json

table_data = [[cell.text for cell in row("td")]
                         for row in BeautifulSoup(html_data, features="lxml")("tr")]

print(json.dumps(OrderedDict(table_data)))

But I'm getting the following error:

print(json.dumps(OrderedDict(table_data)))
ValueError: need more than 0 values to unpack
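
Both error messages quoted in this question appear to come from the same place: `OrderedDict` expects an iterable of (key, value) pairs, so every row must hold exactly two items. A row with four cells has too many, and a header row made only of `<th>` cells yields an empty list when only `td` cells are collected. The sketch below reproduces this with plain lists instead of parsed HTML; the helper name `try_build` is hypothetical, and the exact message wording may differ between Python versions:

```python
from collections import OrderedDict

# Rows with exactly two cells are valid (key, value) pairs.
pairs = [["John", "28"], ["Kevin", "25"]]
ok = OrderedDict(pairs)


def try_build(rows):
    """Return the ValueError message from OrderedDict(rows), or None on success."""
    try:
        OrderedDict(rows)
    except ValueError as exc:
        return str(exc)
    return None


# Four cells per row: more than a (key, value) pair.
too_many = try_build([["John", "28", "Y", "12.30"]])

# A header row of <th> cells contributes no <td> text, so the row is empty.
too_few = try_build([[], ["John", "28"]])

print(ok)
print(too_many)
print(too_few)
```

So the limit is on the number of cells per row rather than on the number of rows, and the header row needs to be read from its `th` cells and used as keys rather than passed through as data.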

EDIT: The answer below works perfectly if there is only one table in the HTML. What if there are two tables? For example:

<html>
    <body>
        <h1>My Heading</h1>
        <p>Hello world</p>
        <table>
            <tr>
                <th>Name</th>
                <th>Age</th>
                <th>License</th>
                <th>Amount</th>
            </tr>
            <tr>
                <td>John</td>
                <td>28</td>
                <td>Y</td>
                <td>12.30</td>
            </tr>
            <tr>
                <td>Kevin</td>
                <td>25</td>
                <td>Y</td>
                <td>22.30</td>
            </tr>
            <tr>
                <td>Smith</td>
                <td>38</td>
                <td>Y</td>
                <td>52.20</td>
            </tr>
            <tr>
                <td>Stewart</td>
                <td>21</td>
                <td>N</td>
                <td>3.80</td>
            </tr>
        </table>
        <table>
            <tr>
                <th>Name</th>
                <th>Age</th>
                <th>License</th>
                <th>Amount</th>
            </tr>
            <tr>
                <td>Rich</td>
                <td>28</td>
                <td>Y</td>
                <td>12.30</td>
            </tr>
            <tr>
                <td>Kevin</td>
                <td>25</td>
                <td>Y</td>
                <td>22.30</td>
            </tr>
            <tr>
                <td>Smith</td>
                <td>38</td>
                <td>Y</td>
                <td>52.20</td>
            </tr>
            <tr>
                <td>Stewart</td>
                <td>21</td>
                <td>N</td>
                <td>3.80</td>
            </tr>
        </table>
    </body>
</html>

If I plug this into the code below, only the first table is shown in the JSON output.
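
One possible way to handle several tables is to keep the rows of each `<table>` separate and convert them one table at a time. The sketch below is an assumption-laden illustration, not the referenced answer: it uses the standard library's `html.parser` as a dependency-free stand-in for BeautifulSoup, the names `TableCollector` and `tables_to_records` are hypothetical, nested tables are not handled, and cell values are left as strings:

```python
import json
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect cell text as tables -> rows -> cells (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table>
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a <td> or <th>.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.tables[-1][-1].append("".join(self._cell).strip())
            self._cell = None


def tables_to_records(html):
    """Return one list of dicts per table, keyed by that table's header row."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    result = []
    for rows in parser.tables:
        if not rows:
            continue
        header, *body = rows
        result.append([dict(zip(header, row)) for row in body])
    return result


html_data = """
<table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>John</td><td>28</td></tr>
</table>
<table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Rich</td><td>28</td></tr>
</table>
"""

all_tables = tables_to_records(html_data)
print(json.dumps(all_tables, indent=2))
```

With BeautifulSoup the same idea would amount to looping over every table element instead of taking only the first one.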
