BeautifulSoup: a dictionary from an HTML table

Views: 138  Published: 2016/8/5 18:58:11  Tags: python beautifulsoup

Question

I am trying to scrape table data from a website.

Here is a simple example table:

t = '<html><table>' +\
    '<tr><td class="label"> a </td> <td> 1 </td></tr>' +\
    '<tr><td class="label"> b </td> <td> 2 </td></tr>' +\
    '<tr><td class="label"> c </td> <td> 3 </td></tr>' +\
    '<tr><td class="label"> d </td> <td> 4 </td></tr>' +\
    '</table></html>'

The desired parse result is {'a': '1', 'b': '2', 'c': '3', 'd': '4'}

This is my closest attempt so far:

s = BeautifulSoup(t)
d = {}
for tr in s.findAll('tr'):
  k, v = BeautifulSoup(str(tr)).findAll('td')
  d[str(k)] = str(v)

The result is:

{'<td class="label"> a </td>': '<td> 1 </td>', '<td class="label"> d </td>': '<td> 4 </td>', '<td class="label"> b </td>': '<td> 2 </td>', '<td class="label"> c </td>': '<td> 3 </td>'}

I'm aware of the text=True parameter of findAll(), but I don't get the expected results when I use it.

I'm using Python 2.6 and BeautifulSoup 3.
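
As a possible explanation for the text=True confusion, here is a minimal sketch of what that kind of search returns for one row. It assumes Python 3 and the bs4 package (Beautiful Soup 4) rather than the BeautifulSoup 3 setup described above, where the same argument is spelled string=True in current releases; the names row_html, strings and cells are illustrative only.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 (bs4) is installed

# One row from the example table, parsed on its own.
row_html = '<tr><td class="label"> a </td> <td> 1 </td></tr>'
row = BeautifulSoup(row_html, "html.parser").find("tr")

# string=True (text=True in older releases) matches every text node under
# the row, including the whitespace sitting between the two cells.
strings = [str(s) for s in row.find_all(string=True)]
print(strings)

# Dropping whitespace-only nodes and stripping the rest leaves the cell values.
cells = [s.strip() for s in strings if s.strip()]
print(cells)
```

Under these assumptions the search yields three text nodes per row rather than two, so unpacking the result straight into a key and a value would fail unless the whitespace-only node is filtered out first.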

Recommended answer

Try this:

from BeautifulSoup import BeautifulSoup

t = '<html><table>' +\
    '<tr><td class="label"> a </td> <td> 1 </td></tr>' +\
    '<tr><td class="label"> b </td> <td> 2 </td></tr>' +\
    '<tr><td class="label"> c </td> <td> 3 </td></tr>' +\
    '<tr><td class="label"> d </td> <td> 4 </td></tr>' +\
    '</table></html>'

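bs = BeautifulSoup(t)

results = {}
for row in bs.findAll('tr'):
    # Each row holds two cells: the label and its value.
    aux = row.findAll('td')
    # .string keeps the padding around the cell text, so strip it.
    results[aux[0].string.strip()] = aux[1].string.strip()

print results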
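
For readers on Python 3, the same idea could be sketched with Beautiful Soup 4 roughly as follows. This is an adaptation under stated assumptions, not part of the original answer: it assumes the bs4 package is installed, uses the built-in "html.parser" backend, and wraps the loop in a hypothetical helper named table_to_dict.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 (bs4) is installed

html = (
    '<html><table>'
    '<tr><td class="label"> a </td> <td> 1 </td></tr>'
    '<tr><td class="label"> b </td> <td> 2 </td></tr>'
    '<tr><td class="label"> c </td> <td> 3 </td></tr>'
    '<tr><td class="label"> d </td> <td> 4 </td></tr>'
    '</table></html>'
)


def table_to_dict(markup):
    """Map the first cell of each table row to the second cell.

    Hypothetical helper for illustration; rows with fewer than two
    <td> cells are skipped.
    """
    soup = BeautifulSoup(markup, "html.parser")
    result = {}
    for row in soup.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) < 2:
            continue  # header or malformed row: nothing to map
        # get_text(strip=True) returns the cell text without surrounding spaces.
        key = cells[0].get_text(strip=True)
        value = cells[1].get_text(strip=True)
        result[key] = value
    return result


result = table_to_dict(html)
print(result)
```

Stripping the text while reading each cell keeps the keys and values free of the padding present in the markup, which is what the desired result in the question calls for.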
