Find data within HTML tags using Python

Views: 44 Published: 2021/4/15 19:13:40 Tags: python web-scraping beautifulsoup

Question

I have the following HTML code I am trying to scrape from a website:

<td>Net Taxes Due</td>
<td class="value-column">$2,370.00</td>
<td class="value-column">$2,408.00</td>

What I am trying to accomplish is to search the page for the <td> tag containing the text "Net Taxes Due", find that tag's siblings, and load the results into a Pandas DataFrame.

I have the following code:

soup = BeautifulSoup(url, "html.parser")
table = soup.select('#Net Taxes Due')

cells = table.find_next_siblings('td')
cells = [ele.text.strip() for ele in cells]

df = pd.DataFrame(np.array(cells))

print(df)

I've been all over the web looking for a solution and can't come up with anything. I'd appreciate any help.

Thanks!

Recommended answer

Make sure to add the tag name along with your search string. This is how you can do that:

from bs4 import BeautifulSoup

htmldoc = """
<tr>
    <td>Net Taxes Due</td>
    <td class="value-column">$2,370.00</td>
    <td class="value-column">$2,408.00</td>
</tr>
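"""
soup = BeautifulSoup(htmldoc, "html.parser")
# Find the <td> whose text is exactly "Net Taxes Due", then take the next <td> sibling.
# `string=` is the current name of the older `text=` argument (Beautiful Soup 4.4.0+).
item = soup.find('td', string='Net Taxes Due').find_next_sibling("td")
print(item)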
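
The question also asks for all of the sibling cells and for the result as a Pandas DataFrame, which the snippet above does not cover. The following is a minimal sketch of one way to do that, not a definitive solution. It assumes the cell text matches "Net Taxes Due" exactly (no surrounding whitespace) and that the HTML has already been downloaded into a string. The wrapping <table> element and the column names `value_1`, `value_2` are made up for illustration.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Sample markup; on a real page this string would come from an HTTP response body.
htmldoc = """
<table>
<tr>
    <td>Net Taxes Due</td>
    <td class="value-column">$2,370.00</td>
    <td class="value-column">$2,408.00</td>
</tr>
</table>
"""

soup = BeautifulSoup(htmldoc, "html.parser")

# Locate the label cell by tag name and exact text.
label = soup.find("td", string="Net Taxes Due")

# find_next_siblings (plural) returns every following <td> in the same row.
cells = [td.get_text(strip=True) for td in label.find_next_siblings("td")]

# Build a one-row DataFrame; the column names are hypothetical placeholders.
columns = [f"value_{i}" for i in range(1, len(cells) + 1)]
df = pd.DataFrame([cells], index=[label.get_text(strip=True)], columns=columns)
print(df)
```

Two things in the question's code would likely still need attention on a real page. `BeautifulSoup(url, "html.parser")` parses the URL string itself rather than the page behind it, so the HTML has to be fetched first. And `soup.select('#Net Taxes Due')` is a CSS id selector that returns a list, so it neither searches by text nor supports calling `find_next_siblings` on its result.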
