Beautiful Soup: find the children of a particular div
Question
I am trying to parse a webpage that looks like this with Python and Beautiful Soup:
I am trying to extract the contents of the highlighted td. Currently I can get all the tds with
alltd = soup.find_all('td')
for td in alltd:
    print(td)
But I am trying to narrow the scope to only the tds inside the element with the class "tablebox". That will probably still return 30+, but it is a more manageable number than 300+.
How can I extract the contents of the highlighted td in the picture above?
Recommended answer
It is useful to know that any element BeautifulSoup finds inside another element has the same type as its parent (a Tag), so the same search methods, such as find_all, can be called on it to search only within that element.
So here is some working code for your example:
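from bs4 import BeautifulSoup

# `html` is the page source as a string, as in the question.
soup = BeautifulSoup(html, "html.parser")
# Find every div with the class "tablebox", then search only inside each one.
for div in soup.find_all("div", {"class": "tablebox"}):
    for td in div.find_all("td", {"class": "align-right"}):
        print(td.text)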
This prints the text of every td tag with the class "align-right" that sits inside a div with the class "tablebox". Note that find_all searches all descendants, not only direct children.
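The same narrowing can also be sketched in a self-contained form. The markup below is hypothetical, since the real page from the question is not shown, and the function names are made up for illustration. It compares the nested find_all approach with a CSS selector passed to select, which should give the same cells under these assumptions:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page from the question is not available.
SAMPLE_HTML = """
<div class="tablebox">
  <table>
    <tr><td class="align-left">Name</td><td class="align-right">42</td></tr>
    <tr><td class="align-left">Other</td><td class="align-right">7</td></tr>
  </table>
</div>
<div class="sidebar">
  <table><tr><td class="align-right">ignored</td></tr></table>
</div>
"""


def cells_via_find_all(html):
    """Nested search: find each div.tablebox, then its td.align-right cells."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        td.get_text(strip=True)
        for div in soup.find_all("div", class_="tablebox")
        for td in div.find_all("td", class_="align-right")
    ]


def cells_via_select(html):
    """CSS selector: any td.align-right nested anywhere under div.tablebox."""
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(strip=True) for td in soup.select("div.tablebox td.align-right")]


if __name__ == "__main__":
    print(cells_via_find_all(SAMPLE_HTML))
    print(cells_via_select(SAMPLE_HTML))
```

The selector version is shorter, while the nested version makes it easier to do extra work per div, for example skipping some of them.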