BeautifulSoup partial div class matching
Problem description
I need to fetch milestone information from GitHub by scraping.
The milestone information is embedded in divs with two kinds of class attribute: table-list-item milestone notdue and table-list-item milestone.
How can I retrieve the information contained in both classes?
I have:
milestones = soup.find_all('div', {'class': 'table-list-item milestone'})
but this line returns table-list-item milestone notdue
Right now I am doing the following (ugly hack):
milestones = soup.find_all('div', {'class':'table-list-item milestone'})
milestones.extend(soup.findAll('div', {'class': 'table-list-item milestone notdue'}))
Is there a more elegant solution for this?
As per this question, BeautifulSoup is supposed to return all matching elements. My issue is exactly the opposite!
Recommended answer
soup.find_all('div', {'class': 'milestone'})
Or use a CSS selector:
soup.select('.milestone')
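If matching on milestone alone turns out to be too broad for a given page, both approaches can be narrowed to divs that carry both shared classes. The following is a minimal sketch, assuming a hypothetical HTML snippet in place of the real GitHub page; the markup and the variable names are illustrative, not taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the GitHub milestones page.
html = """
<div class="table-list-item milestone notdue">Milestone A</div>
<div class="table-list-item milestone">Milestone B</div>
<div class="table-list-item">Not a milestone</div>
<span class="milestone">Not a div</span>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector: a div that has both classes, regardless of any extra classes.
by_css = soup.select("div.table-list-item.milestone")

# find_all version: match one class value, then filter on the other.
by_find = [
    div
    for div in soup.find_all("div", class_="milestone")
    if "table-list-item" in div.get("class", [])
]

css_texts = [div.get_text() for div in by_css]
find_texts = [div.get_text() for div in by_find]
print(css_texts)
print(find_texts)
```

Both lists should contain the same two divs; the CSS form is usually the shorter way to require several classes at once.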
In bs4, class is a multi-valued attribute:
Its value is stored as a list: ['table-list-item', 'milestone', 'notdue'] and ['table-list-item', 'milestone']
What you need to do is search for a value the two lists share, such as milestone.
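To make the multi-valued behaviour concrete, here is a small sketch, again assuming a hypothetical two-div snippet rather than the real GitHub markup. It prints the class lists bs4 builds and compares a search on one shared value with a search on the full class string.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with the two class combinations from the question.
html = """
<div class="table-list-item milestone notdue">Milestone A</div>
<div class="table-list-item milestone">Milestone B</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Each class attribute is split on whitespace into a list of values.
class_lists = [div["class"] for div in soup.find_all("div")]
print(class_lists)

# Searching for a single shared value matches both divs.
shared = soup.find_all("div", {"class": "milestone"})

# Searching for the whole string only matches a div whose class
# attribute is exactly that string, so the notdue div is skipped.
exact = soup.find_all("div", {"class": "table-list-item milestone"})

print(len(shared), len(exact))
```

The exact-string search is also sensitive to the order of the class names, which is another reason to match on a single shared value instead.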