BeautifulSoup partial div class matching

Question

I need to fetch milestone information from GitHub by scraping. The milestone information is embedded in divs with one of two class attributes: table-list-item milestone notdue and table-list-item milestone.

How can I retrieve the information contained in both classes?

I have:

milestones = soup.find_all('div', {'class': 'table-list-item milestone'})

but this line does not return the divs whose class is table-list-item milestone notdue.

Right now I am doing the following (ugly hack):

milestones = soup.find_all('div', {'class':'table-list-item milestone'})
milestones.extend(soup.find_all('div', {'class': 'table-list-item milestone notdue'}))

Is there any elegant solution for this?

As per this question, BeautifulSoup is supposed to return all matching ones. My issue is exactly opposite!

Recommended answer

soup.find_all('div', {'class': 'milestone'})

Or use a CSS selector:

soup.select('.milestone')
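
Both calls above can be tried on a small sample. The following is a minimal sketch: the HTML is hypothetical markup standing in for the GitHub milestones page, and the variable names are illustrative only. The selector is written as div.milestone to restrict the match to div elements, as find_all('div', ...) does.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the GitHub milestones page.
html = """
<div class="table-list-item milestone notdue">Milestone A</div>
<div class="table-list-item milestone">Milestone B</div>
<div class="table-list-item">Not a milestone</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Searching for a single class value matches every div that carries it,
# regardless of which other classes the div has.
by_find_all = soup.find_all("div", class_="milestone")

# The CSS selector form, restricted to div elements.
by_select = soup.select("div.milestone")

find_all_texts = [div.get_text(strip=True) for div in by_find_all]
select_texts = [div.get_text(strip=True) for div in by_select]
```

Under these assumptions both lists contain the two milestone divs and skip the third div, which has no milestone class.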

In bs4, class is a multi-valued attribute:

It is stored as a list: ['table-list-item', 'milestone', 'notdue'] and ['table-list-item', 'milestone']
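
The list storage can be inspected directly, and it also explains the behaviour in the question. This is a rough sketch on hypothetical markup: a search string containing several class names is compared against the whole class attribute string, so it only matches divs whose class attribute is exactly that string.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with the two class combinations from the question.
html = (
    '<div class="table-list-item milestone notdue">A</div>'
    '<div class="table-list-item milestone">B</div>'
)
soup = BeautifulSoup(html, "html.parser")

# Each div exposes its class attribute as a list of separate values.
class_lists = [div["class"] for div in soup.find_all("div")]

# A multi-word string is matched against the full attribute string,
# so the div that also carries "notdue" is not returned here.
exact_only = soup.find_all("div", {"class": "table-list-item milestone"})
exact_texts = [div.get_text() for div in exact_only]
```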

What you need to do is search for a value that both lists share, such as milestone.
