BeautifulSoup findAll() given multiple classes?

Problem description

I would like to scrape a list of items from a website and preserve the order in which they are presented. These items are organized in a table, but each one can have either of two different classes (in no particular order).

Is there any way to provide multiple classes and have BeautifulSoup4 find all items which are in any of the given classes?

I need to achieve what this code does, except preserve the order of items as it was in the source code:

items = soup.findAll(True,{'class':'class1'})
items += soup.findAll(True,{'class':'class2'})

Solution

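You can do this by passing a list of class names. An element matches if it has any of the listed classes, and the results come back in document order: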

soup.findAll(True, {'class':['class1', 'class2']})

Example:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<html><body><div class="class1"></div><div class="class2"></div><div class="class3"></div></body></html>', 'html.parser')
>>> soup.findAll(True, {"class":["class1", "class2"]})
[<div class="class1"></div>, <div class="class2"></div>]
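Since the question is about table rows whose classes are interleaved, here is a minimal sketch of the same idea with the newer find_all() spelling and the class_ keyword. The markup and the names html, rows and texts are made up for illustration and are not from the original question; the point is only that one call with a list of classes keeps the rows in source order, unlike two separate calls joined with +=.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: rows carry one of two classes, interleaved,
# plus one row with an unrelated class that should be skipped.
html = """
<table>
  <tr class="class2"><td>a</td></tr>
  <tr class="class1"><td>b</td></tr>
  <tr class="class3"><td>c</td></tr>
  <tr class="class1"><td>d</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# A list passed to class_ matches elements that have any of the listed classes.
# find_all() walks the tree once, so matches are returned in document order.
rows = soup.find_all("tr", class_=["class1", "class2"])
texts = [row.get_text(strip=True) for row in rows]

print(texts)  # ['a', 'b', 'd']
```

Restricting the search to "tr" instead of True is a design choice for this sketch; passing True as in the answer above would match any tag with one of those classes.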
