BeautifulSoup and Searching By Class
Question
Possible duplicate: Beautiful Soup cannot find a CSS class if the object has other classes too (http://stackoverflow.com/questions/1242755/beautiful-soup-cannot-find-a-css-class-if-the-object-has-other-classes-too)
I'm using BeautifulSoup to find tables in the HTML. The problem I am currently running into is the use of spaces in the class attribute. If my HTML reads <html><table class="wikitable sortable">blah</table></html>, I can't seem to extract it with the following (where I want to be able to find tables with both wikitable and wikitable sortable for the class):
BeautifulSoup(html).findAll(attrs={'class':re.compile("wikitable( sortable)?")})
This will find the table if my HTML is just <html><table class="wikitable">blah</table></html>, though. Likewise, I have tried using "wikitable sortable" in my regular expression, and that won't match either. Any ideas?
Recommended answer
The pattern match will also fail if wikitable appears after another CSS class, as in class="something wikitable other", so if you want all tables whose class attribute contains the class wikitable, you need a pattern that accepts more possibilities:
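import re
# On the legacy BeautifulSoup 3 package: from BeautifulSoup import BeautifulSoup
from bs4 import BeautifulSoup

html = '''<html><table class="sortable wikitable other">blah</table>
<table class="wikitable sortable">blah</table>
<table class="wikitable"><blah></table></html>'''
tree = BeautifulSoup(html)
# \b keeps "wikitable" a whole word wherever it sits in the class attribute
for node in tree.findAll(attrs={'class': re.compile(r".*\bwikitable\b.*")}):
    print(node)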
Result:
<table class="sortable wikitable other">blah</table>
<table class="wikitable sortable">blah</table>
<table class="wikitable"><blah></blah></table>
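To see what the pattern accepts independently of BeautifulSoup, here is a minimal sketch that applies it to bare class attribute strings using only the standard library. The helper names has_class_regex and has_class_split are hypothetical, introduced only for illustration; the second one is an assumed alternative (splitting on whitespace), not something the answer itself proposes.

```python
import re

# Same pattern as in the answer: "wikitable" as a whole word, anywhere in the value.
PATTERN = re.compile(r".*\bwikitable\b.*")


def has_class_regex(class_attr):
    """Hypothetical helper: test a class attribute string with the regex."""
    return PATTERN.match(class_attr) is not None


def has_class_split(class_attr, name="wikitable"):
    """Hypothetical helper: compare against whitespace-separated class names."""
    return name in class_attr.split()


if __name__ == "__main__":
    samples = [
        "sortable wikitable other",
        "wikitable sortable",
        "wikitable",
        "mywikitable",
        "wikitable-old",
    ]
    for value in samples:
        print(value, has_class_regex(value), has_class_split(value))
```

One thing to keep in mind: \b treats a hyphen as a word boundary, so the regex also accepts a value such as "wikitable-old", whereas splitting on whitespace only accepts the exact class name. Which behaviour is wanted depends on the markup being scraped.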
Just for the record, I don't use BeautifulSoup, and prefer to use lxml, as others have mentioned.