How to find tags with only certain attributes - BeautifulSoup
Question
How would I, using BeautifulSoup, search for tags containing ONLY the attributes I search for?
For example, I want to find all <td valign="top"> tags.
The following code:
raw_card_data = soup.fetch('td', {'valign': re.compile('top')})
gets all of the data I want, but also grabs any <td> tag that has the attribute valign:top
I also tried:
raw_card_data = soup.findAll(re.compile('<td valign="top">'))
and this returns nothing (probably because of bad regex)
I was wondering if there was a way in BeautifulSoup to say "Find <td> tags whose only attribute is valign:top".
UPDATE
For example, if an HTML document contained the following <td> tags:
<td valign="top">.....</td><br />
<td width="580" valign="top">.......</td><br />
<td>.....</td><br />
I would want only the first <td> tag (<td valign="top">) to be returned.
Recommended answer
As explained in the BeautifulSoup documentation (http://www.crummy.com/software/BeautifulSoup/documentation.html#The%20basic%20find%20method%3a%20findAll%28name,%20attrs,%20recursive,%20text,%20limit,%20%2a%2akwargs%29), you can use this:
soup = BeautifulSoup(html)
results = soup.findAll("td", {"valign" : "top"})
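For readers on the newer bs4 package, here is a minimal sketch of the same call. The bs4 import, the "html.parser" argument, and the find_all spelling are assumptions on my part; the answer itself uses the older BeautifulSoup 3 API. It reuses the sample markup from the question's update and shows that an attrs filter only constrains the attributes it lists, so a <td> with extra attributes still matches.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the question's update.
html = (
    '<td valign="top">.....</td>'
    '<td width="580" valign="top">.......</td>'
    '<td>.....</td>'
)

# "html.parser" is the parser bundled with Python (an assumption made here
# so that no extra parser library is required).
soup = BeautifulSoup(html, "html.parser")

# The attrs dict filters on the listed attributes only; any other
# attributes a tag carries are ignored by the match.
results = soup.find_all("td", {"valign": "top"})

for tag in results:
    # tag.attrs is a dict of every attribute on the matched tag.
    print(tag.attrs)
```

Both the plain valign="top" cell and the one that also has width="580" come back, which is the behaviour the question is trying to avoid.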
EDIT:
To return tags that have only the valign="top" attribute, you can check the length of the tag's attrs property:
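from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 import (Python 2)

html = '<td valign="top">.....</td>\
<td width="580" valign="top">.......</td>\
<td>.....</td>'

soup = BeautifulSoup(html)
# Match every <td> that has valign="top", whatever other attributes it carries
results = soup.findAll("td", {"valign" : "top"})

for result in results :
    # Keep only the tags whose sole attribute is valign
    if len(result.attrs) == 1 :
        print result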
Returns:
<td valign="top">.....</td>
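The length check can also be folded into the search itself. Below is a rough sketch for the newer bs4 package under Python 3 (both assumptions; the code above targets BeautifulSoup 3). It passes a function to find_all and compares the whole attrs dict against the wanted attributes. The helper name has_only_attrs is hypothetical and not part of the library.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the question's update.
html = (
    '<td valign="top">.....</td>'
    '<td width="580" valign="top">.......</td>'
    '<td>.....</td>'
)


def has_only_attrs(tag, name, wanted):
    """Hypothetical helper: True when `tag` is a `name` element whose
    attributes are exactly the `wanted` dict, with nothing extra."""
    # In bs4, tag.attrs is a dict, so equality checks keys and values at once.
    # Caveat: multi-valued attributes such as class are stored as lists.
    return tag.name == name and tag.attrs == wanted


# "html.parser" is the parser bundled with Python (assumed here to avoid
# requiring an extra parser library).
soup = BeautifulSoup(html, "html.parser")

# find_all accepts a function that receives each tag and returns True/False.
exact = soup.find_all(lambda tag: has_only_attrs(tag, "td", {"valign": "top"}))

for tag in exact:
    print(tag)
```

Comparing the full dict is slightly stricter than len(result.attrs) == 1 paired with the attrs filter, but for a single wanted attribute the two approaches select the same tags.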