Using BeautifulSoup to find a tag with two specific styles

Question

I am trying to use the BeautifulSoup (bs4) package in Python 2.7 to find the following tag in an HTML document:

<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:408px; top:540px; width:14px; height:9px;"><span style="font-family: OEULZL+ArialMT; font-size:9px">0.00<br></span></div>

The HTML document contains several other tags that are almost identical. The only consistent differences are the "left:408px" and "height:9px" properties in the style attribute.

How can I find this tag using BeautifulSoup?

I have tried the following:

from bs4 import BeautifulSoup as bs

# The outer string uses single quotes so the double quotes inside the HTML do not end it early.
soup = bs('<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:408px; top:540px; width:14px; height:9px;"><span style="font-family: OEULZL+ArialMT; font-size:9px">0.00<br></span></div>', 'html.parser')

soup.find_all('div', style=('left:408px' and 'height:9px'))
soup.find_all('div', style=('left:408px') and style=('height:9px')) #doesn't like style being used twice
soup.find_all('div', {'left':'408px' and 'height':'9px'})
soup.find_all('div', {'left:408px'} and {'height:9px'})
soup.find_all('div', style={'left':'408px' and 'height':'9px'})
soup.find_all('div', style={'left:408px'} and {'height:9px'})

Any ideas?

Recommended answer

You can check that the style attribute contains both left:408px and height:9px:

soup.find('div', style=lambda value: value and 'left:408px' in value and 'height:9px' in value)

Or, with a regular expression (this requires left:408px to appear before height:9px in the attribute):

import re
soup.find('div', style=re.compile(r'left:408px.*?height:9px'))

Or, with a CSS selector:

soup.select_one('div[style*="408px"]')
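
To see how the three variants behave when near-identical tags are present, here is a minimal sketch. The two-div sample document and the variable names are made up for illustration; only the three lookups come from the answer, and the code is written for Python 3.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample: two near-identical divs that differ only in left/height.
html = (
    '<div style="position:absolute; left:100px; top:540px; width:14px; height:12px;">'
    '<span>1.23</span></div>'
    '<div style="position:absolute; left:408px; top:540px; width:14px; height:9px;">'
    '<span>0.00</span></div>'
)
soup = BeautifulSoup(html, 'html.parser')

# 1. Function filter: both substrings must be present, in any order.
#    "value and ..." guards against divs that have no style attribute (value is None).
by_function = soup.find(
    'div',
    style=lambda value: value and 'left:408px' in value and 'height:9px' in value,
)

# 2. Regex filter: matches only if left:408px comes before height:9px.
by_regex = soup.find('div', style=re.compile(r'left:408px.*?height:9px'))

# 3. CSS attribute-substring selector: matches any style containing "408px".
by_css = soup.select_one('div[style*="408px"]')

print(by_function.get_text(), by_regex.get_text(), by_css.get_text())
```

All three lookups should select the second div here. The CSS variant is the loosest of the three, since it would also match a tag with top:408px or width:408px.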

Note that, in general, style properties are not a reliable way to locate elements. See if there is anything else to go on: check the parent and sibling elements, or whether there is a corresponding label near the element.
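
If the style attribute really is the only thing that distinguishes the element, one way to make the match less brittle is to parse the declarations rather than test raw substrings. A plain substring test such as 'left:408px' in value also matches margin-left:408px, and it misses left: 408px written with a space. The sketch below uses two hypothetical helpers, parse_style and has_declarations, which are not part of bs4, and it assumes simple inline styles with no semicolons inside values.

```python
def parse_style(value):
    """Split an inline style string into a {property: value} dict."""
    declarations = {}
    for part in (value or '').split(';'):
        name, sep, val = part.partition(':')
        if sep:  # skip empty or malformed fragments
            declarations[name.strip().lower()] = val.strip()
    return declarations


def has_declarations(value, wanted):
    """Return True if every property in `wanted` has exactly the given value."""
    declarations = parse_style(value)
    return all(declarations.get(name) == val for name, val in wanted.items())
```

It can be passed to find as a function filter, for example soup.find('div', style=lambda v: has_declarations(v, {'left': '408px', 'height': '9px'})).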

Note that a more appropriate CSS selector would be div[style*="left:408px"][style*="height:9px"], but because of BeautifulSoup's limited CSS selector support and a known bug, it does not work as is.
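
That limitation describes the bs4 release available when the answer was written. Newer releases (4.7 and later, where select() is backed by the soupsieve package) are expected to accept chained attribute selectors. A minimal sketch, assuming such a version is installed and using a made-up two-div sample:

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the first div shares left:408px but has a different height.
html = (
    '<div style="left:408px; top:540px; height:12px;"><span>1.23</span></div>'
    '<div style="left:408px; top:540px; height:9px;"><span>0.00</span></div>'
)
soup = BeautifulSoup(html, 'html.parser')

# Both attribute conditions must hold on the same div (assumes bs4 >= 4.7).
match = soup.select_one('div[style*="left:408px"][style*="height:9px"]')
print(match.get_text())
```

On an older installation where this selector raises an error or returns nothing, the function filter shown earlier remains the safer choice.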
