How to pull out CSS attributes from inline styles with BeautifulSoup

Views: 541 · Published: 2016/8/5 18:59:45 · Tags: python, css, inline, beautifulsoup


Question

I have something like this:

<img style="background:url(/theRealImage.jpg) no-repeat 0 0; height:90px; width:92px;" src="notTheRealImage.jpg"/>

I am using BeautifulSoup to parse the HTML. Is there a way to pull out the "url" in the "background" CSS property?

Recommended answer

You've got a couple of options: quick and dirty, or the Right Way. The quick and dirty way (which will break easily if the markup changes) looks like this:

>>> import re
>>> from bs4 import BeautifulSoup  # BeautifulSoup 4 (pip install beautifulsoup4)
>>> html = '<html><body><img style="background:url(/theRealImage.jpg) no-repeat 0 0; height:90px; width:92px;" src="notTheRealImage.jpg"/></body></html>'
>>> soup = BeautifulSoup(html, 'html.parser')
>>> style = soup.find('img')['style']  # raw text of the inline style attribute
>>> urls = re.findall(r'url\((.*?)\)', style)  # non-greedy: everything inside url(...)
>>> urls
['/theRealImage.jpg']

Obviously, you'd have to play around with that to get it to work with multiple img tags.
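One way the multiple-tag case might look is sketched below. It assumes BeautifulSoup 4 with the built-in `html.parser`; the helper name `background_urls` and the sample markup are made up for illustration, and the regex is loosened slightly to tolerate quoted `url('...')` values. It is still the quick and dirty approach, just applied in a loop.

```python
import re
from bs4 import BeautifulSoup

# Matches url(...) with optional single or double quotes around the target.
URL_RE = re.compile(r"""url\(\s*['"]?(.*?)['"]?\s*\)""")


def background_urls(html):
    """Return the url(...) targets found in the inline style of every <img>.

    Hypothetical helper, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    # style=True skips <img> tags that have no style attribute at all.
    for img in soup.find_all("img", style=True):
        urls.extend(URL_RE.findall(img["style"]))
    return urls


# Illustrative markup: unquoted url, quoted url, and an <img> with no style.
html = (
    '<img style="background:url(/a.jpg) no-repeat 0 0;" src="x.jpg"/>'
    "<img style=\"background:url('/b.jpg')\" src=\"y.jpg\"/>"
    '<img src="z.jpg"/>'
)
print(background_urls(html))
```

Filtering with `style=True` avoids a `KeyError` on tags that carry no inline style, which is the first thing that tends to go wrong when moving from `find` to `find_all`.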

The Right Way, since I'd feel awful suggesting someone use regex on a CSS string :), uses a CSS parser. cssutils, a library I just found on Google and available on PyPI, looks like it might do the job.
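To illustrate why working with parsed declarations is sturdier than pattern-matching the whole string, here is a rough standard-library sketch. It is not cssutils and not a real CSS parser: `parse_inline_style` and `background_url` are hypothetical helpers that split naively on `;` and `:`, so they mishandle values containing a semicolon (data URIs, for example). Treat it as a picture of the idea only; a proper parser handles those cases.

```python
def parse_inline_style(style):
    """Split an inline style string into a {property: value} dict.

    Naive on purpose: splitting on ';' breaks on values that contain one.
    """
    declarations = {}
    for chunk in style.split(";"):
        if ":" not in chunk:
            continue  # skip empty or malformed pieces
        # Partition on the first ':' only, so URLs like http://... survive.
        name, _, value = chunk.partition(":")
        declarations[name.strip().lower()] = value.strip()
    return declarations


def background_url(style):
    """Return the url(...) target of the background property, or None."""
    value = parse_inline_style(style).get("background", "")
    start = value.find("url(")
    if start == -1:
        return None
    end = value.find(")", start)
    if end == -1:
        return None
    # Drop the 'url(' prefix, then any surrounding whitespace and quotes.
    return value[start + len("url("):end].strip().strip("'\"")


style = "background:url(/theRealImage.jpg) no-repeat 0 0; height:90px; width:92px;"
print(parse_inline_style(style)["height"])
print(background_url(style))
```

Looking up the `background` declaration first means a `url(...)` that appears in some other property is not picked up by accident.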
