Remove height and width from inline styles
Problem description
I'm using BeautifulSoup to remove inline heights and widths from my elements. Solving it for images was simple:
def remove_dimension_tags(tag):
    for attribute in ["width", "height"]:
        del tag[attribute]
    return tag
But I'm not sure how to go about processing something like this:
<div id="attachment_9565" class="wp-caption aligncenter" style="width: 2010px;background-color:red">
when I would want to leave the background-color (for example) or any other style attributes other than height or width.
The only way I can think of doing it is with a regex but last time I suggested something like that the spirit of StackOverflow came out of my computer and murdered my first-born.
A full walk-through would be:
from bs4 import BeautifulSoup
import re

string = """
<div id="attachment_9565" class="wp-caption aligncenter" style="width: 2010px;background-color:red">
<p>Some line here</p>
<hr/>
<p>Some other beautiful text over here</p>
</div>
"""

# match a width or height declaration: the property name, a colon,
# then everything up to (and including) the next ";"
rx = re.compile(r'(?:width|height):[^;]+;?')

soup = BeautifulSoup(string, "html5lib")

# only visit divs that actually carry a style attribute
for div in soup.find_all('div', style=True):
    # apply the pattern to the style value, not to the whole HTML string
    div['style'] = rx.sub("", div['style'])
As stated by others, using regular expressions on the actual attribute value (as opposed to the HTML markup itself) is not a problem.
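One caveat with a pattern such as (?:width|height) is that it also matches inside longer property names like max-width or line-height. As an alternative, here is a minimal sketch that splits the style value into declarations and compares property names exactly. The helper name strip_dimensions and its blocked parameter are made up for illustration, and the sketch assumes no value contains a literal ";" (a data: URL would break it).

```python
def strip_dimensions(style, blocked=("width", "height")):
    """Return the style string without the blocked properties.

    Hypothetical helper: assumes declarations are separated by ";"
    and that no value contains a ";" of its own.
    """
    kept = []
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        if not sep:
            continue  # empty or malformed fragment, e.g. after a trailing ";"
        if name.strip().lower() in blocked:
            continue  # exact match only, so "max-width" is left alone
        kept.append(f"{name.strip()}: {value.strip()}")
    return "; ".join(kept)


print(strip_dimensions("width: 2010px;background-color:red"))
```

With BeautifulSoup this would slot into the loop as div['style'] = strip_dimensions(div['style']). Note that the kept declarations are re-joined with normalized spacing, so the output is equivalent to, but not byte-identical with, the input.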