Custom indent width for BeautifulSoup .prettify()
Question
Is there any way to define a custom indent width for the .prettify() function? From what I can get from its source:
def prettify(self, encoding=None, formatter="minimal"):
    if encoding is None:
        return self.decode(True, formatter=formatter)
    else:
        return self.encode(encoding, True, formatter=formatter)
There is no way to specify the indent width. I think it's because of this line in the decode_contents() function:
s.append(" " * (indent_level - 1))
Which has a fixed length of 1 space! (WHY!!) I tried specifying indent_level=4, and that just results in this:
   <section>
    <article>
     <h1>
     </h1>
     <p>
     </p>
    </article>
   </section>
Which looks just plain stupid. :|
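For what it's worth, the effect is easy to reproduce with a toy model of that indentation logic. This is only a sketch, not bs4's actual code: `render` and the tuple-based tree are made up for illustration, and the only thing borrowed from the library is the `" " * (indent_level - 1)` expression quoted above. It shows that a larger starting indent_level shifts the whole block to the right while each nesting level still adds a single space.

```python
def render(tree, indent_level=1):
    """Toy pretty-printer; tree is a (tag_name, [children]) tuple (hypothetical)."""
    name, children = tree
    # Same expression as the quoted line: one space per level, minus one.
    pad = " " * (indent_level - 1)
    lines = [f"{pad}<{name}>"]
    for child in children:
        # Each nesting level bumps indent_level by exactly one.
        lines.extend(render(child, indent_level + 1))
    lines.append(f"{pad}</{name}>")
    return lines


doc = ("section", [("article", [("h1", []), ("p", [])])])
print("\n".join(render(doc, indent_level=4)))
```

Starting at indent_level=4 puts the outermost tag at 3 spaces, and every child is only one space deeper than its parent.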
Now, I can hack this away, but I just want to be sure if there is anything I'm missing. Because this should be a basic feature. :-/
If you have some better way of prettifying HTML codes, let me know.
Recommended answer
I actually dealt with this myself, in the hackiest way possible: by post-processing the result.
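import re
# Capture each line's leading whitespace.
r = re.compile(r'^(\s*)', re.MULTILINE)
def prettify_2space(s, encoding=None, formatter="minimal"):
    # Double the indentation; assumes encoding is None so prettify() returns str.
    return r.sub(r'\1\1', s.prettify(encoding, formatter))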
Actually, I monkeypatched prettify_2space in place of prettify in the class. That's not essential to the solution, but let's do it anyway, and make the indent width a parameter instead of hardcoding it to 2:
import re
import bs4
orig_prettify = bs4.BeautifulSoup.prettify
r = re.compile(r'^(\s*)', re.MULTILINE)
def prettify(self, encoding=None, formatter="minimal", indent_width=4):
    # Repeat each line's leading whitespace indent_width times.
    return r.sub(r'\1' * indent_width, orig_prettify(self, encoding, formatter))
bs4.BeautifulSoup.prettify = prettify
So:
x = '''<section><article><h1></h1><p></p></article></section>'''
soup = bs4.BeautifulSoup(x)
print(soup.prettify(indent_width=3))
... gives:
<html>
   <body>
      <section>
         <article>
            <h1>
            </h1>
            <p>
            </p>
         </article>
      </section>
   </body>
</html>
Obviously if you want to patch Tag.prettify as well as BeautifulSoup.prettify, you have to do the same thing there. (You might want to create a generic wrapper that you can apply to both, instead of repeating yourself.) And if there are any other prettify methods, same deal.
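A generic wrapper along those lines could look roughly like this. It's a sketch, not a tested drop-in: `with_indent_width` is a made-up name, it assumes the wrapped method returns text indented with one space per level, and it deliberately passes byte output (when an encoding is requested) through unchanged. It also matches only leading spaces rather than `\s*`, so blank lines are not accidentally duplicated.

```python
import functools
import re

# Leading spaces at the start of every line (hypothetical helper pattern).
_LEADING_SPACES = re.compile(r"^( *)", re.MULTILINE)


def with_indent_width(original_prettify):
    """Wrap a prettify-style method so it accepts an indent_width keyword."""

    @functools.wraps(original_prettify)
    def wrapper(self, *args, indent_width=4, **kwargs):
        text = original_prettify(self, *args, **kwargs)
        if isinstance(text, bytes):
            # An encoding was requested; this sketch leaves bytes untouched.
            return text
        # Repeat each line's leading spaces indent_width times.
        return _LEADING_SPACES.sub(lambda m: m.group(1) * indent_width, text)

    return wrapper
```

With bs4 installed, something like `bs4.Tag.prettify = with_indent_width(bs4.Tag.prettify)` should be enough to apply it. Since BeautifulSoup subclasses Tag, patching Tag ought to cover soup objects too, unless BeautifulSoup.prettify has already been replaced directly by an earlier patch.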