Strip whitespace in generated HTML using pure Python code
Question
I am using Jinja2 to generate HTML files that are typically very large. I noticed that the generated HTML contains a lot of whitespace. Is there a pure-Python tool I can use to minimize this HTML? By "minimize" I mean removing unnecessary whitespace from the HTML (much like Google does; look at the source of google.com, for instance).
I don't want to rely on external libraries or executables such as tidy for this.
To clarify further, there is virtually no JavaScript code, only HTML content.
Recommended answer
If you just want to get rid of excess whitespace, you can use:
>>> import re
>>> html_string = re.sub(r'\s\s+', ' ', html_string)
Or:
>>> html_string = ' '.join(html_string.split())
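The two snippets above are not quite equivalent, which a small comparison makes visible. In this sketch the function names `collapse_runs` and `normalize_all` and the sample markup are made up for illustration; only `re.sub`, `str.split` and `str.join` come from the answer itself.

```python
import re

def collapse_runs(html_string):
    # Replaces only runs of two or more whitespace characters with one space.
    # A lone newline or tab is left untouched, as is a single trailing newline.
    return re.sub(r'\s\s+', ' ', html_string)

def normalize_all(html_string):
    # Splits on every whitespace run (of any length) and rejoins with one space.
    # This also strips leading and trailing whitespace.
    return ' '.join(html_string.split())

# Hypothetical sample input, standing in for Jinja2 output.
sample = "<ul>\n    <li>one</li>\n\t<li>two   three</li>\n</ul>\n"

print(repr(collapse_runs(sample)))
print(repr(normalize_all(sample)))
```

The regex version keeps the single newline before `</ul>` and the one at the end of the string, while the split/join version turns every whitespace run into exactly one space and drops the trailing newline.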
If you want to do anything more complicated than stripping excess whitespace, you'll need more powerful tools (or more complex regexps). Note that both snippets also collapse whitespace inside elements such as <pre> and <textarea>, where it is significant.
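As one example of a "more complex regexp", here is a minimal sketch that collapses whitespace everywhere except inside elements whose whitespace matters. It is not a real HTML parser: the tag list, the function name `minify_html` and the helper `_collapse` are assumptions for illustration, and the pattern will not cope with nested `<pre>` elements or with a `>` inside an attribute value.

```python
import re

# Elements whose contents are copied through verbatim.
# This tag list is an assumption; adjust it to your templates.
_PRESERVE = re.compile(
    r'<(pre|textarea|script|style)\b[^>]*>.*?</\1\s*>',
    re.DOTALL | re.IGNORECASE,
)

def _collapse(text):
    # Turn every run of whitespace into a single space.
    return re.sub(r'\s+', ' ', text)

def minify_html(html):
    parts = []
    pos = 0
    for match in _PRESERVE.finditer(html):
        # Collapse whitespace in the markup before the protected element...
        parts.append(_collapse(html[pos:match.start()]))
        # ...and keep the protected element exactly as it was.
        parts.append(match.group(0))
        pos = match.end()
    # Collapse whatever follows the last protected element.
    parts.append(_collapse(html[pos:]))
    return ''.join(parts)

# Hypothetical sample input.
page = "<div>\n   <p>Hello    world</p>\n   <pre>  a\n    b </pre>\n</div>\n"
print(repr(minify_html(page)))
```

Whitespace runs are reduced to one space rather than removed entirely, because a space between inline elements can affect how the page renders.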