Python: How to pretty-print HTML into a file

Question

I am using lxml.html to generate some HTML. I want to pretty-print (with indentation) my final result into an HTML file. How do I do that?

This is what I have tried so far (I am relatively new to Python and lxml):

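import lxml.html as lh
from lxml.html import builder as E

# Build an outer <div class="scroll"> containing an inner <div class="scrollContainer">
sliderRoot = lh.Element("div", E.CLASS("scroll"), style="overflow-x: hidden; overflow-y: hidden;")
scrollContainer = lh.Element("div", E.CLASS("scrollContainer"), style="width: 4340px;")
sliderRoot.append(scrollContainer)

# Serialize with pretty_print=True, expecting indented output
print(lh.tostring(sliderRoot, pretty_print=True, method="html", encoding="unicode"))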

As you can see, I am passing pretty_print=True. I thought that would give indented output, but it doesn't really help. This is the output:

<div style="overflow-x: hidden; overflow-y: hidden;" class="scroll"><div style="width: 4340px;" class="scrollContainer"></div></div>

Recommended answer

I ended up using BeautifulSoup directly. It is the library that lxml.html.soupparser uses for parsing HTML.

BeautifulSoup has a prettify method that does exactly what its name says: it returns the HTML with each tag on its own line, indented according to its nesting depth.

BeautifulSoup will NOT fix the HTML, so broken markup remains broken. In this case, though, the markup is generated by lxml, so it should at least be structurally valid.

For the example given in my question, I have to do this:

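import lxml.html as lh               # sliderRoot is the element built in the question
from bs4 import BeautifulSoup as bs  # BeautifulSoup 4 (the bs4 package)

root = lh.tostring(sliderRoot)       # convert the generated HTML to a string
soup = bs(root, "html.parser")       # parse it with BeautifulSoup
prettyHTML = soup.prettify()         # prettify the HTML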
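
The snippet above stops at the prettified string, while the question asks for a file. A minimal end-to-end sketch might look like the following. It assumes BeautifulSoup 4 (the bs4 package) with the standard-library "html.parser" backend; the helper names build_slider and write_pretty_html and the filename slider.html are hypothetical, chosen only for illustration.

```python
from pathlib import Path

import lxml.html as lh
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed
from lxml.html import builder as E


def build_slider():
    """Recreate the two nested <div> elements from the question."""
    slider_root = lh.Element("div", E.CLASS("scroll"),
                             style="overflow-x: hidden; overflow-y: hidden;")
    scroll_container = lh.Element("div", E.CLASS("scrollContainer"),
                                  style="width: 4340px;")
    slider_root.append(scroll_container)
    return slider_root


def write_pretty_html(element, path):
    """Serialize an lxml element, indent it with BeautifulSoup, and save it.

    Hypothetical helper: returns the prettified text that was written.
    """
    raw = lh.tostring(element, encoding="unicode")  # serialize to a str
    soup = BeautifulSoup(raw, "html.parser")        # re-parse for prettifying
    pretty = soup.prettify()                        # one tag per line, indented
    Path(path).write_text(pretty, encoding="utf-8")
    return pretty


if __name__ == "__main__":
    # "slider.html" is a hypothetical output filename
    print(write_pretty_html(build_slider(), "slider.html"))
```

Note that prettify adds whitespace between tags, which can change how inline content renders in a browser, so this approach is best suited to output meant for reading rather than for pixel-exact display.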
