BeautifulSoup and &nbsp;

Question

My code:

from bs4 import BeautifulSoup

html = "<tag>&nbsp;</tag>"
# renderContents() returns bytes, so decode before printing
print(BeautifulSoup(html, "html.parser").renderContents().decode("utf-8"))

Output:

<tag> </tag>

Desired output:

<tag>&nbsp;</tag>

BeautifulSoup seems to be replacing my non-breaking space HTML escape (&nbsp;) with the Unicode character that means the same thing. That character doesn't survive the rest of my system, though: it ends up as an ordinary breaking space and so doesn't do what I want. Is there a way to tell BeautifulSoup not to do that?

Recommended answer

Use encode_contents instead of renderContents (encode and prettify work too). They all accept a formatter argument; pass 'html' as the formatter:

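from bs4 import BeautifulSoup

html = "<tag>&nbsp;</tag>"
# formatter="html" writes characters back out as named HTML entities where one exists
print(BeautifulSoup(html, "html.parser").encode_contents(formatter="html").decode("utf-8"))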

This produces:

<tag>&nbsp;</tag>
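
To see what the formatter argument changes, here is a small sketch that renders the same markup with two formatters side by side. It assumes Python 3 with bs4 installed and the built-in "html.parser"; the helper name render is made up for illustration, and decode_contents is used instead of encode_contents only because it returns a str rather than bytes.

```python
from bs4 import BeautifulSoup


def render(html, formatter):
    """Parse html and serialize it again with the given bs4 formatter.

    `render` is a hypothetical helper for this example, not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    # decode_contents returns a str; encode_contents would return bytes.
    return soup.decode_contents(formatter=formatter)


html = "<tag>&nbsp;</tag>"

# "minimal" (the default) only escapes &, < and >, so the non-breaking
# space comes out as the raw U+00A0 character.
print(repr(render(html, "minimal")))  # '<tag>\xa0</tag>'

# "html" converts characters to named HTML entities where one exists.
print(repr(render(html, "html")))  # '<tag>&nbsp;</tag>'
```

If the downstream system mangles U+00A0, the "html" formatter keeps the entity in the serialized markup, which is the behavior the question asks for.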
