How do I keep whitespace in BeautifulSoup.contents
Most examples I find online show how to remove whitespace, but in my case I need to keep it. I have:
from bs4 import BeautifulSoup

html = """I can flip this whole thing with one hand
<span>D#m</span>
The ringleader man
<span>A#</span> <span>Dm</span> <span>A#</span>
I know~~~~ it's a fact that you'd rather just have some of me instead"""
bs = BeautifulSoup(html, 'html.parser')
# Python 2 code; on Python 3, use str in place of unicode
content = unicode('').join(unicode(node) for node in bs.contents)
I expected this to keep the whitespace (the html variable contains the contents of a pre tag), but it seems to replace multiple spaces with a single space.
How do I keep or get the raw contents from a given BeautifulSoup parser?
The HTML parser seems to keep whitespace only if the content you are parsing is inside a <pre> tag. In my case, the pre tag had been removed. Adding
html = "<pre>" + html + "</pre>"
preserved the whitespace.
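
Putting the pieces together, here is a minimal sketch of the wrap-in-pre approach, assuming Python 3 (so str replaces unicode) and the bs4 package. The helper name inner_markup and the sample string are made up for illustration; the runs of spaces in the sample are an assumption, since the spacing in the question's snippet did not survive.

```python
from bs4 import BeautifulSoup


def inner_markup(fragment):
    """Hypothetical helper: parse a fragment inside <pre> and return its markup."""
    # Wrap the fragment so the parser treats it as preformatted content.
    soup = BeautifulSoup("<pre>" + fragment + "</pre>", "html.parser")
    # Join the children of the <pre> tag, leaving out the wrapper itself.
    return "".join(str(node) for node in soup.pre.contents)


# Illustrative input with runs of spaces between the spans (assumed spacing).
sample = "<span>A#</span>      <span>Dm</span>     <span>A#</span>\nI know~~~~ it's a fact"
result = inner_markup(sample)
print(repr(result))
```

Reading from soup.pre.contents rather than soup.contents keeps the added wrapper out of the result, so the output can be compared directly with the input.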