Suggestions on get_text() in BeautifulSoup

Posted: 2021/12/23 20:41:16. Tags: python, beautifulsoup

I am using BeautifulSoup to parse some content from an HTML page.

I can extract the content I want from the HTML (i.e. the text contained in a span with the class myclass):

result = mycontent.find(attrs={'class':'myclass'})

I get this result:

<span class="myclass">Lorem ipsum<br/>dolor sit amet,<br/>consectetur...</span>

If I try to extract the text using:

result.get_text()

I get:

Lorem ipsumdolor sit amet,consectetur...

As you can see, when the <br> tags are removed there is no spacing left between the pieces of text, so adjacent words are concatenated ("ipsumdolor", "amet,consectetur").

How can I solve this issue?
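
For reference, here is a minimal sketch that reproduces the behaviour and shows two possible workarounds. It assumes the bs4 package with the built-in "html.parser" backend. The names html, soup, joined, spaced and lines are illustrative and are not taken from the original code. The first workaround passes a separator argument to get_text(). The second replaces each <br> with a newline before extracting the text.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; variable names are illustrative.
html = '<span class="myclass">Lorem ipsum<br/>dolor sit amet,<br/>consectetur...</span>'
soup = BeautifulSoup(html, "html.parser")
result = soup.find(attrs={"class": "myclass"})

# Default behaviour: the text nodes are joined with an empty string.
joined = result.get_text()

# Workaround 1: pass a separator that is placed between adjacent text nodes.
spaced = result.get_text(separator=" ")

# Workaround 2: replace every <br> with a newline, then extract the text.
# Note that this modifies the parsed tree in place.
for br in result.find_all("br"):
    br.replace_with("\n")
lines = result.get_text()

print(repr(joined))
print(repr(spaced))
print(repr(lines))
```

The separator approach leaves the parsed tree untouched, so it is probably the simpler choice when a single space is enough. Replacing the <br> tags may be preferable when the original line breaks should be preserved.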