Suggestions on get_text() in BeautifulSoup

Published: 2020/9/20 6:13:17 · Tags: python, beautifulsoup

I am using BeautifulSoup to parse some content from an HTML page.

I can extract the content I want from the HTML (i.e. the text contained in a span with the class myclass).

result = mycontent.find(attrs={'class':'myclass'})

I get this result:

<span class="myclass">Lorem ipsum<br/>dolor sit amet,<br/>consectetur...</span>

If I try to extract the text using:

result.get_text()

I get:

Lorem ipsumdolor sit amet,consectetur...

As you can see, when the <br> tags are removed there is no spacing left between the pieces of content, and adjacent words are concatenated.

How can I solve this issue?
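
One possible approach, offered as a sketch rather than a definitive answer: get_text() accepts a separator argument, which is placed between the text fragments it joins. The snippet below reproduces the situation described above and compares the two calls. The html string and the way mycontent is built from it are assumptions made for illustration, since the original page is not shown.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for the real page: only the span from the question.
html = '<span class="myclass">Lorem ipsum<br/>dolor sit amet,<br/>consectetur...</span>'
mycontent = BeautifulSoup(html, "html.parser")

result = mycontent.find(attrs={"class": "myclass"})

# Default call: text fragments are joined with an empty string.
joined = result.get_text()

# With a separator: a space is placed between the fragments,
# which here are the pieces of text split by the <br/> tags.
spaced = result.get_text(separator=" ")

print(joined)  # Lorem ipsumdolor sit amet,consectetur...
print(spaced)  # Lorem ipsum dolor sit amet, consectetur...
```

Passing separator="\n" instead would keep the line breaks that the <br/> tags represented, if that is closer to what is needed.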