How to use Python Beautiful Soup to get only the level 1 NavigableString text?
Question
I am using Beautiful Soup to get the text from this example HTML code:
....
<div style="s1">
<div style="s2">Here is text 1</div>
<div style="s3">Here is text 2</div>
Here is text 3 and this is what I want.
</div>
....
Text 1 and text 2 are both at level 2, while text 3 is one level up, at level 1. I only want to get text 3, and I tried this:
    for anchor in tbody.findAll('div', style="s1"):
        review = anchor.text
        print(review)
But this code returns all of the text: 1, 2, and 3. How do I get only the first-level text 3?
Recommended answer
Something like this works:
    import bs4

    for anchor in tbody.findAll('div', style="s1"):
        # keep only the direct children that are strings, skipping nested tags
        text = ''.join([x for x in anchor.contents if isinstance(x, bs4.element.NavigableString)])
Just know that you'll also get the line breaks in there, so calling .strip() on the result might be necessary.
For example:
    for anchor in tbody.findAll('div', style="s1"):
        text = ''.join([x for x in anchor.contents if isinstance(x, bs4.element.NavigableString)])
        print([text])
        print([text.strip()])
prints
[u'\n\n\nHere is text 3 and this is what I want.\n']
[u'Here is text 3 and this is what I want.']
(I put them in lists so you could see the newlines.)
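For anyone who wants to try this without the surrounding page, here is a minimal self-contained sketch of the same idea. It assumes Python 3, the bs4 package, and the built-in "html.parser"; the names html, direct_text, and results are made up for illustration and are not part of the original answer.

```python
from bs4 import BeautifulSoup, NavigableString

# The example markup from the question (illustrative stand-in for the real page).
html = """
<div style="s1">
<div style="s2">Here is text 1</div>
<div style="s3">Here is text 2</div>
Here is text 3 and this is what I want.
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def direct_text(tag):
    """Join only the strings that are direct children of `tag`.

    Nested tags (and the text inside them) are skipped, because
    `tag.contents` lists immediate children only.
    """
    parts = [child for child in tag.contents if isinstance(child, NavigableString)]
    # The direct strings include the newlines between child tags, so strip them.
    return "".join(parts).strip()


# style="s1" matches only the outer div; the inner divs have other style values.
results = [direct_text(div) for div in soup.find_all("div", style="s1")]
print(results)
```

Because .text (and get_text()) walks every descendant, it returns text 1, 2, and 3 together; filtering .contents by type is what limits the result to the first level.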