Use BeautifulSoup to extract sibling nodes between two nodes

Published: 2021/4/15 19:08:35 | Tags: python, beautifulsoup

Question

I have a document like this:

<p class="top">I don't want this</p>

<p>I want this</p>
<table>
   <!-- ... -->
</table>

<img ... />

<p> and all that stuff too</p>

<p class="end">But not this and nothing after it</p>

I want to extract everything between the p[class=top] and p[class=end] paragraphs.

Is there a nice way I can do this with BeautifulSoup?

Recommended answer

The node.nextSibling attribute (spelled next_sibling in BeautifulSoup 4) is your solution:

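from bs4 import BeautifulSoup  # BeautifulSoup 4; BeautifulSoup 3 used "from BeautifulSoup import BeautifulSoup"

# html is assumed to hold the document shown in the question
soup = BeautifulSoup(html, 'html.parser')

# Start at the first sibling after p.top (next_sibling is the bs4 name for nextSibling)
node = soup.find('p', {'class': 'top'}).next_sibling
# Stop at p.end, or when the siblings run out and node becomes None
while node is not None:
    # bs4 returns class as a list, so test membership rather than equality
    if getattr(node, 'name', None) == 'p' and 'end' in node.get('class', []):
        break
    # process node here
    node = node.next_sibling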

This complicated condition ensures that you read attributes only from HTML tags and not from string nodes: the getattr(..., 'name', None) check runs first, so .get() is only ever called on a p tag.
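
The same idea can be sketched as a reusable function. This is a minimal sketch, assuming BeautifulSoup 4 (the bs4 package) and Python's built-in html.parser. The helper name siblings_between, its parameters, and the sample table contents are hypothetical and are not part of the original answer. It iterates over next_siblings and uses isinstance(node, Tag) in place of the getattr check.

```python
from bs4 import BeautifulSoup, Tag


def siblings_between(soup, start_class="top", end_class="end"):
    """Collect the siblings after p.<start_class> and before p.<end_class>.

    Hypothetical helper for illustration only.
    """
    start = soup.find("p", class_=start_class)
    if start is None:
        return []
    collected = []
    for node in start.next_siblings:
        # Only Tag objects carry attributes; string nodes (text, whitespace) do not.
        if isinstance(node, Tag) and node.name == "p" and end_class in node.get("class", []):
            break
        collected.append(node)
    return collected


# Sample input modeled on the question; the table contents are made up.
html = """<p class="top">I don't want this</p>
<p>I want this</p>
<table><tr><td>cell</td></tr></table>
<p> and all that stuff too</p>
<p class="end">But not this and nothing after it</p>
<p>Nor this</p>"""

soup = BeautifulSoup(html, "html.parser")
between = siblings_between(soup)
# Keep only the tags, dropping the whitespace strings between them
tag_names = [node.name for node in between if isinstance(node, Tag)]
print(tag_names)
```

The returned list also contains the whitespace strings that sit between the tags, so filter with isinstance(node, Tag) if only elements are wanted. If no p with the end class exists, this sketch collects every remaining sibling rather than raising an error.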
