Finding next occurring tag and its enclosed text with Beautiful Soup

Views: 33 | Published: 2021/12/23 20:47:35 | Tags: python, html, python-2.7, beautifulsoup

Question

I'm trying to parse the text between <blockquote> tags. When I type soup.blockquote.get_text(), I get the result I want for the first blockquote in the HTML file. How do I find the next <blockquote> tag in the file, and the ones after it, in sequence? Maybe I'm just tired and can't find it in the documentation.

Example HTML file:
<html>
<head>header
</head>
<blockquote>I can get this text
</blockquote>
eiaoiefj
<blockquote>trying to capture this next
</blockquote>
do not capture this
<blockquote>
capture this too but separately after "capture this next"
</blockquote>
</html>
The simple Python code:
from bs4 import BeautifulSoup

html_doc = open("example.html")
soup = BeautifulSoup(html_doc)
print(soup.blockquote.get_text())
# how to get the next blockquote???
Solution
Use find_next_sibling. (If the next tag is not a sibling, use find_next instead.)
>>> html = '''
... <html>
... <head>header
... </head>
... <blockquote>blah blah
... </blockquote>
... eiaoiefj
... <blockquote>capture this next
... </blockquote>
... don't capture this
... <blockquote>
... capture this too but separately after "capture this next"
... </blockquote>
... </html>
... '''

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(html)
>>> quote1 = soup.blockquote
>>> quote1.text
u'blah blah\n'
>>> quote2 = quote1.find_next_sibling('blockquote')
>>> quote2.text
u'capture this next\n'
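To step through every blockquote rather than just the second one, the same call can be repeated in a loop until it returns None. The sketch below is only an illustration written for Python 3: the sample HTML, the nested <div>, and the helper names sibling_quotes and document_order_quotes are made up for this example and are not part of the original answer. It also contrasts find_next_sibling, which only sees tags at the same nesting level, with find_next, which follows document order and so also reaches a blockquote nested inside another tag.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document: two sibling blockquotes plus one nested in a <div>.
html = """
<html>
<head><title>header</title></head>
<body>
<blockquote>blah blah</blockquote>
eiaoiefj
<blockquote>capture this next</blockquote>
don't capture this
<div><blockquote>nested quote</blockquote></div>
</body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")


def sibling_quotes(soup):
    """Text of the first blockquote and of each later sibling blockquote."""
    texts = []
    quote = soup.blockquote
    while quote is not None:
        texts.append(quote.get_text(strip=True))
        # Returns None once no further sibling blockquote exists.
        quote = quote.find_next_sibling("blockquote")
    return texts


def document_order_quotes(soup):
    """Text of every blockquote reached by following document order."""
    texts = []
    quote = soup.blockquote
    while quote is not None:
        texts.append(quote.get_text(strip=True))
        # find_next also finds tags that are not siblings, e.g. nested ones.
        quote = quote.find_next("blockquote")
    return texts


sibling_texts = sibling_quotes(soup)
all_texts = document_order_quotes(soup)
print(sibling_texts)  # the nested blockquote is skipped
print(all_texts)      # the nested blockquote is included
```

If all that is needed is every blockquote in the file, soup.find_all('blockquote') returns them in a single call; the loop form is mainly useful when each tag has to be handled relative to the previous one.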