Python HTML Parsing Between Two Tags

Published: 2020/5/25 1:44:14. Tags: python, html, parsing, beautifulsoup


Question

Today I was looking into a small file uploader and got the following response from the API page:

upload_success<br>http://www.filepup.net/files/R6wVq1405781467.html<br>http://www.filepup.net/delete/Jp3q5w1405781467/R6wVq1405781467.html

I need to get the part between the two <br> tags. I am using BeautifulSoup with this code, but it returns None:

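from bs4 import BeautifulSoup

fpbs = BeautifulSoup(filepup.text, 'html.parser')
# A string passed as the second argument is treated as a CSS class,
# so this looks for <br class="br"> and finds nothing.
finallink = fpbs.find('br', 'br')
print(finallink)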

Recommended Answer

No, you cannot search directly for the text between two tags. You can, however, locate the first <br> tag and then take its next sibling:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('upload_success<br>http://www.filepup.net/files/R6wVq1405781467.html<br>http://www.filepup.net/delete/Jp3q5w1405781467/R6wVq1405781467.html', 'html.parser')
>>> soup.find('br')
<br/>
>>> soup.find('br').next_sibling
'http://www.filepup.net/files/R6wVq1405781467.html'
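
The lookup above can be wrapped in a small helper. This is a minimal sketch, assuming the bs4 package is installed. The function name `text_after_first_br` is hypothetical, and the `isinstance` guard is an added precaution for the case where the next sibling is another tag rather than a text node.

```python
from bs4 import BeautifulSoup, NavigableString


def text_after_first_br(html):
    """Return the text node directly after the first <br>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    br = soup.find("br")
    if br is None:
        return None  # the document contains no <br> tag
    sibling = br.next_sibling
    # next_sibling may be a Tag (e.g. another <br>) or None; accept text only
    if isinstance(sibling, NavigableString):
        return str(sibling).strip()
    return None


# Sample response taken from the question
response = (
    "upload_success<br>http://www.filepup.net/files/R6wVq1405781467.html"
    "<br>http://www.filepup.net/delete/Jp3q5w1405781467/R6wVq1405781467.html"
)
link = text_after_first_br(response)
print(link)
```

Returning None instead of raising keeps the helper safe to call on responses that do not have the expected shape, such as an error message without any <br> tags.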

You could also use a CSS selector to find an adjacent sibling and then grab its preceding sibling. To CSS, only tags are siblings, but to BeautifulSoup the text nodes count too.

The adjacent sibling combinator is a + between two CSS selectors, and it selects the second of the two; br + br selects any br tag that directly follows another br tag.
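
To make the combinator concrete, here is a small sketch, assuming bs4 is installed with its CSS selector support. The sample markup is made up for illustration. It suggests that `br + br` matches only the `<br>` tags preceded by another `<br>` element, even when text sits between them.

```python
from bs4 import BeautifulSoup

# Made-up markup: three <br> tags separated by text
html = "<p>one<br>two<br>three<br>four</p>"
soup = BeautifulSoup(html, "html.parser")

all_brs = soup.select("br")          # every <br> tag
later_brs = soup.select("br + br")   # only <br> tags that follow another <br>

print(len(all_brs), len(later_brs))

# For BeautifulSoup, the node just before each matched <br> is the text
# that sits between it and the preceding <br>.
between = [str(br.previous_sibling) for br in later_brs]
print(between)
```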

Combined with a parent node (say, one with a specific id or class), this can be a very powerful combination:

>>> soup = BeautifulSoup('''\
... <div id="div1">
...     some text
...     <br/>
...     some target text
...     <br/>
...     foo bar
... </div>
... <div id="div2">
...     some more text
...     <br/>
...     select me, ooh, pick me!
...     <br/>
...     fooed the bar!
... </div>
... ''', 'html.parser')
>>> soup.select('#div2 br + br')[0]
<br/>
>>> soup.select('#div2 br + br')[0].previous_sibling
'\n    select me, ooh, pick me!\n    '

This picks out a very specific text node between two <br> tags, inside a specific <div> tag.
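
The same idea can be packaged as a reusable function. The sketch below is hypothetical: the name `text_between_brs` and its `container` parameter are not from the original answer, and it assumes the target text is a plain text node sitting directly between two `<br>` tags inside the selected container.

```python
from bs4 import BeautifulSoup, NavigableString


def text_between_brs(html, container):
    """Collect stripped text nodes that sit between two <br> tags
    inside the elements matched by the CSS selector `container`."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for br in soup.select(f"{container} br + br"):
        node = br.previous_sibling
        # Skip back-to-back <br> tags and whitespace-only text nodes
        if isinstance(node, NavigableString) and node.strip():
            results.append(node.strip())
    return results


# Compact version of the markup used in the answer
sample = (
    '<div id="div1">some text<br/>some target text<br/>foo bar</div>'
    '<div id="div2">some more text<br/>select me, ooh, pick me!<br/>'
    'fooed the bar!</div>'
)
picked = text_between_brs(sample, "#div2")
print(picked)
```

Returning a list rather than a single string leaves room for containers with more than two `<br>` tags, where several text nodes qualify.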
