BeautifulSoup 4, findNext() function
Question
I'm playing with BeautifulSoup 4 and I have this HTML code:
</tr>
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>
I want to match both values in the <td> tags that follow the label, so here 14 and 7.
I tried:
giraffe = soup.find(text='Giraffe').findNext('td').text
but this only matches 14. How can I match both values with this function?
Recommended answer
Use find_all instead of findNext:
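import bs4 as bs

content = '''\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>'''
# Name the parser explicitly so the result does not depend on what is installed.
soup = bs.BeautifulSoup(content, 'html.parser')
# Find the <td> holding 'Giraffe', go up to its <tr>, then list every <td> in that row.
for td in soup.find('td', text='Giraffe').parent.find_all('td'):
    print(td.text)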
yields
Giraffe
14
7
Or, you could use find_next_siblings (also known as fetchNextSiblings):
for td in soup.find(text='Giraffe').parent.find_next_siblings():
    print(td.text)
yields
14
7
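As a rough illustration of why the sibling search fits this question, the sketch below compares find_next_siblings with find_all_next, the plural counterpart of findNext. The two-row table and the "Zebra" row are invented for the example, and it assumes a bs4 version (4.4 or later) that accepts string= as the newer spelling of text=.

```python
from bs4 import BeautifulSoup

# Hypothetical two-row table; only the Giraffe row comes from the question.
html = """<table>
<tr><td id="freistoesse">Giraffe</td><td>14</td><td>7</td></tr>
<tr><td>Zebra</td><td>3</td><td>9</td></tr>
</table>"""

soup = BeautifulSoup(html, "html.parser")
label = soup.find("td", string="Giraffe")

# Siblings share the same parent <tr>, so the search stops at the end of the row.
same_row = [td.text for td in label.find_next_siblings("td")]

# find_all_next walks the rest of the document, so it runs into the next row.
all_following = [td.text for td in label.find_all_next("td")]

print(same_row)       # ['14', '7']
print(all_following)  # ['14', '7', 'Zebra', '3', '9']
```

With a single row both calls return the same two cells; the difference only shows once more rows follow the label.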
Explanation:
Note that soup.find(text='Giraffe') returns a NavigableString.
In [30]: soup.find(text='Giraffe')
Out[30]: u'Giraffe'
To get the associated td tag, use
In [31]: soup.find('td', text='Giraffe')
Out[31]: <td id="freistoesse">Giraffe</td>
or
In [32]: soup.find(text='Giraffe').parent
Out[32]: <td id="freistoesse">Giraffe</td>
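The distinction between the text node and its enclosing tag can be checked directly. This is a minimal sketch, assuming the html.parser backend and a bs4 version that accepts string= (the newer name for text=); the variable names are illustrative only.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

html = '<tr><td id="freistoesse">Giraffe</td><td>14</td><td>7</td></tr>'
soup = BeautifulSoup(html, "html.parser")

# Searching by string alone returns the text node itself.
text_node = soup.find(string="Giraffe")

# Adding a tag name returns the <td> element whose string matches.
cell = soup.find("td", string="Giraffe")

print(type(text_node).__name__)  # NavigableString
print(type(cell).__name__)       # Tag
print(text_node.parent is cell)  # True
```

The text node has no tag siblings inside its cell, which is why the sibling search has to start from the td tag rather than from the string.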
Once you have the td tag, you could use find_next_siblings:
In [35]: soup.find(text='Giraffe').parent.find_next_siblings()
Out[35]: [<td>14</td>, <td>7</td>]
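Putting the pieces together, one possible way to wrap the lookup is a small helper that returns the numbers in the cells after a label. The function name values_after_label is hypothetical, and the sketch assumes every following cell holds an integer, as in the question's example.

```python
from bs4 import BeautifulSoup


def values_after_label(soup, label):
    """Return the integers in the <td> cells that follow the labelled cell."""
    cell = soup.find("td", string=label)
    if cell is None:
        # No cell with that label: return an empty list instead of raising.
        return []
    # Assumption: each sibling cell contains a plain integer.
    return [int(td.get_text(strip=True)) for td in cell.find_next_siblings("td")]


html = """<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>"""
soup = BeautifulSoup(html, "html.parser")

print(values_after_label(soup, "Giraffe"))  # [14, 7]
print(values_after_label(soup, "Okapi"))    # []
```

Passing "td" to find_next_siblings is optional for this markup, but it keeps the helper from picking up other sibling tags if the row ever contains them.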
PS. BeautifulSoup has added method names that use underscores instead of CamelCase. They do the same thing, but conform to the PEP8 style guide recommendations. Thus, prefer find_next_siblings over fetchNextSiblings.