Does BeautifulSoup find_all() preserve tag order?


Question

I want to use BeautifulSoup to parse some HTML. I have a table with several rows. I'm trying to find a row that meets certain conditions (specific attribute values) and use the index of that row later in my code.

The question is: does find_all() preserve the order of my rows in the result set it returns?

I didn't find this in the docs, and Googling only led me to this answer:

"BeautifulSoup tags don't track their order in the page, no."

But its author does not say where that information comes from.

I'd be happy with an answer, but even happier with a pointer to some documentation that explains this.

dstudeba pointed me toward this "workaround" using find_next_sibling():

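from bs4 import BeautifulSoup

# Parse the table from a local file; the context manager closes the file afterwards.
with open('./mytable.html') as f:
    soup = BeautifulSoup(f, 'html.parser')

row_attrs = {'class': 'something', 'someattr': 'somevalue'}
# Start at the first matching row, then step through its following siblings.
row = soup.find('tr', row_attrs)
myvalues = []
while row is not None:
    cell = row.find('td', {'someattr': 'cellspecificvalue'})
    if cell is not None:  # skip rows that lack the expected cell
        myvalues.append(cell.get_text())
    row = row.find_next_sibling('tr', row_attrs)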

This gets me the cell contents I need, in the order they appear in my HTML file.
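For comparison, here is a minimal sketch that runs the same sibling walk on a small inline table and checks it against a single find_all() call. The markup and attribute values are made up for illustration and stand in for mytable.html; this only shows the behaviour on one document and is not a statement from the docs.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for mytable.html.
html = """
<table>
  <tr class="something" someattr="somevalue"><td someattr="cellspecificvalue">first</td></tr>
  <tr class="other"><td>ignored</td></tr>
  <tr class="something" someattr="somevalue"><td someattr="cellspecificvalue">second</td></tr>
  <tr class="something" someattr="somevalue"><td someattr="cellspecificvalue">third</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")
row_attrs = {"class": "something", "someattr": "somevalue"}
cell_attrs = {"someattr": "cellspecificvalue"}

# Sibling walk: start at the first matching row and step forward.
walked = []
row = soup.find("tr", row_attrs)
while row is not None:
    walked.append(row.find("td", cell_attrs).get_text())
    row = row.find_next_sibling("tr", row_attrs)

# The same rows collected with one find_all() call.
found = [r.find("td", cell_attrs).get_text() for r in soup.find_all("tr", row_attrs)]

print(walked)
print(found)
```

Note that find_next_sibling() only looks at siblings of the current row, while find_all() also descends into nested tags, so the two can differ on documents with nested tables.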

However, I'd still like to know where in the BeautifulSoup docs I can find out whether find_all() preserves order. That is why I'm not accepting dstudeba's answer. (My upvote doesn't show; I don't have enough rep yet :P)

Recommended answer

In my experience, find_all() does preserve order: results come back in the order the tags appear in the document. If you want to be sure, you can use the find_all_next() method, which, like find_next(), walks forward through the elements that follow the starting tag in document order. Here is a link to the documentation.
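One way to check this on a concrete document, and to get the row index the question asks for, is sketched below. The table, the id values and the data-state attribute are assumptions made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical table; the id and data-state attributes are illustrative only.
html = """
<table>
  <tr id="r0"><td>a</td></tr>
  <tr id="r1" data-state="target"><td>b</td></tr>
  <tr id="r2"><td>c</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Collect all rows and record the order in which they come back.
rows = soup.find_all("tr")
ids_in_order = [row["id"] for row in rows]

# Index of the first row that meets the condition.
target_index = next(
    i for i, row in enumerate(rows) if row.get("data-state") == "target"
)

# find_all_next() returns matches that come after the starting tag.
following = [row["id"] for row in rows[0].find_all_next("tr")]

print(ids_in_order, target_index, following)
```

Because the index comes from enumerate() over the find_all() result, it is only meaningful if that result follows document order, which is what this sketch lets you confirm for your own input.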
