Python beautifulsoup level 1 only text

Views: 342 Published: 2016/8/5 19:16:51 Tags: python beautifulsoup

Question

I've looked at the other BeautifulSoup "get same level" questions. Mine seems to be slightly different.

Here is the site: http://engine.data.cnzz.com/main.php?s=engine&uv=&st=2014-03-01&et=2014-03-31

I'm trying to get the table on the right. Notice how the first row of the table expands into a detailed breakdown of that data. I don't want that breakdown; I only want the top-level data. The other rows can be expanded too, although they are not expanded in this case, so simply looping and skipping tr[2] might not work. I've tried this:

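import re
import requests
from bs4 import BeautifulSoup
r = requests.get(page)
r.encoding = 'gb2312'
soup = BeautifulSoup(r.text, 'html.parser')
# Collects every tr with a list* class at any nesting depth, nested ones included
table = soup.find('div', class_='right1').findAll('tr', {"class": re.compile('list.*')})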

but the result still contains nested list* rows from other levels. How do I get only the first level?

Recommended answer

Limit your search to direct children of the table element by setting the recursive argument to False:

table = soup.find('div', class_='right1').table
rows = table.find_all('tr', {"class" : re.compile('list.*')}, recursive=False)
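
To see the effect of recursive=False without hitting the live site, here is a minimal sketch on made-up markup. The HTML string, the "detail" class and the cell texts are assumptions for illustration only; the real page's structure may differ.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: one top-level row is
# followed by a row holding a nested table whose rows also use a list* class.
html = """
<div class="right1">
<table>
<tr class="list1"><td>top A</td></tr>
<tr class="detail"><td>
<table>
<tr class="list2"><td>nested A1</td></tr>
<tr class="list2"><td>nested A2</td></tr>
</table>
</td></tr>
<tr class="list1"><td>top B</td></tr>
</table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("div", class_="right1").table  # first (outer) table in the div
pattern = re.compile(r"list.*")

# Default search walks all descendants, so nested rows are included.
all_rows = table.find_all("tr", class_=pattern)

# recursive=False only inspects the direct children of the outer table.
top_rows = table.find_all("tr", class_=pattern, recursive=False)

print(len(all_rows))                             # 4
print([row.td.get_text() for row in top_rows])   # ['top A', 'top B']
```

One caveat: this only works when the tr elements really are direct children of the table. If the page source wraps rows in a tbody, or the parser inserts one (html5lib does), the same call would need to start from the tbody element instead.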
