How to get tbody from a table with Python Beautiful Soup?

Views: 134 | Posted: 2016/8/5 18:58:46 | Tags: python, web-scraping, beautifulsoup

Problem description

I'm trying to scrape the Year and Winners (first and second columns) from the "List of finals matches" table (the second table) at http://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_finals. I'm using the code below:

import urllib2
from BeautifulSoup import BeautifulSoup

url = "http://www.samhsa.gov/data/NSDUH/2k10State/NSDUHsae2010/NSDUHsaeAppC2010.htm"
soup = BeautifulSoup(urllib2.urlopen(url).read())
soup.findAll('table')[0].tbody.findAll('tr')
for row in soup.findAll('table')[0].tbody.findAll('tr'):
    first_column = row.findAll('th')[0].contents
    third_column = row.findAll('td')[2].contents
    print first_column, third_column

With the above code, I was able to get the first and third columns just fine. But when I use the same code with http://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_finals, it cannot find tbody as an element of the table, even though I can see the tbody when I inspect the element in the browser.

url = "http://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_finals"
soup = BeautifulSoup(urllib2.urlopen(url).read())

print soup.findAll('table')[2]

soup.findAll('table')[2].tbody.findAll('tr')
for row in soup.findAll('table')[0].tbody.findAll('tr'):
    first_column = row.findAll('th')[0].contents
    third_column = row.findAll('td')[2].contents
    print first_column, third_column

Here's the error I got:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-150-fedd08c6da16> in <module>()
      7 # print soup.findAll('table')[2]
      8
----> 9 soup.findAll('table')[2].tbody.findAll('tr')
     10 for row in soup.findAll('table')[0].tbody.findAll('tr'):
     11     first_column = row.findAll('th')[0].contents

AttributeError: 'NoneType' object has no attribute 'findAll'

Solution

If you inspect the page with the browser's inspect tool, the browser inserts tbody tags into the tree it shows you.

The page source may or may not contain them. I suggest looking at the source view if you really want to know.

Either way, you do not need to traverse to the tbody. Searching the table for rows directly should work:

soup.findAll('table')[0].findAll('tr')
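
To illustrate the point, here is a minimal sketch. It makes several assumptions that are not in the original post: it uses Python 3 with the newer bs4 package (where findAll is spelled find_all) instead of Python 2 with BeautifulSoup 3, it parses a small made-up HTML string rather than fetching the Wikipedia page, and it uses the built-in html.parser, which keeps the markup as written and does not add a tbody. The column choice (first th, first td) is also just for illustration.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the page source: note there is no <tbody> tag.
HTML = """
<table>
  <tr><th>1930</th><td>Uruguay</td><td>4-2</td></tr>
  <tr><th>1934</th><td>Italy</td><td>2-1</td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find_all("table")[0]

# With no tbody in the source, table.tbody is None, so calling
# table.tbody.find_all(...) would raise the AttributeError from the question.
has_tbody = table.tbody is not None

# Searching the table for rows directly works with or without a tbody.
rows = []
for row in table.find_all("tr"):
    year = row.find_all("th")[0].get_text(strip=True)
    winner = row.find_all("td")[0].get_text(strip=True)
    rows.append((year, winner))

print(has_tbody)
print(rows)
```

Because find_all searches all descendants, the same loop keeps working if the source does contain a tbody, so the code does not depend on which form the server sends.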
