Unable to fetch Table from BeautifulSoup


Problem description

from BeautifulSoup import BeautifulSoup
import urllib2

url = 'http://www.data.jma.go.jp/obd/stats/etrn/view/monthly_s3_en.php?block_no=47401&view=1'
html = urllib2.urlopen(url).read()        
soup = BeautifulSoup(html)
table = soup.find('table')
print table

This does not produce the expected table.

I want to grab the following table:

Recommended answer

First off, use bs4; BeautifulSoup 3 is no longer maintained. Also, the table you want has the class *data2_s*. Calling find("table") just returns the first table on the page, which is not the one you want:

from bs4 import BeautifulSoup
import urllib2

url = 'http://www.data.jma.go.jp/obd/stats/etrn/view/monthly_s3_en.php?block_no=47401&view=1'
html = urllib2.urlopen(url).read()
soup = BeautifulSoup(html)
table = soup.select_one("table.data2_s") # or table = soup.find("table", class_="data2_s")
print table

Which gives you:

<table class="data2_s"><caption class="m">WAKKANAI   WMO Station ID:47401 Lat 45<sup>o</sup>24.9'N  Lon 141<sup>o</sup>40.7'E</caption><tr><th scope="col">Year</th><th scope="col">Jan</th><th scope="col">Feb</th><th scope="col">Mar</th><th scope="col">Apr</th><th scope="col">May</th><th scope="col">Jun</th><th scope="col">Jul</th><th scope="col">Aug</th><th scope="col">Sep</th><th scope="col">Oct</th><th scope="col">Nov</th><th scope="col">Dec</th><th scope="col">Annual</th></tr><tr class="mtx" style="text-align:right;"><td style="text-align:center">1938</td><td class="data_0_0_0_0">-5.2</td><td class="data_0_0_0_0">-4.9</td><td class="data_0_0_0_0">-0.6</td><td class="data_0_0_0_0">4.7</td><td class="data_0_0_0_0">9.5</td><td class="data_0_0_0_0">11.6</td><td class="data_0_0_0_0">17.9</td><td class="data_0_0_0_0">22.2</td><td class="data_0_0_0_0">16.5</td><td class="data_0_0_0_0">10.7</td><td class="data_0_0_0_0">3.3</td><td class="data_0_0_0_0">-4.7</td><td class="data_0_0_0_0">6.8</td></tr>
<tr class="mtx" style="text-align:right;"><td style="text-align:center">1939</td><td class="data_0_0_0_0">-7.5</td><td class="data_0_0_0_0">-6.6</td><td class="data_0_0_0_0">-1.4</td><td class="data_0_0_0_0">4.0</td><td class="data_0_0_0_0">7.5</td><td class="data_0_0_0_0">13.0</td><td class="data_0_0_0_0">17.4</td><td class="data_0_0_0_0">20.0</td><td class="data_0_0_0_0">17.4</td><td class="data_0_0_0_0">9.7</td><td class="data_0_0_0_0">3.0</td><td class="data_0_0_0_0">-2.5</td><td class="data_0_0_0_0">6.2</td></tr>
<tr class="mtx" style="text-align:right;"><td style="text-align:center">1940</td><td class="data_0_0_0_0">-6.0</td><td class="data_0_0_0_0">-5.7</td><td class="data_0_0_0_0">-0.5</td><td class="data_0_0_0_0">3.5</td><td class="data_0_0_0_0">8.5</td><td class="data_0_0_0_0">11.0</td><td class="data_0_0_0_0">16.6</td><td class="data_0_0_0_0">19.7</td><td class="data_0_0_0_0">15.6</td><td class="data_0_0_0_0">10.4</td><td class="data_0_0_0_0">3.7</td><td class="data_0_0_0_0">-1.0</td><td class="data_0_0_0_0">6.3</td></tr>
<tr class="mtx" style="text-align:right;"><td style="text-align:center">1941</td><td class="data_0_0_0_0">-6.5</td><td class="data_0_0_0_0">-5.8</td><td class="data_0_0_0_0">-2.6</td><td class="data_0_0_0_0">3.6</td><td class="data_0_0_0_0">8.1</td><td class="data_0_0_0_0">11.4</td><td class="data_0_0_0_0">12.7</td><td class="data_0_0_0_0">16.5</td><td class="data_0_0_0_0">16.0</td><td class="data_0_0_0_0">10.0</td><td class="data_0_0_0_0">4.0</td><td class="data_0_0_0_0">-2.9</td><td class="data_0_0_0_0">5.4</td></tr>
<tr class="mtx" style="text-align:right;"><td style="text-align:center">1942</td><td class="data_0_0_0_0">-7.8</td><td class="data_0_0_0_0">-8.2</td><td class="data_0_0_0_0">-0.8</td><td class="data_0_0_0_0">3.5</td><td class="data_0_0_0_0">7.1</td><td class="data_0_0_0_0">12.0</td><td class="data_0_0_0_0">17.4</td><td class="data_0_0_0_0">18.4</td><td class="data_0_0_0_0">15.7</td><td class="data_0_0_0_0">10.5</td><td class="data_0_0_0_0">2.5</td><td class="data_0_0_0_0">-2.9</td><td class="data_0_0_0_0">5.6</td></tr>
etc...................................
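
To make the difference between find("table") and a class-based lookup easier to see without hitting the network, here is a minimal Python 3 sketch. The inline markup, the `layout` id, and the helper name `table_rows` are assumptions made up for illustration; only the class name *data2_s* and the sample values come from the page above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: a layout table comes
# first, followed by the data table that carries the class "data2_s".
SAMPLE_HTML = """
<table id="layout"><tr><td>navigation</td></tr></table>
<table class="data2_s">
  <tr><th>Year</th><th>Jan</th><th>Feb</th></tr>
  <tr class="mtx"><td>1938</td><td>-5.2</td><td>-4.9</td></tr>
  <tr class="mtx"><td>1939</td><td>-7.5</td><td>-6.6</td></tr>
</table>
"""


def table_rows(html, css_class="data2_s"):
    """Return the cell text of each row in the first table with css_class."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_=css_class)
    if table is None:
        # No table with that class: return an empty list instead of failing.
        return []
    return [
        [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        for row in table.find_all("tr")
    ]


# A bare find("table") returns the layout table, not the data table.
first_table = BeautifulSoup(SAMPLE_HTML, "html.parser").find("table")

# The class-based lookup reaches the data table and yields its rows.
rows = table_rows(SAMPLE_HTML)
```

With the real page you would pass the downloaded HTML to `table_rows` instead of `SAMPLE_HTML`; whether the class name is still *data2_s* depends on the site's current markup.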
