AttributeError when extracting data from a URL in Python


I am using the code below to try to extract the data from the table at this URL. However, I get the following error message:

Error: `AttributeError: 'NoneType' object has no attribute 'find'` in
the line `data = iter(soup.find("table", {"class": "tablestats"}).find("th", {"class": "header"}).find_all_next("tr"))`
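This error means one of the chained `find` calls returned `None` before the next one ran: BeautifulSoup's `find` returns `None` when nothing matches, so either no `table` with class `tablestats` or no `th` with class `header` exists on the fetched page. A minimal stdlib-only illustration of the failure mode:

```python
# soup.find(...) returns None when no element matches; calling .find on
# that None is what raises the AttributeError quoted above.
result = None  # stands in for a failed soup.find("table", {"class": "tablestats"})
try:
    result.find("th", {"class": "header"})
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'find'
```

Checking each lookup for `None` before chaining the next call pinpoints which selector failed.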

My code is as follows:

    from bs4 import BeautifulSoup
    import requests

    r = requests.get(
        "http://www.federalreserve.gov/econresdata/researchdata/feds200628_1.html")
    soup = BeautifulSoup(r.content)

    data = iter(soup.find("table", {"class": "tablestats"}).find("th", {"class": "header"}).find_all_next("tr"))

    headers = (next(data).text, next(data).text)
    table_items = [(a.text, b.text) for ele in data for a, b in [ele.find_all("td")]]

    for a, b in table_items:
        print(u"Date={}, Maturity={}".format(a, b if b.strip() else "null"))

Thank You

Solution

    from bs4 import BeautifulSoup
    import requests

    r = requests.get(
        "http://www.federalreserve.gov/econresdata/researchdata/feds200628_1.html")
    soup = BeautifulSoup(r.content)

    # column headers
    h = soup.find_all("th", scope="col")
    # get every tr tag after the last header row
    final = [[t.th.text] + [ele.text for ele in t.find_all("td")] for t in h[-1].find_all_next("tr")]
    headers = [th.text for th in h]

The `final` output list holds all the rows, one list per row:

[['2015-06-05', '4.82039691', '-4.66420959', '-4.18904598', 
'-3.94541434', '1.1477', '2.9361', '3.3588', '0.6943', '1.5881',
 '2.3034', '2.7677', '3.0363', '3.1801', '3.2537', '3.2930', '3.3190', 
'3.3431', '3.3707', '3.4038', '3.4428', '3.4871', '3.5357', '3.5876',
 '3.6419', '3.6975', '3.7538', '3.8100', '3.8656', '3.9202', '3.9734',
 '4.0250', '4.0748', '4.1225', '4.1682', '4.2117', '4.2530', '4.2921',
 '0.3489', '0.7464', '1.1502', '1.4949', '1.7700', '1.9841', '2.1500', 
 '2.2800', '2.3837', '2.4685', '2.5396', '2.6006', '2.6544', '2.7027', 
 '2.7469', '2.7878', '2.8260', '2.8621', '2.8964', '2.9291', '2.9603',
 '2.9901', '3.0187', '3.0461', '3.0724', '3.0976', '3.1217', '3.1448',
 '3.1669', '3.1881', '0.3487', '0.7469', '1.1536', '1.5039', '1.7862',      
 '2.0078', '2.1811', '2.3179', '2.4277', '2.5181', '2.5943', '2.6603', 
 '2.7190', '2.7722', '2.8215', '2.8677', '2.9117', '2.9538', '2.9944', 
 '3.0338', '3.0721', '3.1094', '3.1458', '3.1814', '3.2161', '3.2501',
 '3.2832', '3.3156', '3.3472', '3.3781', '1.40431658', '9.48795888'], 
 ['2015-06-04', '4.64953424', '-4.52780982', '-3.98051369', 
 ......................................

The headers:

['BETA0', 'BETA1', 'BETA2', 'BETA3', 'SVEN1F01', 'SVEN1F04', 'SVEN1F09', 'SVENF01', 'SVENF02', 'SVENF03', 'SVENF04', 'SVENF05', 'SVENF06', 'SVENF07', 'SVENF08', 'SVENF09', 'SVENF10', 'SVENF11', 'SVENF12', 'SVENF13', 'SVENF14', 'SVENF15', 'SVENF16', 'SVENF17', 'SVENF18', 'SVENF19', 'SVENF20', 'SVENF21', 'SVENF22', 'SVENF23', 'SVENF24', 'SVENF25', 'SVENF26', 'SVENF27', 'SVENF28', 'SVENF29', 'SVENF30', 'SVENPY01', 'SVENPY02', 'SVENPY03', 'SVENPY04', 'SVENPY05', 'SVENPY06', 'SVENPY07', 'SVENPY08', 'SVENPY09', 'SVENPY10', 'SVENPY11', 'SVENPY12', 'SVENPY13', 'SVENPY14', 'SVENPY15', 'SVENPY16', 'SVENPY17', 'SVENPY18', 'SVENPY19', 'SVENPY20', 'SVENPY21', 'SVENPY22', 'SVENPY23', 'SVENPY24', 'SVENPY25', 'SVENPY26', 'SVENPY27', 'SVENPY28', 'SVENPY29', 'SVENPY30', 'SVENY01', 'SVENY02', 'SVENY03', 'SVENY04', 'SVENY05', 'SVENY06', 'SVENY07', 'SVENY08', 'SVENY09', 'SVENY10', 'SVENY11', 'SVENY12', 'SVENY13', 'SVENY14', 'SVENY15', 'SVENY16', 'SVENY17', 'SVENY18', 'SVENY19', 'SVENY20', 'SVENY21', 'SVENY22', 'SVENY23', 'SVENY24', 'SVENY25', 'SVENY26', 'SVENY27', 'SVENY28', 'SVENY29', 'SVENY30', 'TAU1', 'TAU2']
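Note that each row in `final` begins with the date while `headers` starts at `BETA0`, so a "Date" label has to be prepended when combining them. As a sketch (using hypothetical, trimmed-down values standing in for the full scraped lists), the result could be written out with the stdlib `csv` module:

```python
import csv

# Hypothetical trimmed-down stand-ins for the scraped `headers` and `final` lists.
headers = ["BETA0", "BETA1", "TAU1"]
final = [["2015-06-05", "4.82039691", "-4.66420959", "1.40431658"]]

with open("yields.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # each row begins with the date, so prepend a matching "Date" column header
    writer.writerow(["Date"] + headers)
    writer.writerows(final)
```

The same zip-style pairing (`dict(zip(["Date"] + headers, row))`) works if per-row dictionaries are more convenient than a CSV file.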
