AttributeError when extracting data from a URL in Python
Problem description
I am using the code below to try to extract the data from the table at this URL. However, I get the following error message:
Error: `AttributeError: 'NoneType' object has no attribute 'find'` in the line `data = iter(soup.find("table", {"class": "tablestats"}).find("th", {"class": "header"}).find_all_next("tr"))`
My code is as follows:
from bs4 import BeautifulSoup
import requests
r = requests.get(
"http://www.federalreserve.gov/econresdata/researchdata/feds200628_1.html")
soup = BeautifulSoup(r.content)
data = iter(soup.find("table", {"class": "tablestats"}).find("th", {"class": "header"}).find_all_next("tr"))
headers = (next(data).text, next(data).text)
table_items = [(a.text, b.text) for ele in data for a, b in [ele.find_all("td")]]
for a, b in table_items:
    print(u"Date={}, Maturity={}".format(a, b if b.strip() else "null"))
Thank You
Solution
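`soup.find` returns `None` when nothing matches. The error names the attribute 'find', so it is the first lookup, `soup.find("table", {"class": "tablestats"})`, that returned `None`: no table with that class was found in the parsed page. Select the column header cells by their `scope="col"` attribute instead:
from bs4 import BeautifulSoup
import requests
r = requests.get(
    "http://www.federalreserve.gov/econresdata/researchdata/feds200628_1.html")
# name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(r.content, "html.parser")
# column headers
h = soup.find_all("th", scope="col")
# get all the tr tags after the headers
final = [[t.th.text] + [ele.text for ele in t.find_all("td")] for t in h[-1].find_all_next("tr")]
headers = [th.text for th in h]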
The final list holds every row as its own list:
[['2015-06-05', '4.82039691', '-4.66420959', '-4.18904598',
'-3.94541434', '1.1477', '2.9361', '3.3588', '0.6943', '1.5881',
'2.3034', '2.7677', '3.0363', '3.1801', '3.2537', '3.2930', '3.3190',
'3.3431', '3.3707', '3.4038', '3.4428', '3.4871', '3.5357', '3.5876',
'3.6419', '3.6975', '3.7538', '3.8100', '3.8656', '3.9202', '3.9734',
'4.0250', '4.0748', '4.1225', '4.1682', '4.2117', '4.2530', '4.2921',
'0.3489', '0.7464', '1.1502', '1.4949', '1.7700', '1.9841', '2.1500',
'2.2800', '2.3837', '2.4685', '2.5396', '2.6006', '2.6544', '2.7027',
'2.7469', '2.7878', '2.8260', '2.8621', '2.8964', '2.9291', '2.9603',
'2.9901', '3.0187', '3.0461', '3.0724', '3.0976', '3.1217', '3.1448',
'3.1669', '3.1881', '0.3487', '0.7469', '1.1536', '1.5039', '1.7862',
'2.0078', '2.1811', '2.3179', '2.4277', '2.5181', '2.5943', '2.6603',
'2.7190', '2.7722', '2.8215', '2.8677', '2.9117', '2.9538', '2.9944',
'3.0338', '3.0721', '3.1094', '3.1458', '3.1814', '3.2161', '3.2501',
'3.2832', '3.3156', '3.3472', '3.3781', '1.40431658', '9.48795888'],
['2015-06-04', '4.64953424', '-4.52780982', '-3.98051369',
......................................
The headers:
['BETA0', 'BETA1', 'BETA2', 'BETA3', 'SVEN1F01', 'SVEN1F04', 'SVEN1F09', 'SVENF01', 'SVENF02', 'SVENF03', 'SVENF04', 'SVENF05', 'SVENF06', 'SVENF07', 'SVENF08', 'SVENF09', 'SVENF10', 'SVENF11', 'SVENF12', 'SVENF13', 'SVENF14', 'SVENF15', 'SVENF16', 'SVENF17', 'SVENF18', 'SVENF19', 'SVENF20', 'SVENF21', 'SVENF22', 'SVENF23', 'SVENF24', 'SVENF25', 'SVENF26', 'SVENF27', 'SVENF28', 'SVENF29', 'SVENF30', 'SVENPY01', 'SVENPY02', 'SVENPY03', 'SVENPY04', 'SVENPY05', 'SVENPY06', 'SVENPY07', 'SVENPY08', 'SVENPY09', 'SVENPY10', 'SVENPY11', 'SVENPY12', 'SVENPY13', 'SVENPY14', 'SVENPY15', 'SVENPY16', 'SVENPY17', 'SVENPY18', 'SVENPY19', 'SVENPY20', 'SVENPY21', 'SVENPY22', 'SVENPY23', 'SVENPY24', 'SVENPY25', 'SVENPY26', 'SVENPY27', 'SVENPY28', 'SVENPY29', 'SVENPY30', 'SVENY01', 'SVENY02', 'SVENY03', 'SVENY04', 'SVENY05', 'SVENY06', 'SVENY07', 'SVENY08', 'SVENY09', 'SVENY10', 'SVENY11', 'SVENY12', 'SVENY13', 'SVENY14', 'SVENY15', 'SVENY16', 'SVENY17', 'SVENY18', 'SVENY19', 'SVENY20', 'SVENY21', 'SVENY22', 'SVENY23', 'SVENY24', 'SVENY25', 'SVENY26', 'SVENY27', 'SVENY28', 'SVENY29', 'SVENY30', 'TAU1', 'TAU2']
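The same failure can be reproduced and guarded against without touching the network. The sketch below is only an illustration: `SAMPLE_HTML` is a made-up two-row table standing in for the real page, and `extract_rows` is a hypothetical helper name, not part of BeautifulSoup. It shows `find` returning `None` for a class that is not present, then applies the `scope="col"` approach with explicit checks so that a missing element yields empty lists instead of an `AttributeError`.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the real page: note there is no class="tablestats".
SAMPLE_HTML = """
<table>
  <tr><th scope="col">BETA0</th><th scope="col">BETA1</th></tr>
  <tr><th scope="row">2015-06-05</th><td>4.82039691</td><td>-4.66420959</td></tr>
  <tr><th scope="row">2015-06-04</th><td>4.64953424</td><td>-4.52780982</td></tr>
</table>
"""


def extract_rows(html):
    """Return (headers, rows); both are empty lists if no header cells exist."""
    soup = BeautifulSoup(html, "html.parser")
    header_cells = soup.find_all("th", scope="col")
    if not header_cells:
        # find_all returns an empty result rather than None, so test for that.
        return [], []
    headers = [th.get_text(strip=True) for th in header_cells]
    rows = []
    # Walk every <tr> that comes after the last column header cell.
    for tr in header_cells[-1].find_all_next("tr"):
        if tr.th is None:
            # Skip rows without a leading <th>; tr.th is None in that case.
            continue
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        rows.append([tr.th.get_text(strip=True)] + cells)
    return headers, rows


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# The lookup from the question: no such class here, so find() gives None.
missing = soup.find("table", {"class": "tablestats"})
print(missing is None)

headers, rows = extract_rows(SAMPLE_HTML)
print(headers)
for row in rows:
    print(row)
```

Checking the result of each `find` before chaining another call on it is what turns the opaque `'NoneType' object has no attribute 'find'` into a case the code can handle, for example by logging that the page layout has changed.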