Pandas - Columns not read though present
Problem description
I have the following dataset.
url, team1, team2, win_toss, bat_or_bowl, outcome, win_game, date,day_n_night, ground, rain, duckworth_lewis, match_id, type_of_match
"espncricinfo-t20/145227.html","Western Australia","Victoria","Victoria","bat","Western Australia won by 8 wickets (with 47 balls remaining)","Western Australia"," Jan 12 2005","1"," Western Australia Cricket Association Ground,Perth","0","0","145227","T20"
"espncricinfo-t20/212961.html","Australian Institute of Sports","New Zealand Academy","New Zealand Academy","bowl","Match tied",""," Jul 7 2005 ","0"," Albury Oval, Brisbane","0","0","212961","T20"
"espncricinfo-t20/216598.html","Air India","New South Wales","Air India","bowl","Air India won by 7 wickets (with 5 balls remaining)","Air India"," Aug 19 2005 ","0"," M Chinnaswamy Stadium, Bangalore","0","0","216598","T20"
"espncricinfo-t20/216620.html","Karnataka State Cricket Association XI","Bradman XI","Bradman XI","bowl","Karnataka State Cricket Association XI won by 33 runs","Karnataka State Cricket Association XI"," Aug 20 2005 ","0"," M Chinnaswamy Stadium, Bangalore","0","0","216620","T20"
"espncricinfo-t20/216633.html","Chemplast","Bradman XI","Chemplast","bat","Bradman XI won by 6 wickets (with 13 balls remaining)","Bradman XI"," Aug 20 2005 ","0"," M Chinnaswamy Stadium, Bangalore","0","0","216633","T20"
This is from the Python console:
>>> import pandas as pd
>>> df = pd.read_csv("sample.txt" , quotechar = '\"')
>>> df.shape
(9, 14)
>>> df.columns
Index([u'url', u' team1', u' team2', u' win_toss', u' bat_or_bowl',
u' outcome', u' win_game', u' date', u' day_n_night', u' ground',
u' rain', u' duckworth_lewis', u' match_id', u' type_of_match'],
dtype='object')
>>> df.url.head()
0 espncricinfo-t20/145227.html
1 espncricinfo-t20/212961.html
2 espncricinfo-t20/216598.html
3 espncricinfo-t20/216620.html
4 espncricinfo-t20/216633.html
Name: url, dtype: object
>>> df.team1.head()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/python27/lib/python2.7/site-packages/pandas/core/generic.py", line 2744, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'team1'
>>> df.iloc[1:2]
url team1 \
1 espncricinfo-t20/212961.html Australian Institute of Sports
team2 win_toss bat_or_bowl outcome \
1 New Zealand Academy New Zealand Academy bowl Match tied
win_game date day_n_night ground rain \
1 NaN Jul 7 2005 0 Albury Oval, Brisbane 0
duckworth_lewis match_id type_of_match
1 0 212961 T20
We can see that the column team1 exists, but I am unable to retrieve it from df. I get this error for every column except the first. Could anyone please help me find the problem here? Thanks.
Recommended answer
The column name has a leading space:
u' team1'
so looking it up as team1 fails: df.team1 raises the AttributeError shown above, and df['team1'] would raise a KeyError.
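The mismatch between the padded label and the name being requested can be reproduced on a tiny input. The following is only a sketch: it assumes Python 3 and a recent pandas, and the two-column text in `raw` is a made-up excerpt shaped like the question's header, not the real sample.txt. It also shows one possible after-the-fact repair, stripping whitespace from the column labels.

```python
import io

import pandas as pd

# Hypothetical two-column excerpt: note the space after the comma in the header.
raw = 'url, team1\n"a.html","Victoria"\n'

df = pd.read_csv(io.StringIO(raw))

print(list(df.columns))        # the second label keeps its leading space
print(" team1" in df.columns)  # the padded name is the real label
print(hasattr(df, "team1"))    # attribute access by the unpadded name fails

# One possible repair on an already loaded frame: strip every column label.
df.columns = df.columns.str.strip()
print(df.team1.tolist())
```

Printing `list(df.columns)` or `repr` of a single label is usually the quickest way to spot this kind of stray whitespace.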
Read the file like this instead:
pd.read_csv("sample.txt" , quotechar = '\"', skipinitialspace=True)
With skipinitialspace=True the CSV is read with the spaces that follow each delimiter ignored, so the column names no longer start with a space.
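A small self-contained comparison of the two calls, as a sketch under assumptions: Python 3, a recent pandas, and an in-memory string `raw` standing in for sample.txt (the names `raw`, `padded` and `clean` are illustrative only).

```python
import io

import pandas as pd

# Made-up excerpt with the same header style as the question's file.
raw = (
    'url, team1, team2\n'
    '"a.html","Western Australia","Victoria"\n'
)

# Default parsing keeps the space that follows each comma in the header.
padded = pd.read_csv(io.StringIO(raw), quotechar='"')

# skipinitialspace=True drops spaces that directly follow a delimiter.
clean = pd.read_csv(io.StringIO(raw), quotechar='"', skipinitialspace=True)

print(list(padded.columns))
print(list(clean.columns))
print(clean.team1.tolist())
```

Note that this option only affects spaces directly after a delimiter. Values such as " Jan 12 2005" in the question's data carry their space inside the quotes, so they would most likely still need a separate `.str.strip()` on that column.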
See the read_csv documentation.