After renaming a column, I get a KeyError
Question
I have a DataFrame df:
df = pd.DataFrame({'a':[7,8,9],
'b':[1,3,5],
'c':[5,3,6]})
print (df)
   a  b  c
0  7  1  5
1  8  3  3
2  9  5  6
Then I rename the first column label like this:
df.columns.values[0] = 'f'
Everything seems fine:
print (df)
   f  b  c
0  7  1  5
1  8  3  3
2  9  5  6
print (df.columns)
Index(['f', 'b', 'c'], dtype='object')
print (df.columns.values)
['f' 'b' 'c']
Selecting b works fine:
print (df['b'])
0 1
1 3
2 5
Name: b, dtype: int64
But selecting a returns column f:
print (df['a'])
0 7
1 8
2 9
Name: f, dtype: int64
And selecting f raises a KeyError:
print (df['f'])
#KeyError: 'f'
print (df.info())
#KeyError: 'f'
What is the problem? Can somebody explain it? Or is it a bug?
Recommended answer
You aren't expected to alter the values attribute.
Try df.columns.values = ['a', 'b', 'c'] and you get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-61-e7e440adc404> in <module>()
----> 1 df.columns.values = ['a', 'b', 'c']
AttributeError: can't set attribute
That's because pandas detects that you are trying to set the attribute and stops you.
However, it can't stop you from changing the underlying values object itself.
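The difference between rebinding an attribute and mutating the object behind it can be shown with a toy class. This is a plain-Python analogy, not pandas code: Labels and its _data field are hypothetical names, and the only assumption is that values is exposed as a read-only property, which is what the AttributeError above suggests.

```python
class Labels:
    """Toy stand-in for an index: `values` is a property with no setter."""

    def __init__(self, data):
        self._data = list(data)

    @property
    def values(self):
        # Returns the underlying list itself, not a copy.
        return self._data


labels = Labels(['a', 'b', 'c'])

# Rebinding the attribute is blocked, because the property has no setter.
try:
    labels.values = ['x', 'y', 'z']
    rebinding_blocked = False
except AttributeError:
    rebinding_blocked = True

# Mutating the object that the property returns is not blocked.
labels.values[0] = 'f'

print(rebinding_blocked)  # True
print(labels.values)      # ['f', 'b', 'c']
```

The property guards the name values, not the contents of whatever it hands back, so the in-place edit goes through unnoticed.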
When you use rename, pandas follows up with a bunch of cleanup. I've pasted the source below.
Ultimately, what you've done is alter the values without initiating the cleanup. You can initiate it yourself with a follow-up call to _data.rename_axis (an example can be seen in the source below). This forces the cleanup to run, and then you can access ['f']:
df._data = df._data.rename_axis(lambda x: x, 0, True)
df['f']
0 7
1 8
2 9
Name: f, dtype: int64
Moral of the story: it's probably not a great idea to rename a column this way.
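For comparison, here is a minimal sketch of renaming through the public API instead of editing columns.values in place. It reuses the DataFrame from the question; the variable names renamed and relabeled are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9],
                   'b': [1, 3, 5],
                   'c': [5, 3, 6]})

# Option 1: rename returns a new DataFrame with the mapped labels.
renamed = df.rename(columns={'a': 'f'})

# Option 2: assign a complete new list of labels to df.columns.
relabeled = df.copy()
relabeled.columns = ['f', 'b', 'c']

print(list(renamed.columns))   # ['f', 'b', 'c']
print(renamed['f'].tolist())   # [7, 8, 9]
print('a' in renamed.columns)  # False
```

Both routes go through pandas itself, so lookups by the new label and by the old one behave consistently afterwards.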
But this story gets weirder
This is fine:
df = pd.DataFrame({'a':[7,8,9],
'b':[1,3,5],
'c':[5,3,6]})
df.columns.values[0] = 'f'
df['f']
0 7
1 8
2 9
Name: f, dtype: int64
This is not fine:
df = pd.DataFrame({'a':[7,8,9],
'b':[1,3,5],
'c':[5,3,6]})
print(df)
df.columns.values[0] = 'f'
df['f']
KeyError:
It turns out that we can modify the values attribute before displaying df, and pandas will apparently run all the initialization on the first display. If you display df before changing the values attribute, it will error out.
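One way to picture this order dependence is a lookup table that is built lazily and never told about in-place edits. The sketch below is a toy model with hypothetical names (CachedLabels, _table, and a simplified get_loc); it assumes, as the answer suggests, that pandas initializes some cached state on first use, and it does not reproduce pandas internals.

```python
class CachedLabels:
    """Toy index that builds a label -> position table on first lookup."""

    def __init__(self, data):
        self.values = list(data)
        self._table = None  # built lazily, then reused

    def get_loc(self, key):
        if self._table is None:
            self._table = {label: i for i, label in enumerate(self.values)}
        return self._table[key]  # raises KeyError for unknown labels


# Case 1: mutate before the table exists, so the table is built from 'f'.
early = CachedLabels(['a', 'b', 'c'])
early.values[0] = 'f'
print(early.get_loc('f'))  # 0

# Case 2: a lookup happens first, so the table still maps the old label.
late = CachedLabels(['a', 'b', 'c'])
late.get_loc('b')          # first lookup builds the table
late.values[0] = 'f'
print(late.get_loc('a'))   # 0: the stale entry still resolves

try:
    late.get_loc('f')
    stale_lookup_failed = False
except KeyError:
    stale_lookup_failed = True
print(stale_lookup_failed)  # True
```

In this model the second case matches the question's symptoms: the old label still resolves, the new one raises KeyError, and the displayed labels show the edit because they are read from values directly.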
Weirder still
df = pd.DataFrame({'a':[7,8,9],
'b':[1,3,5],
'c':[5,3,6]})
print(df)
df.columns.values[0] = 'f'
df['f'] = 1
df['f']
   f  f
0  7  1
1  8  1
2  9  1
As if we didn't already know that this was a bad idea...
Source of rename:
def rename(self, *args, **kwargs):
    axes, kwargs = self._construct_axes_from_arguments(args, kwargs)
    copy = kwargs.pop('copy', True)
    inplace = kwargs.pop('inplace', False)

    if kwargs:
        raise TypeError('rename() got an unexpected keyword '
                        'argument "{0}"'.format(list(kwargs.keys())[0]))

    if com._count_not_none(*axes.values()) == 0:
        raise TypeError('must pass an index to rename')

    # renamer function if passed a dict
    def _get_rename_function(mapper):
        if isinstance(mapper, (dict, ABCSeries)):
            def f(x):
                if x in mapper:
                    return mapper[x]
                else:
                    return x
        else:
            f = mapper

        return f

    self._consolidate_inplace()
    result = self if inplace else self.copy(deep=copy)

    # start in the axis order to eliminate too many copies
    for axis in lrange(self._AXIS_LEN):
        v = axes.get(self._AXIS_NAMES[axis])
        if v is None:
            continue
        f = _get_rename_function(v)

        baxis = self._get_block_manager_axis(axis)
        result._data = result._data.rename_axis(f, axis=baxis, copy=copy)
        result._clear_item_cache()

    if inplace:
        self._update_inplace(result._data)
    else:
        return result.__finalize__(self)