Pandas still getting SettingWithCopyWarning even after using .loc
Problem description
Initially, I tried writing some code that looked like this:
import numpy as np
import pandas as pd
np.random.seed(2016)
train = pd.DataFrame(np.random.choice([np.nan, 1, 2], size=(10, 3)),
                     columns=['Age', 'SibSp', 'Parch'])
complete = train.dropna()
complete['AgeGt15'] = complete['Age'] > 15
After getting a SettingWithCopyWarning, I tried using .loc:
complete.loc[:, 'AgeGt15'] = complete['Age'] > 15
complete.loc[:, 'WithFamily'] = complete['SibSp'] + complete['Parch'] > 0
However, I still get the same warning. What gives?
Solution
Note: As of pandas version 0.24, is_copy is deprecated and will be removed in a future version. While the private attribute _is_copy exists, the underscore indicates that it is not part of the public API and therefore should not be depended upon. Going forward, it seems the only proper way to silence SettingWithCopyWarning without making a copy will be to do so globally:
pd.options.mode.chained_assignment = None
When complete = train.dropna() is executed, dropna might return a copy, so out of an abundance of caution, pandas sets complete.is_copy to a truthy value:
In [220]: complete.is_copy
Out[220]: <weakref at 0x7f7f0b295b38; to 'DataFrame' at 0x7f7eee6fe668>
This allows pandas to warn you later, when complete['AgeGt15'] = complete['Age'] > 15 is executed, that you may be modifying a copy, which will have no effect on train. Because the flag is attached to complete itself, writing the assignment as complete.loc[:, 'AgeGt15'] = ... triggers the same check. For beginners this may be a useful warning. In your case, it appears you have no intention of modifying train indirectly by modifying complete, so the warning is just a meaningless annoyance.
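To see that the assignment only touches complete, here is a minimal sketch that reuses the question's data and records whatever warnings pandas emits. Whether a SettingWithCopyWarning actually shows up depends on the pandas version and its copy-on-write settings, so the sketch only prints the warning names rather than assuming one is raised; the names caught and warning_names are illustrative.

```python
import warnings

import numpy as np
import pandas as pd

np.random.seed(2016)
train = pd.DataFrame(np.random.choice([np.nan, 1, 2], size=(10, 3)),
                     columns=['Age', 'SibSp', 'Parch'])
complete = train.dropna()

# Record warnings instead of printing them to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    complete['AgeGt15'] = complete['Age'] > 15

# Names of the warning classes raised (may be empty on some versions).
warning_names = [w.category.__name__ for w in caught]
print(warning_names)

# The new column exists on `complete` only; `train` is untouched.
print('AgeGt15' in complete.columns)  # True
print('AgeGt15' in train.columns)     # False
```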
You can silence the warning by setting:
complete.is_copy = False # deprecated as of version 0.24
This is quicker than making an actual copy, and nips the SettingWithCopyWarning in the bud (at the point where _check_setitem_copy is called):
def _check_setitem_copy(self, stacklevel=4, t='setting', force=False):
    if force or self.is_copy:
        ...
If you are really confident you know what you are doing, you can shut off the SettingWithCopyWarning globally with
pd.options.mode.chained_assignment = None # None|'warn'|'raise'
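If turning the warning off for the whole session feels too broad, one option is to scope the setting with pd.option_context, which restores the previous value when the block exits. This is a minimal sketch, assuming the mode.chained_assignment option is available in your pandas version; the variables before, inside and after exist only to show the value changing.

```python
import numpy as np
import pandas as pd

np.random.seed(2016)
train = pd.DataFrame(np.random.choice([np.nan, 1, 2], size=(10, 3)),
                     columns=['Age', 'SibSp', 'Parch'])
complete = train.dropna()

before = pd.get_option('mode.chained_assignment')

# The option is None only inside this block.
with pd.option_context('mode.chained_assignment', None):
    complete['AgeGt15'] = complete['Age'] > 15
    inside = pd.get_option('mode.chained_assignment')

# On exit the previous setting is back in effect.
after = pd.get_option('mode.chained_assignment')
print(before, inside, after)
```

Code outside the with block keeps whatever setting was active before, so unrelated chained assignments elsewhere are still reported.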
An alternative way to silence the warning is to make a new copy:
complete = complete.copy()
However, you may not want to do this if the DataFrame is large, since copying can take a significant amount of time and memory, and it is completely pointless (except for the sake of silencing a warning) if you know complete is already a copy.
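When the frame is small enough that the cost does not matter, the copy can be made once, at the point where complete is created. The sketch below reuses the question's data and checks that neither assignment emits a SettingWithCopyWarning; the warning is matched by class name so the check does not depend on where a given pandas version exposes the class.

```python
import warnings

import numpy as np
import pandas as pd

np.random.seed(2016)
train = pd.DataFrame(np.random.choice([np.nan, 1, 2], size=(10, 3)),
                     columns=['Age', 'SibSp', 'Parch'])

# Copy once, at creation, so `complete` is explicitly independent of `train`.
complete = train.dropna().copy()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    complete['AgeGt15'] = complete['Age'] > 15
    complete['WithFamily'] = complete['SibSp'] + complete['Parch'] > 0

# Keep only SettingWithCopyWarning entries, matched by class name.
copy_warnings = [w for w in caught
                 if w.category.__name__ == 'SettingWithCopyWarning']
print(len(copy_warnings))
print(list(complete.columns))
```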