Pandas still getting SettingWithCopyWarning even after using .loc

Problem description

At first, I tried writing some code that looked like this:

import numpy as np
import pandas as pd
np.random.seed(2016)
train = pd.DataFrame(np.random.choice([np.nan, 1, 2], size=(10, 3)), 
                     columns=['Age', 'SibSp', 'Parch'])

complete = train.dropna()    
complete['AgeGt15'] = complete['Age'] > 15

After getting the SettingWithCopyWarning, I tried using .loc:

complete.loc[:, 'AgeGt15'] = complete['Age'] > 15
complete.loc[:, 'WithFamily'] = complete['SibSp'] + complete['Parch'] > 0

However, I still get the same warning. What gives?

Solution

Note: As of pandas version 0.24, is_copy is deprecated and will be removed in a future version. While the private attribute _is_copy exists, the underscore indicates this attribute is not part of the public API and therefore should not be depended upon. Therefore, going forward, it seems the only proper way to silence SettingWithCopyWarning will be to do so globally:

pd.options.mode.chained_assignment = None


When complete = train.dropna() is executed, dropna might return a copy, so out of an abundance of caution, pandas sets complete.is_copy to a truthy value:

In [220]: complete.is_copy
Out[220]: <weakref at 0x7f7f0b295b38; to 'DataFrame' at 0x7f7eee6fe668>

This allows pandas to warn you later, when complete['AgeGt15'] = complete['Age'] > 15 is executed, that you may be modifying a copy, which would have no effect on train. For beginners this may be a useful warning. In your case, it appears you have no intention of modifying train indirectly by modifying complete, so the warning is just a meaningless annoyance.

You can silence the warning by setting:

complete.is_copy = False       # deprecated as of version 0.24

This is quicker than making an actual copy, and nips the SettingWithCopyWarning in the bud (at the point where _check_setitem_copy is called):

def _check_setitem_copy(self, stacklevel=4, t='setting', force=False):
    if force or self.is_copy:
        ...


If you are really confident you know what you are doing, you can shut off the SettingWithCopyWarning globally with

pd.options.mode.chained_assignment = None # None|'warn'|'raise'


An alternative way to silence the warning is to make a new copy:

complete = complete.copy()

However, you may not want to do this if the DataFrame is large, since copying can take a significant amount of time and memory, and it is completely pointless (except for the sake of silencing a warning) if you know complete is already a copy.
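To make the copy approach concrete, here is a minimal sketch. It uses a small hand-made train frame rather than the random data from the question, so the values are assumptions chosen purely for illustration. It records any warnings raised while adding the new columns and checks that none of them is a SettingWithCopyWarning, and that train is left untouched.

```python
import warnings

import numpy as np
import pandas as pd

# Illustrative data (not the question's random frame): rows 1 and 3 contain NaN.
train = pd.DataFrame({
    'Age':   [22.0, np.nan, 8.0, 35.0],
    'SibSp': [1.0, 0.0, 0.0, np.nan],
    'Parch': [0.0, 2.0, 0.0, 1.0],
})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    # An explicit copy makes `complete` an independent DataFrame.
    complete = train.dropna().copy()
    complete['AgeGt15'] = complete['Age'] > 15
    complete['WithFamily'] = complete['SibSp'] + complete['Parch'] > 0

# Keep only warnings whose category name mentions SettingWithCopy.
copy_warnings = [w for w in caught if 'SettingWithCopy' in w.category.__name__]

print(complete)
print('SettingWithCopy warnings:', len(copy_warnings))
```

Chaining .copy() directly onto dropna() avoids keeping a second name bound to the intermediate result, but the cost in time and memory described above still applies for large frames.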
