How to deal with SettingWithCopyWarning in Pandas
Question
I just upgraded Pandas from 0.11 to 0.13.0rc1. Now the application is emitting many new warnings. One of them looks like this:
E:FinReporterFM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE
What exactly does it mean? Do I need to change something?
How should I suppress the warning if I insist on using quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE?
def _decode_stock_quote(list_of_150_stk_str):
    """decode the webpage and return dataframe"""

    from cStringIO import StringIO

    str_of_all = "".join(list_of_150_stk_str)

    quote_df = pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg')) #dtype={'A': object, 'B': object, 'C': np.float64}
    quote_df.rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'}, inplace=True)
    quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]
    quote_df['TClose'] = quote_df['TPrice']
    quote_df['RT'] = 100 * (quote_df['TPrice']/quote_df['TPCLOSE'] - 1)
    quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE
    quote_df['TAmt'] = quote_df['TAmt']/TAMT_SCALE
    quote_df['STK_ID'] = quote_df['STK'].str.slice(13,19)
    quote_df['STK_Name'] = quote_df['STK'].str.slice(21,30)#.decode('gb2312')
    quote_df['TDate'] = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

    return quote_df
More error messages
E:FinReporterFM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE
E:FinReporterFM_EXT.py:450: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
quote_df['TAmt'] = quote_df['TAmt']/TAMT_SCALE
E:FinReporterFM_EXT.py:453: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
quote_df['TDate'] = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])
Recommended answer
The SettingWithCopyWarning was created to flag potentially confusing "chained" assignments, such as the following, which do not always work as expected, particularly when the first selection returns a copy. [See GH5390 and GH5597 for background discussion.]
df[df['A'] > 2]['B'] = new_val # new_val not set in df
The warning offers a suggestion to rewrite as follows:
df.loc[df['A'] > 2, 'B'] = new_val
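To make the difference between the two forms concrete, here is a minimal sketch on a toy frame. The data and the value of new_val are made up for illustration, and warnings are silenced locally only so the chained line runs quietly:

```python
import warnings

import pandas as pd

# Hypothetical toy data, not taken from the question.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})
new_val = -1

# Chained form: the boolean selection builds a new object first,
# so the write to 'B' lands on that temporary and df is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df[df["A"] > 2]["B"] = new_val
after_chained = df["B"].tolist()

# Single .loc call: row and column selection happen in one step on df itself.
df.loc[df["A"] > 2, "B"] = new_val
after_loc = df["B"].tolist()

print(after_chained)
print(after_loc)
```

Only the .loc version changes df; the chained version leaves column B as it was.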
However, this doesn't fit your usage, which is equivalent to:
df = df[df['A'] > 2]
df['B'] = new_val
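A small sketch of this reassignment pattern, using a made-up frame and a hypothetical name original for the frame that df pointed to before it was rebound, suggests that the write does take effect on the filtered frame:

```python
import warnings

import pandas as pd

# Hypothetical toy data; 'original' stands in for the frame before filtering.
original = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})
new_val = -1

df = original
with warnings.catch_warnings():
    # Silenced here only so the sketch runs quietly on versions that warn.
    warnings.simplefilter("ignore")
    df = df[df["A"] > 2]   # rebind df to the filtered result
    df["B"] = new_val      # write to the filtered frame

filtered_b = df["B"].tolist()
original_b = original["B"].tolist()

print(filtered_b)
print(original_b)
```

The filtered frame carries the new values and the earlier frame does not, which is the outcome this usage is after.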
While it's clear that you don't care about writes making it back to the original frame (since you are overwriting the reference to it), unfortunately this pattern cannot be differentiated from the first chained assignment example. Hence the (false positive) warning. The potential for false positives is addressed in the docs on indexing, if you'd like to read further. You can safely disable this new warning with the following assignment.
import pandas as pd
pd.options.mode.chained_assignment = None # default='warn'
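A commonly used alternative to turning the warning off globally is to take an explicit copy of the selection before writing to it. This is not part of the original answer, only a sketch on made-up data. It assumes the chained_assignment option is left at its default, and it records any warnings raised so they can be inspected:

```python
import warnings

import pandas as pd

# Hypothetical toy data, not taken from the question.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # .copy() makes the subset an independent frame, so later writes are unambiguous.
    subset = df[df["A"] > 2].copy()
    subset["B"] = subset["B"] / 10

# Keep only warnings whose class name mentions SettingWithCopy.
copy_warnings = [w for w in caught if "SettingWithCopy" in w.category.__name__]

print(subset["B"].tolist())
print(len(copy_warnings))
```

The trade-off is an extra copy of the selected rows in exchange for leaving the warning enabled for genuine chained assignments elsewhere in the code.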
Other Resources
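- pandas User Guide: Indexing and selecting data
- Python Data Science Handbook: Data Indexing and Selection
- Real Python: SettingWithCopyWarning in Pandas: Views vs Copies
- Dataquest: SettingwithCopyWarning: How to Fix This Warning in Pandas
- Towards Data Science: Explaining the SettingWithCopyWarning in pandas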