Dealing with SettingWithCopyWarning when assigning columns in Pandas
Question
I have a DataFrame which I want to extend with columns that contain data from the previous rows.
This script does the job:
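#!/usr/bin/env python3
import numpy as np
import pandas as pd

n = 2
df = pd.DataFrame({'A': [1,2,3,4,5], 'B': [0,1,1,0,0]}, columns=['A', 'B'])

# Keep only the rows where B == 0.
df2 = df[df['B'] == 0]
print(df2)

# Add columns A_1..A_n holding the value of A from i rows earlier.
for i in range(1, n+1):
    df2['A_%d' % i] = df2['A'].shift(i)  # this line triggers the warning
print(df2)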
It outputs:
   A  B
0  1  0
3  4  0
4  5  0
   A  B  A_1  A_2
0  1  0  NaN  NaN
3  4  0  1.0  NaN
4  5  0  4.0  1.0
which is exactly what I want. The DataFrame now has two additional columns, A_1 and A_2, that contain the value of column A from 1 and 2 rows before.
However, I also get this warning:
./my_script.py:14: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
df2['A_%d' % i] = df2['A'].shift(i)
The problem definitely comes from the filtering step where I create df2. If I work on df directly, the problem does not occur.
In my application I need to work on multiple parts of my original DataFrame separately, and therefore the filtering is definitely required. All the different parts (like df2 here) get concatenated later.
I found similar questions in "How to deal with SettingWithCopyWarning in Pandas?" and "Pandas SettingWithCopyWarning", but the solutions given there do not fix the problem.
For example, when writing
df2.loc[:, 'A_%d' % i] = df2['A'].shift(i)
the same warning still occurs.
I am working with Python 3.5.2 and Pandas 0.19.2.
Recommended answer
I think you need to call copy() on the filtered DataFrame:
df2 = df[df['B'] == 0].copy()
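As a rough sketch (not part of the original answer), applying this to the script from the question might look like the following. The warnings bookkeeping and the name `caught` are illustrative additions, there only to check whether a SettingWithCopyWarning is raised.

```python
import warnings

import pandas as pd

n = 2
df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [0, 1, 1, 0, 0]}, columns=['A', 'B'])

# Take an explicit, independent copy of the filtered rows.
df2 = df[df['B'] == 0].copy()

# Record any warnings raised while the new columns are assigned
# (illustrative only; not needed in a real script).
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    for i in range(1, n + 1):
        df2['A_%d' % i] = df2['A'].shift(i)

print(df2)
print([w.category.__name__ for w in caught])
```

The only change to the question's logic is the `.copy()` call; the loop that adds the shifted columns stays as it was.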
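Without the copy, df2 is derived from a slice of df: if you modify values in df2 later, you will find that the modifications do not propagate back to the original data (df), and that is what Pandas warns about. An explicit .copy() makes df2 an independent DataFrame, so the warning no longer appears.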
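For the workflow described in the question, where several parts are processed separately and concatenated later, one possible sketch is shown below. It assumes that each distinct value of B defines one part, which the question does not state; the names `parts`, `part` and `result` are hypothetical.

```python
import pandas as pd

n = 2
df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [0, 1, 1, 0, 0]}, columns=['A', 'B'])

parts = []
# Assumption: each distinct value of B defines one part to process.
for value in sorted(df['B'].unique()):
    # Copy the filtered rows so assignments go to an independent DataFrame.
    part = df[df['B'] == value].copy()
    for i in range(1, n + 1):
        part['A_%d' % i] = part['A'].shift(i)
    parts.append(part)

# Concatenate the processed parts and restore the original row order.
result = pd.concat(parts).sort_index()
print(result)

# The original DataFrame is left unchanged by the per-part assignments.
print(list(df.columns))
```

Because every part is an explicit copy, the new columns exist only in `result`; `df` keeps its original two columns.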