Re-assignment in Pandas: Copy or view?

Question

Say we have the following dataframe:

import pandas as pd
from numpy.random import randn

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo'],
                   'B' : ['one', 'one', 'two', 'three',
                          'two', 'two', 'one', 'three'],
                   'C' : randn(8), 'D' : randn(8)})

Which looks like this:

> df
     A      B         C         D
0  foo    one  0.846192  0.478651
1  bar    one  2.352421  0.141416
2  foo    two -1.413699 -0.577435
3  bar  three  0.569572 -0.508984
4  foo    two -1.384092  0.659098
5  bar    two  0.845167 -0.381740
6  foo    one  3.355336 -0.791471
7  foo  three  0.303303  0.452966

Then I do the following:

df2 = df
df  = df[df['C']>0]

If you now look at df and df2, you will see that df2 holds the original data, whereas df was updated to keep only the rows where C is greater than 0.

I thought Pandas wasn't supposed to make a copy in an assignment like df2 = df, and that it would only make copies with either:

  1. df2 = df.copy(deep=True)
  2. df2 = copy.deepcopy(df)

What happened above, then? Did df2 = df make a copy? I presume the answer is no, so it must have been df = df[df['C']>0] that made a copy. I also presume that, if I didn't have df2 = df above, there would have been a copy without any reference to it floating in memory. Is that correct?

Note: I read through "Returning a view versus a copy" and I wonder if the following:

Whenever an array of labels or a boolean vector are involved in the indexing operation, the result will be a copy.

explains this behavior.

Recommended answer

It's not that df2 = df is making the copy; it's that df = df[df['C'] > 0] is returning a copy, and the name df is then rebound to it.
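One way to check that the filtered frame really is a separate copy is to ask NumPy whether the two frames' underlying arrays overlap in memory. The sketch below is illustrative only: the small frame and the names filtered and shares are made up for this example, and it assumes plain float64 columns.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hypothetical data, not the one from the question)
df = pd.DataFrame({"C": [0.5, -1.0, 2.0, -0.3],
                   "D": [1.0, 2.0, 3.0, 4.0]})

# Boolean-mask indexing builds a new DataFrame from the selected rows
filtered = df[df["C"] > 0]

# Compare the memory behind column C in both frames
shares = np.shares_memory(df["C"].to_numpy(), filtered["C"].to_numpy())

print(shares)                 # whether the two columns overlap in memory
print(len(df), len(filtered)) # the original keeps all of its rows
```

Because the selected rows are written into fresh arrays, the original frame is left untouched by the filtering step.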

Just print out the ids and you'll see:

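print(id(df))          # id of the original object
df2 = df
print(id(df2))         # same id: df2 is just another name for that object
df = df[df['C'] > 0]
print(id(df))          # different id: boolean indexing returned a new DataFrame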
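The same check can be written with the is operator, which compares object identity directly instead of printing ids. This is a minimal self-contained sketch; the frame and the names original, same_object, rebound and still_original are assumptions made for illustration.

```python
import pandas as pd

# Small illustrative frame (hypothetical data)
df = pd.DataFrame({"A": ["foo", "bar", "foo", "bar"],
                   "C": [0.5, -1.0, 2.0, -0.3]})

original = df                      # keep a handle on the original object
df2 = df                           # plain assignment: a second name, no copy
same_object = df2 is df            # both names refer to one object

df = df[df["C"] > 0]               # builds a new DataFrame, then rebinds df
rebound = df is original           # df now names a different object
still_original = df2 is original   # df2 was not affected by the rebinding

print(same_object, rebound, still_original)
```

Plain assignment only binds a name, so df2 keeps pointing at the original object for as long as df2 itself is not reassigned.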
