Pandas: Subindexing dataframes: Copies vs views

Question

Say I have a dataframe:

import pandas as pd
import numpy as np
foo = pd.DataFrame(np.random.random((10,5)))

and I create another dataframe from a subset of my data:

bar = foo.iloc[3:5,1:4]

Does bar hold a copy of those elements from foo? Is there any way to create a view of that data instead? If so, what would happen if I tried to modify data in this view? Does pandas provide any sort of copy-on-write mechanism?

Recommended answer

Your answer lies in the pandas docs: returning-a-view-versus-a-copy.

Whenever an array of labels or a boolean vector are involved in the indexing operation, the result will be a copy. With single label / scalar indexing and slicing, e.g. df.ix[3:6] or df.ix[:, 'A'], a view will be returned.
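The first half of that rule (a boolean vector yields a copy) can be checked with a small sketch. The frame `df`, its column names, and the threshold are made up for illustration and are not part of the question:

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical 4x3 frame; column "A" holds 0.0, 3.0, 6.0, 9.0.
df = pd.DataFrame(np.arange(12.0).reshape(4, 3), columns=["A", "B", "C"])

# Boolean-vector indexing: the result is a new object with its own data.
mask = df["A"] > 3.0
subset = df[mask]

with warnings.catch_warnings():
    # Some pandas versions emit SettingWithCopyWarning on this write;
    # it is only a warning, so it is silenced for the demonstration.
    warnings.simplefilter("ignore")
    subset.iloc[0, 0] = -1.0

# The original frame is untouched because `subset` is a copy.
print(df.loc[mask, "A"].tolist())
```

Writing to `subset` changes only `subset`; the selected rows of `df` still hold their original values.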

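In your example, bar is a view of slices of foo, so modifying bar also modifies foo. If you want a copy, use the copy method. pandas does not appear to have a copy-on-write mechanism. (This describes pandas 0.12, the version used in the session below. Later releases added an opt-in Copy-on-Write mode, under which a write to bar no longer propagates to foo, so do not rely on the view behavior in new code.)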
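A minimal sketch of the copy method mentioned above, reusing the `foo` and `bar` names from the question. The helper variable `original_value` is introduced here only for the check:

```python
import numpy as np
import pandas as pd

foo = pd.DataFrame(np.random.random((10, 5)))

# .copy() makes a deep copy by default, so `bar` owns its own data.
bar = foo.iloc[3:5, 1:4].copy()
original_value = foo.iloc[3, 1]

# Position (0, 0) in `bar` is row label 3, column label 1 of `foo`.
bar.iloc[0, 0] = 5

# `foo` still holds the value it had before the write to `bar`.
print(foo.iloc[3, 1] == original_value)
```

An explicit copy behaves the same way regardless of pandas version, which makes it the safer choice when the subset must be independent of its parent.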

请参见下面的代码示例进行说明:

See my code example below to illustrate:

In [1]: import pandas as pd
   ...: import numpy as np
   ...: foo = pd.DataFrame(np.random.random((10,5)))
   ...: 

In [2]: pd.__version__
Out[2]: '0.12.0.dev-35312e4'

In [3]: np.__version__
Out[3]: '1.7.1'

In [4]: # DataFrame has copy method
   ...: foo_copy = foo.copy()

In [5]: bar = foo.iloc[3:5,1:4]

In [6]: bar == foo.iloc[3:5,1:4] == foo_copy.iloc[3:5,1:4]
Out[6]: 
      1     2     3
3  True  True  True
4  True  True  True

In [7]: # Changing the view
   ...: bar.ix[3,1] = 5

In [8]: # View and DataFrame still equal
   ...: bar == foo.iloc[3:5,1:4]
Out[8]: 
      1     2     3
3  True  True  True
4  True  True  True

In [9]: # It is now different from a copy of original
   ...: bar == foo_copy.iloc[3:5,1:4]
Out[9]: 
       1     2     3
3  False  True  True
4   True  True  True
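The session above was recorded on pandas 0.12; the `.ix` indexer it uses was removed in pandas 1.0, so it will not run unchanged on current releases. A rough sketch of the same experiment that does not depend on view semantics is to write through `foo` directly and re-select the subset afterwards. The name `diff` is introduced here for illustration:

```python
import numpy as np
import pandas as pd

foo = pd.DataFrame(np.random.random((10, 5)))
foo_copy = foo.copy()

# Write through the parent frame with a single indexer call.
foo.iloc[3, 1] = 5

# Re-select the subset after the write, then compare element-wise
# against the same region of the untouched copy.
bar = foo.iloc[3:5, 1:4]
diff = bar != foo_copy.iloc[3:5, 1:4]
print(diff)
```

Only the cell at row 3, column 1 should differ from the copy, mirroring Out[9] above with the comparison inverted.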
