Checking whether a data frame is a copy or a view in Pandas


Question

Is there an easy way to check whether two data frames are different copies or views of the same underlying data, one that doesn't involve manipulating them? I'm trying to get a grip on when each is generated, and given how idiosyncratic the rules seem to be, I'd like an easy way to test.

For example, I thought id(df.values) would be stable across views, but it doesn't seem to be:

import pandas as pd

# Make two data frames that are views of the same data.
df = pd.DataFrame([[1,2,3,4],[5,6,7,8]], index = ['row1','row2'],
       columns = ['a','b','c','d'])
df2 = df.iloc[0:2,:]

# Demonstrate they are views:
df.iloc[0,0] = 99
df2.iloc[0,0]
Out[70]: 99

# Now try and compare the id on values attribute
# Different despite being views! 

id(df.values)
Out[71]: 4753564496

id(df2.values)
Out[72]: 4753603728

# And we can of course compare df and df2
df is df2
Out[73]: False

Other answers I've looked up try to give rules, but they don't seem consistent, and they also don't answer this question of how to test:

Pandas: Indexing data frames: copies vs. views

Understanding Pandas data frame indexing

Reassignment in Pandas: copy or view?

And of course: http://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy

UPDATE: The comments below seem to answer the question. Comparing the df.values.base attribute rather than the df.values attribute does it, as does checking the df._is_copy attribute (though the latter is probably very bad form, since it is internal).

Recommended answer

The answers came from HYRY and Marius in the comments!

One can check by:

  • testing equivalence of the values.base attribute rather than the values attribute, as in:

df.values.base is df2.values.base

instead of df.values is df2.values.
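As a rough illustration of the check above, here is a minimal sketch. It also shows a related test based on NumPy's np.shares_memory, which is not the method from the comments but asks the same question directly. The helper name shares_data is hypothetical, and the sketch assumes each frame holds a single dtype, since to_numpy() on a mixed-dtype frame builds a fresh array and would hide any sharing. Results for the sliced frame can depend on the pandas version, so no output is claimed for those lines.

```python
import numpy as np
import pandas as pd


def shares_data(left, right):
    """Return True if the NumPy buffers behind two frames overlap in memory.

    Hypothetical helper for illustration. Assumes each frame holds a single
    dtype, so that to_numpy() returns a view rather than a fresh copy.
    """
    return bool(np.shares_memory(left.to_numpy(), right.to_numpy()))


df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]],
                  index=["row1", "row2"],
                  columns=["a", "b", "c", "d"])
df2 = df.iloc[0:2, :]  # slice of df; expected to reuse df's buffer
df3 = df.copy()        # deep copy; owns a separate buffer

# The check from the answer: compare the base arrays, not the arrays themselves.
print(df.values.base is df2.values.base)

# The related NumPy-level check (an assumption of this sketch, not from the answer).
print(shares_data(df, df2))
print(shares_data(df, df3))  # False: a deep copy has its own buffer
```

One caveat on the design: sharing a buffer is not quite the same as "writes to one show up in the other". In newer pandas versions with Copy-on-Write behaviour, two frames may share memory until one of them is modified, so a memory-level check like this describes the current state only.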

Thanks, everyone!
