Checking whether a data frame is a copy or a view in Pandas
Question
Is there an easy way to check whether two data frames are different copies or views of the same underlying data that doesn't involve manipulations? I'm trying to get a grip on when each is generated, and given how idiosyncratic the rules seem to be, I'd like an easy way to test.
For example, I thought "id(df.values)" would be stable across views, but it doesn't seem to be:
import pandas as pd

# Make two data frames that are views of the same data.
df = pd.DataFrame([[1,2,3,4],[5,6,7,8]], index = ['row1','row2'],
                  columns = ['a','b','c','d'])
df2 = df.iloc[0:2,:]
# Demonstrate they are views:
df.iloc[0,0] = 99
df2.iloc[0,0]
Out[70]: 99
# Now try and compare the id on values attribute
# Different despite being views!
id(df.values)
Out[71]: 4753564496
id(df2.values)
Out[72]: 4753603728
# And we can of course compare df and df2
df is df2
Out[73]: False
Other answers I've looked up try to give rules, but they don't seem consistent, and they don't answer the question of how to test:
And of course: http://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy
UPDATE: Comments below seem to answer the question: looking at the df.values.base attribute rather than the df.values attribute does it, as does a reference to the df._is_copy attribute (though the latter is probably very bad form since it's an internal attribute).
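To see why comparing .base can work where id() does not, here is a minimal NumPy-only sketch. It assumes the frames are backed by plain NumPy arrays; the variable names are illustrative and not from the original post. Each slicing operation builds a new array object, so identities differ, while .base points back at the array that owns the memory.

```python
import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])  # owns its memory

# Two separate slicing operations create two distinct array objects...
view_a = data[0:2, :]
view_b = data[0:2, :]
distinct_objects = view_a is not view_b

# ...but both point back at the same owning array through .base.
same_base = view_a.base is view_b.base

# Caveat: an array that owns its data has .base set to None, so two
# unrelated copies also compare equal on .base (None is None).
copy_a = data.copy()
copy_b = data.copy()
misleading_match = copy_a.base is copy_b.base

# np.shares_memory asks about overlapping memory directly.
views_overlap = np.shares_memory(view_a, view_b)
copies_overlap = np.shares_memory(copy_a, copy_b)
```

Because of the None caveat, a .base comparison is probably best read as a hint and confirmed with an explicit memory check when both sides might be independent copies.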
Recommended answer
Answers from HYRY and Marius in comments!
One can check either by:
- testing equivalence of the values.base attribute rather than the values attribute, as in:
df.values.base is df2.values.base
rather than df.values is df2.values.
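As a possible alternative to comparing .base, one could ask NumPy directly whether two frames overlap in memory. The sketch below is only an illustration: frames_share_memory is a hypothetical helper, not part of pandas, and it assumes unique column names and NumPy-backed columns. It checks column by column because .values on a mixed-dtype frame builds a fresh array on each call.

```python
import numpy as np
import pandas as pd


def frames_share_memory(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Hypothetical helper: True if any common column overlaps in memory."""
    common = left.columns.intersection(right.columns)
    return any(
        np.shares_memory(left[col].to_numpy(), right[col].to_numpy())
        for col in common
    )


df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]],
                  index=['row1', 'row2'], columns=['a', 'b', 'c', 'd'])
df2 = df.iloc[0:2, :]  # the slice from the question
df3 = df.copy()        # an explicit deep copy

shares_with_slice = frames_share_memory(df, df2)
shares_with_copy = frames_share_memory(df, df3)
```

Note that with Copy-on-Write enabled, a slice can share memory only until one of the frames is written to, so the result describes the state at the moment of the check rather than a permanent relationship.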
Thanks everyone!