What is the point of views in pandas if it is undefined whether an indexing operation returns a view or a copy?

Views: 110  Published: 2020/5/24 0:13:45  Tags: python pandas views slice


Problem description

I have switched from R to pandas. I routinely get a SettingWithCopyWarning when I do something like:

import pandas as pd

df_a = pd.DataFrame({'col1': [1, 2, 3, 4]})

# Filtering step, which may or may not return a view
df_b = df_a[df_a['col1'] > 1]

# Add a new column to df_b
df_b['new_col'] = 2 * df_b['col1']

# SettingWithCopyWarning!!

I think I understand the problem, though I'll gladly learn what I got wrong. In the given example, it is undefined whether df_b is a view on df_a or not. Thus, the effect of assigning to df_b is unclear: does it affect df_a? The problem can be solved by explicitly making a copy when filtering:

import pandas as pd

df_a = pd.DataFrame({'col1': [1, 2, 3, 4]})

# Filtering step, definitely a copy now
df_b = df_a[df_a['col1'] > 1].copy()

# Add a new column to df_b
df_b['new_col'] = 2 * df_b['col1']

# No warning now
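To check that the explicit copy really is independent of df_a, here is a minimal sketch. The print checks at the end are illustrative additions, not part of the original code.

```python
import pandas as pd

df_a = pd.DataFrame({'col1': [1, 2, 3, 4]})

# Explicit copy: df_b owns its own data from here on
df_b = df_a[df_a['col1'] > 1].copy()
df_b['new_col'] = 2 * df_b['col1']

# df_a is untouched; only df_b gained the new column
print(list(df_a.columns))        # ['col1']
print(df_b['new_col'].tolist())  # [4, 6, 8]
```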

I think there is something that I am missing: if we can never really be sure whether we create a view or not, what are views good for? From the pandas documentation (http://pandas-docs.github.io/pandas-docs-travis/indexing.html?highlight=view#indexing-view-versus-copy):

Outside of simple cases, it’s very hard to predict whether it [getitem] will return a view or a copy (it depends on the memory layout of the array, about which pandas makes no guarantees)
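One way to probe what a particular indexing call did is to ask NumPy whether the two results overlap in memory. This is a sketch, not a pandas guarantee: the helper name `shares_buffer` is made up for illustration, and it assumes a single integer column so that `to_numpy()` can expose the underlying array without copying.

```python
import numpy as np
import pandas as pd


def shares_buffer(parent: pd.DataFrame, child: pd.DataFrame, column: str) -> bool:
    """Hypothetical helper: True if `column` overlaps in memory in both frames."""
    return bool(np.shares_memory(parent[column].to_numpy(),
                                 child[column].to_numpy()))


df_a = pd.DataFrame({'col1': [1, 2, 3, 4]})

filtered = df_a[df_a['col1'] > 1]  # boolean-mask selection
sliced = df_a.iloc[1:3]            # positional slice

print(shares_buffer(df_a, filtered, 'col1'))
print(shares_buffer(df_a, sliced, 'col1'))
```

Boolean-mask selection has to gather the matching rows into a new array, so the first check should report no sharing. The result for the slice depends on the pandas version and its internals, which is the uncertainty the quoted passage describes.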

Similar warnings can be found for different indexing methods.

I find it very cumbersome and error-prone to sprinkle .copy() calls throughout my code. Am I using the wrong style for manipulating my DataFrames? Or is the performance gain so high that it justifies the apparent awkwardness?
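One style that sidesteps the question, offered as a guess at what might fit here rather than as the recommended answer: `DataFrame.assign` returns a new DataFrame instead of writing into an existing one, so the derived column can be added in the same expression as the filter.

```python
import pandas as pd

df_a = pd.DataFrame({'col1': [1, 2, 3, 4]})

# Filter and add the derived column in one chain; assign() returns a new
# DataFrame, so nothing is written into the filtered intermediate result
df_b = df_a[df_a['col1'] > 1].assign(new_col=lambda d: 2 * d['col1'])

print(df_b['new_col'].tolist())   # [4, 6, 8]
print('new_col' in df_a.columns)  # False
```

Because no assignment targets a possibly shared object, there is no in-place write that could reach df_a, and no explicit .copy() call is needed in this pattern.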
