What is the point of views in pandas if it is undefined whether an indexing operation returns a view or a copy?

Views: 27 | Published: 2021/12/29 8:46:22 | Tags: python, pandas, views, slice


Question

I have switched from R to pandas. I routinely get a SettingWithCopyWarning when I do something like this:

import pandas as pd

df_a = pd.DataFrame({'col1': [1,2,3,4]})

# Filtering step, which may or may not return a view
df_b = df_a[df_a['col1'] > 1]

# Add a new column to df_b
df_b['new_col'] = 2 * df_b['col1']

# SettingWithCopyWarning!!

I think I understand the problem, though I'll gladly learn what I got wrong. In the given example, it is undefined whether df_b is a view on df_a or not. Thus, the effect of assigning to df_b is unclear: does it affect df_a? The problem can be solved by explicitly making a copy when filtering:

df_a = pd.DataFrame({'col1': [1,2,3,4]})

# Filtering step, definitely a copy now
df_b = df_a[df_a['col1'] > 1].copy()

# Add a new column to df_b
df_b['new_col'] = 2 * df_b['col1']

# No Warning now

I think there is something that I am missing: if we can never really be sure whether we create a view or not, what are views good for? From the pandas documentation (http://pandas-docs.github.io/pandas-docs-travis/indexing.html?highlight=view#indexing-view-versus-copy):

Outside of simple cases, it’s very hard to predict whether it [getitem] will return a view or a copy (it depends on the memory layout of the array, about which pandas makes no guarantees)

Similar warnings can be found for different indexing methods.
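To make the view-versus-copy distinction concrete, here is a minimal sketch of one way to inspect whether two DataFrames share underlying memory. It assumes NumPy-backed integer columns and uses np.shares_memory together with Series.to_numpy(), neither of which appears in the original question. It illustrates the check only; it says nothing about what pandas guarantees.

```python
import numpy as np
import pandas as pd

df_a = pd.DataFrame({'col1': [1, 2, 3, 4]})

# Boolean-mask filtering, as in the example from the question
df_b = df_a[df_a['col1'] > 1]

# np.shares_memory reports whether two arrays overlap in memory.
# to_numpy() exposes the array backing each column (assumption: plain
# NumPy-backed int64 columns, so no conversion copy is made here).
shares = np.shares_memory(df_a['col1'].to_numpy(), df_b['col1'].to_numpy())
print(shares)
```

If the printed value is False, df_b holds its own data and assigning to it cannot change df_a, regardless of whether a warning is emitted.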

I find it very cumbersome and error-prone to sprinkle .copy() calls throughout my code. Am I using the wrong style for manipulating my DataFrames? Or is the performance gain so high that it justifies the apparent awkwardness?
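For comparison, here is a sketch of one alternative style that avoids the explicit .copy() call. It relies on DataFrame.assign, which returns a new DataFrame instead of modifying its input. This is only an illustration of a different way to write the same steps, not necessarily the recommended answer, and the lambda parameter name d is arbitrary.

```python
import pandas as pd

df_a = pd.DataFrame({'col1': [1, 2, 3, 4]})

# Filter and add the column in one chained expression. assign() builds a
# new DataFrame, so nothing is written into the filtered intermediate.
df_b = (
    df_a[df_a['col1'] > 1]
    .assign(new_col=lambda d: 2 * d['col1'])
)

print(df_b)
```

Because the new column is created on the object that assign returns, df_a is left without a new_col column.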
