Force Return of "View" rather than copy in Pandas?


Question

When selecting data from a Pandas DataFrame, sometimes a view is returned and sometimes a copy. There is a logic behind this, but is there a way to force Pandas to explicitly return a view or a copy?

Recommended answer

There are two parts to your question: (1) how to make a view (see the bottom of this answer), and (2) how to make a copy.

I'll demonstrate with some example data:

import pandas as pd

df = pd.DataFrame([[1,2,3],[4,5,6],[None,10,20],[7,8,9]], columns=['x','y','z'])

# which looks like this:
     x   y   z
0  1.0   2   3
1  4.0   5   6
2  NaN  10  20
3  7.0   8   9

How to make a copy: One option is to explicitly copy your DataFrame after whatever operations you perform. For instance, let's say we are selecting the rows where x is not NaN:

df2 = df[~df['x'].isnull()]
df2 = df2.copy()

Then, if you modify values in df2, the modifications do not propagate back to the original data (df), and Pandas does not warn that "A value is trying to be set on a copy of a slice from a DataFrame":

df2['x'] *= 100

# original data unchanged
print(df)

     x   y   z
0  1.0   2   3
1  4.0   5   6
2  NaN  10  20
3  7.0   8   9

# modified data
print(df2)

       x  y  z
0  100.0  2  3
1  400.0  5  6
3  700.0  8  9

Note: explicitly making a copy costs extra memory and time, so you may take a performance hit on large DataFrames.
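One way to confirm that the copied frame really is independent is to check whether the two frames share a buffer. The following is a minimal sketch, assuming NumPy is installed alongside pandas; the variable names (shares_buffer, original_y, copied_y) are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [None, 10, 20], [7, 8, 9]],
                  columns=['x', 'y', 'z'])

# Select the rows and copy in one step, so df2 owns its own data.
df2 = df[df['x'].notnull()].copy()

# A falsy result means the two 'y' columns are backed by separate buffers.
shares_buffer = np.shares_memory(df['y'].to_numpy(), df2['y'].to_numpy())

# Changing df2 therefore leaves df alone.
df2['y'] *= 100
original_y = df['y'].tolist()
copied_y = df2['y'].tolist()
```

Chaining .copy() onto the selection does the same thing as the two-line version above, just without the intermediate assignment.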

How to ignore warnings: Alternatively, in some cases you might not care whether a view or a copy is returned, because you intend to permanently modify the data and never go back to the original. In that case you can suppress the warning and go merrily on your way. Just don't forget that you've turned it off, and that your code may or may not modify the original data, because df2 may or may not be a copy:

pd.options.mode.chained_assignment = None  # default='warn'

For more information, see the answers to "How to deal with SettingWithCopyWarning in Pandas?"
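If the warning should only be silenced for one block of code rather than for the whole session, pd.option_context can scope the setting. This is a sketch, assuming a pandas version in which the mode.chained_assignment option is available; the names before, inside and after are only there to show that the previous value is restored.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [None, 10, 20], [7, 8, 9]],
                  columns=['x', 'y', 'z'])

before = pd.get_option('mode.chained_assignment')

# The option is None only inside the with-block and is restored on exit.
with pd.option_context('mode.chained_assignment', None):
    inside = pd.get_option('mode.chained_assignment')
    df2 = df[df['x'].notnull()]
    df2['x'] *= 100  # no SettingWithCopyWarning is emitted here

after = pd.get_option('mode.chained_assignment')
```

Scoping the setting this way avoids the "forgot that it was turned off" problem mentioned above.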

How to make a view: Pandas will implicitly make views wherever and whenever possible, but you cannot request one explicitly. The key to modifying the original data reliably is to use the df.loc[row_indexer, col_indexer] indexer in a single step. Strictly speaking, this assigns through the original DataFrame rather than handing back a separate view object, but the effect is what you want from a view: the original changes. For example, to multiply the values of column y by 100 only for the rows where column x is not null, we would write:

mask = ~df['x'].isnull()
df.loc[mask, 'y'] *= 100

# original data has changed
print(df)

     x    y   z
0  1.0  200   3
1  4.0  500   6
2  NaN   10  20
3  7.0  800   9
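To see why the single .loc call matters, it helps to compare it with chained indexing on the same data. The sketch below is only an illustration: the names after_chained and after_loc are hypothetical, and the blanket warnings filter is there only because different pandas versions emit different warnings for the chained form.

```python
import warnings

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [None, 10, 20], [7, 8, 9]],
                  columns=['x', 'y', 'z'])
mask = df['x'].notnull()

# Chained indexing: df[mask] is a temporary copy, so the update is lost.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # pandas may warn here; silenced for the demo
    df[mask]['y'] *= 100
after_chained = df['y'].tolist()

# Single .loc call: the assignment goes through df itself.
df.loc[mask, 'y'] *= 100
after_loc = df['y'].tolist()
```

After the chained form, df still holds the original y values; only the .loc form changes them, and the row where x is NaN stays untouched.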
