Pandas DataFrame mutability


Question

I am pretty new to pandas DataFrames, and I would appreciate it if someone could briefly explain DataFrame mutability using the following example:

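import numpy as np
import pandas as pd

d1 = pd.date_range('1/1/2016', periods=10, freq='W')  # 10 weekly dates
col1 = ['open', 'high', 'low', 'close']
list1 = np.random.rand(10, 4)  # 10x4 array of random floats
df1 = pd.DataFrame(list1, index=d1, columns=col1)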

To my understanding, df1 is currently a reference to a DataFrame object.

If I pass df1, or a slice of df1 (e.g. df1.iloc[2:3,1:2]), as the input to a new DataFrame (e.g. df2=pd.DataFrame(df1)), is df2 a new, independent DataFrame instance, or does it still refer to df1's data, so that df1 is exposed to changes made through df2?

Any other points I should pay attention to regarding DataFrame mutability would also be very much appreciated.

Recommended answer

This:

df2 = pd.DataFrame(df1)

constructs a new DataFrame. The constructor has a copy parameter whose default is False (recent pandas versions document the default as None, which behaves like False for DataFrame input). According to the documentation, it means:

> Copy data from inputs. Only affects DataFrame / 2d ndarray input

So by default, data will be shared between df2 and df1. If you want no sharing, but rather a complete copy, do this:

df2 = pd.DataFrame(df1, copy=True)

Or, more concisely and idiomatically:

df2 = df1.copy()
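
As a quick check of the independence claim, here is a minimal sketch that modifies the copy and confirms that df1 is left untouched. The fixed seed and the names `original` and `unchanged` are additions for illustration only and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (seeded so it is reproducible).
rng = np.random.default_rng(0)
d1 = pd.date_range('1/1/2016', periods=10, freq='W')
col1 = ['open', 'high', 'low', 'close']
df1 = pd.DataFrame(rng.random((10, 4)), index=d1, columns=col1)

original = df1.iloc[0, 0]   # remember the value before touching the copy

df2 = df1.copy()            # .copy() makes a deep copy by default
df2.iloc[0, 0] = -1.0       # modify the copy only

# df1 should still hold its original value; only df2 changed.
unchanged = df1.iloc[0, 0] == original
print(unchanged, df2.iloc[0, 0])
```

The same check can be repeated with pd.DataFrame(df1, copy=True) in place of df1.copy().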

If you do this:

df2 = df1.iloc[2:3,1:2].copy()

You will again get an independent copy. But if you do this:

df2 = pd.DataFrame(df1.iloc[2:3,1:2])

It will probably share the data, but this style is quite unclear if you intend to modify the DataFrame afterwards, so I suggest not writing such code. Instead, if you do not want a copy, just write:

df2 = df1.iloc[2:3,1:2]

In summary: if you want a reference to existing data, do not call pd.DataFrame() or any other method at all. If you want an independent copy, call .copy().
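
If it is unclear whether two frames share data, one way to inspect this is with NumPy's np.shares_memory. The sketch below is only an illustration: `shares_data` is a hypothetical helper, not a pandas function, and it is only meaningful for single-dtype frames such as df1, where to_numpy() can return a view rather than a fresh array. What it reports for the plain slice may depend on the pandas version and settings (for example, whether Copy-on-Write is enabled), so no expected output is shown for that case. The explicit .copy() should report no sharing.

```python
import numpy as np
import pandas as pd

def shares_data(a: pd.DataFrame, b: pd.DataFrame) -> bool:
    """Hypothetical helper: True if the frames' values overlap in memory.

    Assumes single-dtype frames; for mixed dtypes to_numpy() builds a new
    array, so the result would say nothing about the underlying data.
    """
    return bool(np.shares_memory(a.to_numpy(), b.to_numpy()))

# Same frame as in the question (seed added for reproducibility).
rng = np.random.default_rng(0)
d1 = pd.date_range('1/1/2016', periods=10, freq='W')
col1 = ['open', 'high', 'low', 'close']
df1 = pd.DataFrame(rng.random((10, 4)), index=d1, columns=col1)

view = df1.iloc[2:3, 1:2]            # no explicit copy requested
copied = df1.iloc[2:3, 1:2].copy()   # explicit independent copy

print("slice shares data with df1:", shares_data(df1, view))
print("copy shares data with df1: ", shares_data(df1, copied))
```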
