Modifying DataFrames inside a list is not working
Question
I have two DataFrames and I want to apply the same list of cleaning operations to both.
I realize I could merge them into one and do everything in one pass, but I am still curious why this method is not working.
import numpy as np
import pandas as pd

test_1 = pd.DataFrame({
    "A": [1, 8, 5, 6, 0],
    "B": [15, 49, 34, 44, 63]
})
test_2 = pd.DataFrame({
    "A": [np.nan, 3, 6, 4, 9, 0],
    "B": [-100, 100, 200, 300, 400, 500]
})
Let's assume I only want to keep the rows without NaNs in column A. I tried:
for df in [test_1, test_2]:
    df = df[pd.notnull(df["A"])]
but test_2 is left untouched. On the other hand, if I do:
test_2 = test_2[pd.notnull(test_2["A"])]
Now the first row is gone.
Recommended answer
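All these slicing/indexing operations create new objects (views or copies of the original DataFrame), and the loop then rebinds the name df to that new object. Rebinding the loop variable changes neither the list element nor the name test_2, so the originals are not touched at all.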
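The rebinding can be checked directly. The following is a minimal sketch that reuses test_2 from the question; the names frames, filtered, and same_object are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

test_2 = pd.DataFrame({
    "A": [np.nan, 3, 6, 4, 9, 0],
    "B": [-100, 100, 200, 300, 400, 500],
})

frames = [test_2]  # hypothetical list name, for illustration
for df in frames:
    filtered = df[pd.notnull(df["A"])]  # boolean indexing builds a new DataFrame
    same_object = filtered is test_2    # False: not the original object
    df = filtered                       # rebinds the loop variable only

rows_in_original = len(test_2)      # the original still has all 6 rows
rows_in_list_item = len(frames[0])  # the list still holds the original
rows_in_filtered = len(df)          # only the loop variable sees 5 rows
```

After the loop, both test_2 and frames[0] still refer to the unfiltered frame; only the loop variable df points at the filtered result.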
Option 1: dropna(..., inplace=True)
Try an in-place dropna call, which should modify the original object in place:
df_list = [test_1, test_2]
for df in df_list:
    df.dropna(subset=['A'], inplace=True)
Note that this is one of the few cases where I would recommend an in-place modification, specifically because of this use case.
Option 2: enumerate with reassignment
Alternatively, you can reassign into the list:
for i, df in enumerate(df_list):
    df_list[i] = df.dropna(subset=['A'])  # or: df_list[i] = df[df.A.notnull()]
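One caveat with the reassignment approach: the loop replaces the entries of the list, but the names test_1 and test_2 still refer to the unfiltered frames. The sketch below, which reuses the question's data, shows one way to rebind them by unpacking the list. The variables rows_before_unpack and rows_after_unpack are added only for illustration.

```python
import numpy as np
import pandas as pd

test_1 = pd.DataFrame({"A": [1, 8, 5, 6, 0], "B": [15, 49, 34, 44, 63]})
test_2 = pd.DataFrame({
    "A": [np.nan, 3, 6, 4, 9, 0],
    "B": [-100, 100, 200, 300, 400, 500],
})

df_list = [test_1, test_2]
for i, df in enumerate(df_list):
    df_list[i] = df.dropna(subset=["A"])  # store the cleaned frame in the list

rows_before_unpack = len(test_2)  # the name test_2 still points at the original
test_1, test_2 = df_list          # rebind the original names to the cleaned frames
rows_after_unpack = len(test_2)
```

If the rest of the code only works through df_list, the unpacking step is unnecessary.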