Using a dictionary to replace column values at given index labels in a pandas DataFrame
Problem description
Consider the following DataFrame:
import numpy as np
import pandas as pd
df_test = pd.DataFrame({'a': [1, 2, 8], 'b': [np.nan, np.nan, 5], 'c': [np.nan, np.nan, 4]})
df_test.index = ['one', 'two', 'three']
which gives
       a    b    c
one    1  NaN  NaN
two    2  NaN  NaN
three  8    5    4
I have a dictionary of row replacements for columns b and c. For example:
{ 'one': [3.1, 2.2], 'two' : [8.8, 4.4] }
where 3.1 and 8.8 replace the values in column b and 2.2 and 4.4 replace the values in column c, so that the result is
       a    b    c
one    1  3.1  2.2
two    2  8.8  4.4
three  8    5    4
I know how to make these changes with a for loop:
index_list = ['one', 'two']
value_list_b = [3.1, 8.8]
value_list_c = [2.2, 4.4]
for i in range(len(index_list)):
    # .loc is used here because .ix was removed in pandas 1.0
    df_test.loc[df_test.index == index_list[i], 'b'] = value_list_b[i]
    df_test.loc[df_test.index == index_list[i], 'c'] = value_list_c[i]
but I'm sure there's a nicer and quicker way to use the dictionary!
I guess it can be done with the DataFrame.replace method, but I couldn't figure it out.
Thanks for your help,
cd
Recommended answer
You are looking for pandas.DataFrame.update. The only twist in your case is that you specify the updates as a dictionary of rows, whereas a DataFrame is usually built from a dictionary of columns. The orient keyword can handle that.
In [24]: import pandas as pd

In [25]: df_test
Out[25]:
       a    b    c
one    1  NaN  NaN
two    2  NaN  NaN
three  8    5    4

In [26]: row_replacements = { 'one': [3.1, 2.2], 'two' : [8.8, 4.4] }

In [27]: df_update = pd.DataFrame.from_dict(row_replacements, orient='index')

In [28]: df_update.columns = ['b', 'c']

In [29]: df_update
Out[29]:
       b    c
one  3.1  2.2
two  8.8  4.4

In [30]: df_test.update(df_update)

In [31]: df_test
Out[31]:
       a    b    c
one    1  3.1  2.2
two    2  8.8  4.4
three  8  5.0  4.0
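A few properties of update are easy to miss in the session above: it modifies the DataFrame in place and returns None, it aligns on both index and column labels, and NaN entries in the updating frame are skipped by default instead of overwriting existing values. The following is a minimal sketch of that behaviour; the frame named patch and its deliberately missing value are made up for illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 8], "b": [np.nan, np.nan, 5], "c": [np.nan, np.nan, 4]},
    index=["one", "two", "three"],
)

# Hypothetical update frame: the 'two'/'c' cell is NaN on purpose,
# and there is no row for 'three' and no column 'a'.
patch = pd.DataFrame({"b": [3.1, 8.8], "c": [2.2, np.nan]}, index=["one", "two"])

# update works in place, so there is nothing useful to assign from it.
result = df.update(patch)

print(result)  # None
print(df)      # 'two'/'c' is still NaN; row 'three' and column 'a' are unchanged
```

Because of the in-place behaviour, writing df = df.update(patch) would replace the DataFrame with None.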
pandas.DataFrame.from_dict is a specific DataFrame constructor that gives us the orient keyword, which is not available if you just say DataFrame(...). For reasons I don't know, from_dict did not let us pass the column names ['b', 'c'] at the time of writing, so I specified them in a separate step.
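In more recent pandas releases (0.23 and later, as far as I know), from_dict accepts a columns argument when orient='index', so the renaming step can be folded into the constructor call. The sketch below assumes such a version is installed and otherwise reuses the names from the session above.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame(
    {"a": [1, 2, 8], "b": [np.nan, np.nan, 5], "c": [np.nan, np.nan, 4]},
    index=["one", "two", "three"],
)
row_replacements = {"one": [3.1, 2.2], "two": [8.8, 4.4]}

# Assumption: pandas >= 0.23, where `columns` is allowed together with orient="index".
df_update = pd.DataFrame.from_dict(
    row_replacements, orient="index", columns=["b", "c"]
)

# Rows and columns are matched by label; unmatched cells in df_test stay as they are.
df_test.update(df_update)
print(df_test)
```

On older versions the two-step form shown in the session (build the frame, then assign df_update.columns) remains the way to go.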