Adding values in a new column based on indexes with pandas in Python
Question
I'm just getting into pandas and I am trying to add a new column to an existing dataframe.
I have two dataframes, where the index of one dataframe links to a column in the other. Where these values are equal, I need to put the value of another column from the source dataframe into a new column of the destination dataframe.
The code section below illustrates what I mean. The commented part is the output I need.
I think I need to use .loc[].
Another, minor, question: is it bad practice to have a non-unique index?
import pandas as pd
d = {'key':['a', 'b', 'c'],
'bar':[1, 2, 3]}
d2 = {'key':['a', 'a', 'b'],
'other_data':['10', '20', '30']}
df = pd.DataFrame(d)
df2 = pd.DataFrame(data = d2)
df2 = df2.set_index('key')
print(df2)
## Desired output:
##      other_data  new_col
## key
## a            10        1
## a            20        1
## b            30        2
Recommended answer
Use rename on the index, passing a Series as the mapping:
df2['new'] = df2.rename(index=df.set_index('key')['bar']).index
print (df2)
    other_data  new
key
a           10    1
a           20    1
b           30    2
Or use map:
df2['new'] = df2.index.to_series().map(df.set_index('key')['bar'])
print (df2)
    other_data  new
key
a           10    1
a           20    1
b           30    2
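As a self-contained sketch of the map approach, the snippet below rebuilds the question's data and adds one extra row. The key 'z' and its value '40' are hypothetical additions (not in the original question), included only to show what happens when a key in df2 has no match in df: the mapped value comes back as NaN, so the new column becomes float.

```python
import pandas as pd

df = pd.DataFrame({'key': ['a', 'b', 'c'], 'bar': [1, 2, 3]})

# 'z' / '40' is a hypothetical extra row with no matching key in df
df2 = pd.DataFrame({'key': ['a', 'a', 'b', 'z'],
                    'other_data': ['10', '20', '30', '40']}).set_index('key')

# Lookup Series: index holds the keys, values hold 'bar'
lookup = df.set_index('key')['bar']

# Map each index label of df2 through the lookup; unmatched labels give NaN
df2['new'] = df2.index.to_series().map(lookup)
print(df2)
```

If unmatched keys are possible in real data, the NaN values may need handling afterwards, for example with fillna or dropna, depending on what the missing match should mean.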
For better performance, it is best to avoid duplicates in the index. Also, some functions, such as reindex, fail on an index that contains duplicates.
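A minimal sketch of the point about duplicates, using the question's df2: it checks the index with is_unique and shows reindex raising a ValueError. The reset_index step at the end is only one possible way to get back to a unique index, an assumption rather than something the answer prescribes.

```python
import pandas as pd

df2 = pd.DataFrame({'other_data': ['10', '20', '30']},
                   index=pd.Index(['a', 'a', 'b'], name='key'))

# The label 'a' appears twice, so the index is not unique
print(df2.index.is_unique)

# reindex cannot resolve duplicate labels and raises ValueError
try:
    df2.reindex(['a', 'b'])
    reindex_failed = False
except ValueError as exc:
    reindex_failed = True
    print('reindex failed:', exc)

# One option (an assumption): keep 'key' as a regular column,
# which leaves a unique default integer index
unique_df = df2.reset_index()
print(unique_df.index.is_unique)
```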