Adding values in a new column based on indexes with pandas in Python

Question

I'm just getting into pandas and I am trying to add a new column to an existing dataframe.

I have two dataframes, where the index of one dataframe links to a column in the other. Where these values are equal, I need to put the value of another column from the source dataframe into a new column of the destination dataframe.

The code section below illustrates what I mean. The commented part is what I need as an output.

I guess I need the .loc[] function.

Another, minor, question: is it bad practice to have a non-unique index?

import pandas as pd

d = {'key':['a',  'b', 'c'], 
     'bar':[1, 2, 3]}

d2 = {'key':['a', 'a', 'b'],
      'other_data':['10', '20', '30']}

df = pd.DataFrame(d)
df2 = pd.DataFrame(data = d2)
df2 = df2.set_index('key')

print(df2)

##    other_data  new_col
##key           
##a            10   1
##a            20   1
##b            30   2

Recommended answer

Use rename on the index, passing a Series as the mapping:

df2['new'] = df2.rename(index=df.set_index('key')['bar']).index
print (df2)

    other_data  new
key                
a           10    1
a           20    1
b           30    2

Or use map:

df2['new'] = df2.index.to_series().map(df.set_index('key')['bar'])
print (df2)

    other_data  new
key                
a           10    1
a           20    1
b           30    2
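
As a rough, self-contained sketch of the map approach, here is the question's data run end to end, plus one extra case that the original does not cover: a key in the destination that has no match in the source. The names lookup and df3, and the key 'z', are made up for this illustration. With map, an unmatched key should come back as NaN rather than raise.

```python
import pandas as pd

# Data from the question.
df = pd.DataFrame({'key': ['a', 'b', 'c'], 'bar': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['a', 'a', 'b'],
                    'other_data': ['10', '20', '30']}).set_index('key')

# lookup (hypothetical name): a Series that maps each key to its bar value.
lookup = df.set_index('key')['bar']

# Map every index label of df2 through the lookup Series.
df2['new_col'] = df2.index.to_series().map(lookup)
print(df2)

# Assumed extra case: 'z' does not exist in df, so map yields NaN for it.
df3 = pd.DataFrame({'key': ['a', 'z'],
                    'other_data': ['40', '50']}).set_index('key')
df3['new_col'] = df3.index.to_series().map(lookup)
print(df3)
```

If a missing value is not acceptable, one option is to check df3['new_col'].isna() afterwards and decide how to handle those rows.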

For better performance, it is best to avoid duplicates in the index. Also, some functions, such as reindex, fail on an index that contains duplicates.
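
To make the point about duplicates concrete, here is a small sketch using the question's df2. It checks Index.is_unique and then tries a reindex with a different set of labels, which should raise ValueError on a duplicated index. The flag reindex_failed and the target labels ['a', 'b', 'c'] are chosen only for this example, and the exact error text varies between pandas versions, so it is printed rather than assumed.

```python
import pandas as pd

df2 = pd.DataFrame({'key': ['a', 'a', 'b'],
                    'other_data': ['10', '20', '30']}).set_index('key')

# The label 'a' appears twice, so the index is not unique.
print(df2.index.is_unique)

# Reindexing to new labels is expected to fail on a duplicated index.
try:
    df2.reindex(['a', 'b', 'c'])
    reindex_failed = False
except ValueError as exc:
    reindex_failed = True
    print(exc)

# Moving key back into a column restores a unique default index.
flat = df2.reset_index()
print(flat.index.is_unique)
```

Keeping key as an ordinary column and leaving the default integer index in place is one way to sidestep the problem while still allowing the lookups shown in the answer.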
