Compare pandas DataFrames and add a column


Question

I have two DataFrames, as shown below:

df1     df2 
A       A   C
A1      A1  C1
A2      A2  C2
A3      A3  C3
A1      A4  C4
A2          
A3          
A4          

Each value in column 'A' is mapped to a value in column 'C' of df2. I want to add a new column 'B' to df1 whose values are taken from column 'C' of df2.

The final df1 should look like this:

df1
A   B
A1  C1
A2  C2
A3  C3
A1  C1
A2  C2
A3  C3
A4  C4

I can loop over df2 and add each value to df1, but that is time-consuming because the data is huge.

    for index, row in df2.iterrows():
        df1.loc[df1.A.isin([row['A']]), 'B'] = row['C']

Can someone help me understand how to solve this without looping over df2?

Thanks

Recommended answer

IIUC, you can just merge and rename the column:

df1.merge(df2, on='A', how='left').rename(columns={'C':'B'})

In [103]:
import pandas as pd

df1 = pd.DataFrame({'A':['A1','A2','A3','A1','A2','A3','A4']})
df2 = pd.DataFrame({'A':['A1','A2','A3','A4'], 'C':['C1','C2','C3','C4']})
# A left merge keeps every row of df1 in its original order;
# df2's column 'C' is then renamed to 'B'.
merged = df1.merge(df2, on='A', how='left').rename(columns={'C':'B'})
merged

Out[103]:
    A   B
0  A1  C1
1  A2  C2
2  A3  C3
3  A1  C1
4  A2  C2
5  A3  C3
6  A4  C4
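
Note that `merge` returns a new DataFrame and leaves df1 unchanged. If the goal is to add column 'B' to df1 itself, one possible alternative (not part of the answer above) is to turn df2 into a lookup Series and use `Series.map`. This is a minimal sketch that assumes every value in df2's column 'A' is unique; the name `lookup` is chosen here for illustration only.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A1', 'A2', 'A3', 'A1', 'A2', 'A3', 'A4']})
df2 = pd.DataFrame({'A': ['A1', 'A2', 'A3', 'A4'],
                    'C': ['C1', 'C2', 'C3', 'C4']})

# Build a Series indexed by df2['A'] whose values are df2['C'].
# Assumption: the keys in df2['A'] are unique.
lookup = df2.set_index('A')['C']

# Map each value of df1['A'] through the lookup; keys missing from
# df2 would produce NaN in the new column.
df1['B'] = df1['A'].map(lookup)

print(df1)
```

Like the merge, this avoids the Python-level loop over df2. Which of the two is faster on a large dataset is worth measuring on the real data rather than assuming.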
