映射值和基于多个列值合并数据框 [英] Mapping values and merging dataframes based on multiple column values
问题描述
我有以下数据框:
df1:带有患者关键提示的数据框
df1: dataframe with patient critical notes
AREA DATE_TIME CRITICAL ISSUE NOTES
0013 11/6/2017 2:25:00 P.M Nurse attended to the patient
1121 10/23/2017 6:43:00 A.M Completed an ER
1121 10/2/2017 9:30:00 P.M Admitted
df2:患者其他详细信息
df2: Patient other details
ZIP TIME_NOTED NAME OCCUPIED STATE
4568 10/1/2017 10:04:00 A.M Chris Y NORMAL
1121 10/23/2017 6:43:00 A.M Nancy Y CRITICAL
1121 10/2/2017 9:30:00 P.M Derek N CRITICAL
我必须使用df1中的DATE_TIME和AREA代码映射df2中的记录,并且还要在两个数据帧中保留所有其他列.我尝试合并多个列,但没有按预期工作.
I have to map the records in df2 using DATE_TIME and AREA code from df1 and also retain all other columns in both dataframes. I tried merging on multiple columns but didnt work as expected.
new_df = pd.merge(df1, df2, how='right', left_on=['Date_Time','AREA'], right_on = ['ZIP','TIME_NOTED'])
推荐答案
如果您对left/right_on(区域/zip,然后注明日期时间/时间)以相同的顺序排列列,则应该可以使用.我还将合并更改为内部,因此您只需要获得记录的邮政编码/区域和日期/时间都相同的记录即可.
If you put the columns in the same order for both left/right_on (area/zip then date time/time noted) it should work. I also changed the merge to an inner, so you just get records with the same zip/area and date time/time noted.
new_df = pd.merge(df1, df2, how='inner', left_on = ['AREA','DATE_TIME'], right_on = ['ZIP','TIME_NOTED'])
另一种可能的解决方案是创建一个"ID"列并将其合并.
Another potential solution would be creating an "ID" column and merging on that.
df1['ID'] = df1['AREA'].astype(str) + '_' + df1['DATE_TIME'].astype(str)
df2['ID'] = df2['ZIP'].astype(str) + '_' + df2['TIME_NOTED'].astype(str)
现在合并ID
new_df = pd.merge(df1, df2, how = 'inner',left_on = ['ID'], right_on = ['ID'])
这应该产生相同的表(带有ID列).
This should yield the same table (with the addition of an ID column).
这篇关于映射值和基于多个列值合并数据框的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!