Combine/merge two CSVs using pandas/Python

Problem description

I have two CSV files that I want to merge as a left join. The key column is "id". Both files have a non-key column named "result", and wherever the second CSV has a value in "result", it should override the value from the first. How can I achieve this with pandas or any other scripting language? My expected final output is shown below.

input.csv:

id,scenario,data1,data2,result
1,s1,300,400,"{s1,not added}"
2,s2,500,101,"{s2 added}"
3,s3,600,202,

output.csv:

id,result
1,"{s1,added}"
3,"{s3,added}"

Expected output

final_output.csv:

id,scenario,data1,data2,result
1,s1,300,400,"{s1,added}"
2,s2,500,101,"{s2 added}"
3,s3,600,202,"{s3,added}"

Current code:

import pandas as pd

a = pd.read_csv("input.csv")
b = pd.read_csv("output.csv")
merged = a.merge(b, on='id', how='left')
merged.to_csv("final_output.csv", index=False)

Question:

With this code, the result column appears twice in the output. I want it only once, with the value from the second CSV overriding the first whenever one exists. How do I get a single result column?
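For reference, the duplicated column can be reproduced in memory with the sample data above. This is only a sketch: the io.StringIO wrappers and the INPUT_CSV / OUTPUT_CSV names are stand-ins for the real files, not part of the original code. Because both frames have a non-key column called "result", merge keeps both copies and tells them apart with its default suffixes, "_x" for the left frame and "_y" for the right.

```python
import io

import pandas as pd

# In-memory stand-ins for input.csv and output.csv (hypothetical names).
INPUT_CSV = '''id,scenario,data1,data2,result
1,s1,300,400,"{s1,not added}"
2,s2,500,101,"{s2 added}"
3,s3,600,202,
'''
OUTPUT_CSV = '''id,result
1,"{s1,added}"
3,"{s3,added}"
'''

a = pd.read_csv(io.StringIO(INPUT_CSV))
b = pd.read_csv(io.StringIO(OUTPUT_CSV))

# Left join on the key; the shared "result" column is kept from both sides.
merged = a.merge(b, on="id", how="left")

# Both copies survive, distinguished by the default "_x"/"_y" suffixes.
print(list(merged.columns))
```

Note that id 2 has no row in the second CSV, so its "result_y" is missing (NaN) after the left join.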

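Recommended answer

Do the left join on "id", then collapse the two resulting "result" columns into one, taking the value from the second CSV where it exists and falling back to the first otherwise. Note that read_csv loads empty cells as NaN, not as '', so the check has to be for missing values:

import pandas as pd

a = pd.read_csv("input.csv")
b = pd.read_csv("output.csv")

# Left join on the key; the shared "result" column comes back as
# result_x (from input.csv) and result_y (from output.csv).
c = pd.merge(a, b, on='id', how='left')

# Prefer result_y; where it is NaN, keep result_x.
c['result'] = c['result_y'].combine_first(c['result_x'])
c = c.drop(columns=['result_x', 'result_y'])

c.to_csv("final_output.csv", index=False)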
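Another way to get the same result without creating the suffixed columns at all is DataFrame.update, which overwrites values in place with the non-missing values of another frame, aligned on the index. The sketch below is self-contained and makes some assumptions: the two CSVs are inlined through io.StringIO instead of being read from disk, the INPUT_CSV / OUTPUT_CSV / final names are illustrative, and "id" is unique in both files.

```python
import io

import pandas as pd

# In-memory stand-ins for input.csv and output.csv (hypothetical names).
INPUT_CSV = '''id,scenario,data1,data2,result
1,s1,300,400,"{s1,not added}"
2,s2,500,101,"{s2 added}"
3,s3,600,202,
'''
OUTPUT_CSV = '''id,result
1,"{s1,added}"
3,"{s3,added}"
'''

# Index both frames by the key so that update() aligns rows on "id".
a = pd.read_csv(io.StringIO(INPUT_CSV)).set_index("id")
b = pd.read_csv(io.StringIO(OUTPUT_CSV)).set_index("id")

# Overwrite a's values with b's non-missing values, in place.
# Only "result" is affected, because it is the only column b has.
a.update(b)

# Restore "id" as an ordinary column; the original column order is kept.
final = a.reset_index()
print(final)
```

Rows of the first CSV with no match in the second (id 2 here) are left untouched, which matches the left-join behaviour asked for. To write the file, call final.to_csv("final_output.csv", index=False).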
