Compare two csv files with python pandas


Question

I have two CSV files, each consisting of two columns.

The first column has the product id, and the second has the serial number.

I need to look up all serial numbers from the first CSV and find matches in the second CSV. The result report should have the serial number that matched and the corresponding product ids from each CSV, each in a separate column. I tried to modify the code below, with no luck.

How would you approach this?

import pandas as pd

# Read each CSV without a header row, take only the first column, and build a set from it.
A = set(pd.read_csv("c1.csv", index_col=False, header=None)[0])
B = set(pd.read_csv("c2.csv", index_col=False, header=None)[0])  # same for the second file
print(A - B)  # values that appear only in A
print(B - A)  # values that appear only in B

Recommended answer

I think you need merge:

import pandas as pd

A = pd.DataFrame({'product id':   [1455,5452,3775],
                  'serial number':[44,55,66]})

print(A)

B = pd.DataFrame({'product id':   [7000,2000,1000],
                  'serial number':[44,55,77]})

print(B)

# Inner join on the shared column keeps only serial numbers found in both frames.
print(pd.merge(A, B, on='serial number'))
   product id_x  serial number  product id_y
0          1455             44          7000
1          5452             55          2000
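To apply the same merge to the actual files, something like the following sketch might work. It assumes the files have no header row (as the `header=None` in the question suggests) and that the product id comes first and the serial number second. The `io.StringIO` objects are stand-ins for `c1.csv` and `c2.csv` so the example runs on its own, and the `_c1`/`_c2` suffixes are just illustrative labels.

```python
import io
import pandas as pd

# Stand-ins for c1.csv and c2.csv (assumed layout: no header row,
# product id in the first column, serial number in the second).
c1 = io.StringIO("1455,44\n5452,55\n3775,66\n")
c2 = io.StringIO("7000,44\n2000,55\n1000,77\n")

cols = ["product id", "serial number"]
A = pd.read_csv(c1, header=None, names=cols)
B = pd.read_csv(c2, header=None, names=cols)

# The default inner join keeps only serial numbers present in both files;
# suffixes label which file each product id column came from.
matches = pd.merge(A, B, on="serial number", suffixes=("_c1", "_c2"))
print(matches)
```

With real files, the `io.StringIO` objects would be replaced by the file paths, and `matches.to_csv("report.csv", index=False)` would write the report (the output file name here is hypothetical).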
