Python: Joining CSV files where the key is the first column value


Question

I am trying to join two CSV files where the key is the value in the first column. There is no header row.
The files have different numbers of lines.
The line order of file a must be preserved.

File a:

john,red,34
andrew,green,18
tonny,black,50
jack,yellow,27
phill,orange,45
kurt,blue,29
mike,pink,61

File b:

tonny,driver,new york
phill,scientist,boston

Desired result:

john,red,34
andrew,green,18
tonny,black,50,driver,new york
jack,yellow,27
phill,orange,45,scientist,boston
kurt,blue,29
mike,pink,61

I have gone through all the related threads, and I am sure some of you will mark this question as a duplicate, but I simply have not found a solution yet.

I tried a dictionary-based solution, but that approach does not satisfy the requirement to preserve the line order of file a.

import csv
from collections import defaultdict

with open('a.csv') as f:
    r = csv.reader(f, delimiter=',')
    dict1 = {}
    for row in r:
        dict1.update({row[0]: row[1:]})

with open('b.csv') as f:
    r = csv.reader(f, delimiter=',')
    dict2 = {}
    for row in r:
        dict2.update({row[0]: row[1:]})

result = defaultdict(list)

for d in (dict1, dict2):
    for key, value in d.iteritems():
        result[key].append(value)

I would also like to avoid loading these CSV files into a database such as SQLite, or using the pandas module.

Thanks in advance.

Recommended answer

Something like this:

import csv
from collections import OrderedDict

# Python 3: open CSV files in text mode with newline='' (not 'rb'/'wb').
with open('b.csv', newline='') as f:
    r = csv.reader(f)
    dict2 = {row[0]: row[1:] for row in r}

with open('a.csv', newline='') as f:
    r = csv.reader(f)
    # OrderedDict keeps the keys in the order they appear in file a.
    dict1 = OrderedDict((row[0], row[1:]) for row in r)

# Insert file a's keys first, then append the matching fields from file b.
result = OrderedDict()
for d in (dict1, dict2):
    for key, value in d.items():
        result.setdefault(key, []).extend(value)

with open('ab_combined.csv', 'w', newline='') as f:
    w = csv.writer(f)
    for key, value in result.items():
        w.writerow([key] + value)

produces

john,red,34
andrew,green,18
tonny,black,50,driver,new york
jack,yellow,27
phill,orange,45,scientist,boston
kurt,blue,29
mike,pink,61

(Note that I didn't bother protecting against the case where dict2 has a key which isn't in dict1; that's easily added if you like.)
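One way to add that protection is to walk file a row by row and only look up matching keys in file b, so keys that exist only in b never reach the output. The following is a minimal sketch for Python 3, not the answer author's code. The function names `join_rows` and `join_csv_files` are hypothetical, and it assumes that when a key appears more than once in file b, the last occurrence should win.

```python
import csv


def join_rows(rows_a, rows_b):
    """Append b's fields to each row of a that shares the same first column.

    Rows of a keep their original order; keys found only in b are dropped.
    """
    # Build a lookup from key (first column) to the remaining fields of b.
    # Assumption: if a key repeats in b, the last occurrence wins.
    lookup = {row[0]: row[1:] for row in rows_b if row}
    # Walk a in order, so its line order is preserved by construction.
    return [row + lookup.get(row[0], []) for row in rows_a if row]


def join_csv_files(path_a, path_b, path_out):
    """Hypothetical helper: read two CSV files and write the joined result."""
    with open(path_b, newline='') as fb:
        rows_b = list(csv.reader(fb))
    with open(path_a, newline='') as fa, open(path_out, 'w', newline='') as fo:
        csv.writer(fo).writerows(join_rows(csv.reader(fa), rows_b))
```

Calling `join_csv_files('a.csv', 'b.csv', 'ab_combined.csv')` would write the joined file. Because the output is driven by the rows of file a rather than by a merged dictionary, duplicate keys in file a are also kept as separate lines.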
