Merging two CSV files by a common column in Python


Question

I am trying to merge two CSV files that share a common ID column and write the merged result to a new file. I tried the following, but it raises an error:

import csv
from collections import OrderedDict

filenames = "stops.csv", "stops2.csv"
data = OrderedDict()
fieldnames = []
for filename in filenames:
    with open(filename, "rb") as fp:  # python 2
        reader = csv.DictReader(fp)
        fieldnames.extend(reader.fieldnames)
        for row in reader:
            data.setdefault(row["stop_id"], {}).update(row)

fieldnames = list(OrderedDict.fromkeys(fieldnames))
with open("merged.csv", "wb") as fp:
    writer = csv.writer(fp)
    writer.writerow(fieldnames)
    for row in data.itervalues():
        writer.writerow([row.get(field, '') for field in fieldnames])

Both files have a "stop_id" column, but I still get this error: KeyError: 'stop_id'

Any help would be much appreciated.

Thanks.

Recommended answer

Thanks, Shijo.

This is what worked for me in the end: merging on the first column of each CSV.

import csv
from collections import OrderedDict

with open('stops.csv', 'rb') as f:
    r = csv.reader(f)
    dict2 = {row[0]: row[1:] for row in r}

with open('stops2.csv', 'rb') as f:
    r = csv.reader(f)
    dict1 = OrderedDict((row[0], row[1:]) for row in r)

result = OrderedDict()
for d in (dict1, dict2):
    for key, value in d.iteritems():
        result.setdefault(key, []).extend(value)

with open('ab_combined.csv', 'wb') as f:
    w = csv.writer(f)
    for key, value in result.iteritems():
        w.writerow([key] + value)
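
For anyone on Python 3, here is a rough sketch of the same first-column merge. The function name merge_by_first_column and the utf-8-sig encoding are my own assumptions, not part of the original code; utf-8-sig simply drops a leading byte-order mark if the file happens to have one.

```python
import csv
from collections import OrderedDict


def merge_by_first_column(paths, out_path):
    """Merge CSV files row by row, keyed on the first column of each file.

    Hypothetical helper: the name and signature are illustrative only.
    """
    merged = OrderedDict()  # key -> combined list of the remaining columns
    for path in paths:
        # newline="" is what the csv module expects for files in Python 3.
        # encoding="utf-8-sig" is an assumption about the input files.
        with open(path, newline="", encoding="utf-8-sig") as f:
            for row in csv.reader(f):
                if not row:  # skip blank lines
                    continue
                merged.setdefault(row[0], []).extend(row[1:])

    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for key, values in merged.items():
            writer.writerow([key] + values)
    return merged
```

It could be called as merge_by_first_column(["stops2.csv", "stops.csv"], "ab_combined.csv") to keep the column order of the code above. Because rows are matched by position rather than by header name, the lookup no longer depends on the header text being exactly "stop_id". As in the code above, the header row is merged like any other row, and a key that is missing from one file gets a shorter row, so its columns will not line up with the header.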
