How to merge two files in Python

Question

I have two tab-delimited CSV files (with headers) that I need to merge in Python.

Also, in the merged file I want to add a column at the end that identifies the source file: although the two files share the same format, they hold different data that I need to separate later on. So I want to add a column called 'source' to each line of output, which is 0 for file1 and 1 for file2.

I have gotten as far as using the csv module, but writerow adds an additional newline character between each line it writes, and this code doesn't write anything from file2. What am I doing wrong here? Also, how do I add the extra column 'source' to the line object?

import os, csv

path1 = os.path.abspath("../data/file1.txt")
path2 = os.path.abspath("../data/file2.txt")
merged_path = os.path.abspath('../data/output.txt')

# merge the two files for further processing
merged_file = csv.writer(open(merged_path, 'a'), delimiter = '\t')

#file1
fg = csv.reader(open(path1, 'r'), delimiter = '\t')

for line in fg:
    if line[7] != '\N':
        merged_file.writerow(line) 

#file2
bg = csv.reader(open(path2, 'r'), delimiter = '\t')

for line in bg:
    if line[16] != '\N':
        merged_file.writerow(line) 


Recommended answer

I prefer to use csv.DictWriter for this. Also, your code doesn't work because, in Python 2, the csv library requires files to be opened in binary mode.

import os, csv

path1 = os.path.abspath("../data/file1.txt")
path2 = os.path.abspath("../data/file2.txt")
merged_path = os.path.abspath('../data/output.txt')

#file1
fg = csv.DictReader(open(path1, 'rb'), delimiter = '\t')

fieldnames = fg.fieldnames
fieldnames.append('source')
# merge the two files for further processing
merged_file = csv.DictWriter(open(merged_path, 'ab'), delimiter = '\t', fieldnames=fieldnames)
merged_file.writeheader()

for row in fg:
    row['source'] = os.path.basename(path1)
    merged_file.writerow(row)

#file2
bg = csv.DictReader(open(path2, 'rb'), delimiter = '\t')

for row in bg:
    row['source'] = os.path.basename(path2)  # tag file2 rows with file2's name, not file1's
    merged_file.writerow(row)
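
The code above targets Python 2. As a rough sketch of the same idea on Python 3, the csv documentation recommends opening files in text mode with newline='' rather than in binary mode, which is also what avoids the extra blank line between rows. The helper name merge_with_source is hypothetical, and the sketch assumes every input file is non-empty and has the same header. It writes the 0/1 'source' value asked for in the question instead of the file name, and it opens the output with 'w' rather than appending, so the header is written exactly once.

```python
import csv


def merge_with_source(paths, merged_path, delimiter="\t"):
    """Concatenate delimited files that share a header, tagging each row.

    Hypothetical helper: the 'source' value is the file's position in
    `paths` (0 for the first file, 1 for the second, and so on).
    Assumes every file is non-empty and has an identical header row.
    """
    writer = None
    # newline='' lets the csv module manage line endings itself.
    with open(merged_path, "w", newline="") as out:
        for source, path in enumerate(paths):
            with open(path, "r", newline="") as f:
                reader = csv.DictReader(f, delimiter=delimiter)
                if writer is None:
                    # Copy the header so the reader's own list is not mutated.
                    fieldnames = list(reader.fieldnames) + ["source"]
                    writer = csv.DictWriter(
                        out, fieldnames=fieldnames, delimiter=delimiter
                    )
                    writer.writeheader()
                for row in reader:
                    row["source"] = source
                    writer.writerow(row)
```

With the paths from the question, this would be called as merge_with_source([path1, path2], merged_path). Filtering rows (for example, skipping those whose column holds '\N') could be added inside the inner loop before writerow.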
