Appending two CSV files column-wise


Question

Suppose I have two CSV files called A and B that I am processing in Python.

The head of A looks like:

 headerNameA1,headerNameA2
 1.12412424,1
 1,1
 1,1
 1,1

The head of B looks like:

 headerNameB1,headerNameB2
 1,1
 1,1
 1,1
 1,1

My objective is to take B and append it onto A column-wise, so that A will then look like:

 headerNameA1,headerNameA2,headerNameB1,headerNameB2
 1.12412424,1,1,1
 1,1,1,1
 1,1,1,1
 1,1,1,1

From another question I asked, here is code that takes A and B and combines them into a third file, C:

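 import csv

 # Python 3: open CSV files in text mode with newline='' (Python 2 used 'rb'/'wb').
 with open('A', newline='') as f1, open('B', newline='') as f2, \
         open('out.csv', 'w', newline='') as w:
     writer = csv.writer(w)
     r1, r2 = csv.reader(f1), csv.reader(f2)
     while True:
         try:
             # Concatenate the matching rows of A and B into one output row.
             writer.writerow(next(r1) + next(r2))
         except StopIteration:
             # Stop as soon as either file runs out of rows.
             break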
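
To make the row-by-row pairing concrete, here is a minimal sketch that applies the same idea to the sample data in memory. The function name `join_columns` is made up for illustration, and `zip` is used in place of the `while`/`next` loop; it likewise stops at the shorter input.

```python
import csv
import io


def join_columns(text_a, text_b):
    """Return the column-wise join of two CSV texts as one CSV string.

    Hypothetical helper for illustration; it is not part of the question.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    reader_a = csv.reader(io.StringIO(text_a))
    reader_b = csv.reader(io.StringIO(text_b))
    # zip pairs row i of A with row i of B and stops at the shorter input.
    for row_a, row_b in zip(reader_a, reader_b):
        writer.writerow(row_a + row_b)
    return out.getvalue()


# Sample data copied from the heads of A and B shown above.
A = "headerNameA1,headerNameA2\n1.12412424,1\n1,1\n1,1\n1,1\n"
B = "headerNameB1,headerNameB2\n1,1\n1,1\n1,1\n1,1\n"

joined = join_columns(A, B)
print(joined)
```

Each output row is simply A's fields followed by B's fields, which is why the result still needs somewhere to be written other than A itself.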

However, the objective of this question is just to add B onto the back of A, without producing a separate file C.

This is necessary when A is so large that writing a combined copy as file C, and only deleting A afterwards, would cost too much disk space.

A bash solution called through os.system is acceptable.

Recommended answer

You might be able to get away with a named pipe. A Python process creates the pipe and opens it in write mode, then writes the column-wise concatenation of the CSV files to it (similar to what you already have). When another process starts reading that "file", it can consume the data, but no data file is actually stored on the server; the rows are produced on demand. Once the "file" has been consumed there is nothing left in it, and any attempt to read it again will block until another process writes to the other end.

Some dummy code (it will need more carefully thought-out exception handling, etc.):

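import os

a = 'abcdef' # File A's rows
b = 'ghijkl' # File B's rows

outname = 'joined'

# Remove any stale file or pipe, then create the named pipe (POSIX only).
try:
    os.unlink(outname)
except OSError:
    pass  # Nothing to remove.
os.mkfifo(outname)

try:
    # Opening a FIFO for writing blocks until a reader opens the other end.
    with open(outname, 'w') as fout:
        for items in zip(a, b):  # Python 3; use itertools.izip on Python 2
            fout.write(''.join(items) + '\n') # Do "real" write here instead...
finally:
    os.unlink(outname)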

Something else opens that "file" in read mode and consumes it to retrieve the data. This should work unless that process has to have "physical files"...
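
As a rough sketch of the full round trip, the example below writes the joined rows of two small CSV files into a named pipe and reads them back as if the pipe were an ordinary CSV file. The function names (`write_joined`, `read_joined`, `demo`), the temporary directory, and the use of a thread in place of a second process are all assumptions made to keep the example self-contained; `os.mkfifo` is only available on POSIX systems.

```python
import csv
import os
import tempfile
import threading


def write_joined(path_a, path_b, fifo_path):
    """Write the column-wise join of two CSV files into a named pipe."""
    with open(path_a, newline='') as fa, open(path_b, newline='') as fb:
        # Opening the pipe for writing blocks until a reader opens it.
        with open(fifo_path, 'w', newline='') as fout:
            writer = csv.writer(fout)
            for row_a, row_b in zip(csv.reader(fa), csv.reader(fb)):
                writer.writerow(row_a + row_b)


def read_joined(fifo_path):
    """Consume the pipe like a regular CSV file and return its rows."""
    with open(fifo_path, newline='') as fin:
        return list(csv.reader(fin))


def demo():
    """Join two tiny sample files through a pipe; hypothetical driver."""
    with tempfile.TemporaryDirectory() as workdir:
        path_a = os.path.join(workdir, 'A')
        path_b = os.path.join(workdir, 'B')
        fifo_path = os.path.join(workdir, 'joined')
        with open(path_a, 'w', newline='') as f:
            f.write('headerNameA1,headerNameA2\n1.12412424,1\n1,1\n')
        with open(path_b, 'w', newline='') as f:
            f.write('headerNameB1,headerNameB2\n1,1\n1,1\n')
        os.mkfifo(fifo_path)  # POSIX only
        # A thread stands in for the separate writer process here.
        producer = threading.Thread(
            target=write_joined, args=(path_a, path_b, fifo_path))
        producer.start()
        rows = read_joined(fifo_path)
        producer.join()
        return rows


if __name__ == '__main__':
    for row in demo():
        print(row)
```

The joined data never lands in a regular file: the pipe hands each row from the writer to the reader, so the extra disk space for a file C is not needed. Whatever consumes the pipe still has to put the result somewhere if it must be kept.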
