How to append a new list to an existing CSV file?

Question

I already have a CSV file created from a list using the CSV writer. I want to append another list, created in a for loop, to that CSV file column-wise.

The code that first creates the CSV file is as follows:

with open("output.csv", "wb") as f:
    writer = csv.writer(f)
    for row in zip(master_lst):
        writer.writerow(row)

I created the CSV file using the list master_lst, and the output is as follows:

read
ACACCUGGGCUCUCCGGGUACC
ACGGCUACCUUCACUGCCACCC
AGGCAGUGUGGUUAGCUGGUUG

Then I create another list (ind_lst) in a for loop, and the contents of that list have to be appended column-wise to the CSV file created in the previous step. I used the following code:

with open("output.csv", "ab") as f:
    writer = csv.writer(f)
    for row in zip(ind_lst):
        writer.writerow(row)

The output I get is as follows:

read
ACACCUGGGCUCUCCGGGUACC
ACGGCUACCUUCACUGCCACCC
AGGCAGUGUGGUUAGCUGGUUG
sample1
3
3
1
sample2
4
4
1

However, I need the output column-wise, as follows:

read                         sample1     sample2
ACACCUGGGCUCUCCGGGUACC         3            4
ACGGCUACCUUCACUGCCACCC         3            4
AGGCAGUGUGGUUAGCUGGUUG         1            1

I looked for solutions but could only find ones for appending row-wise, whereas I need to append column-wise: append new row to old csv file python

I also tried writer.writerows instead of writer.writerow, but I get this error:

_csv.Error: sequence expected

The output is as follows:

read
ACACCUGGGCUCUCCGGGUACC
ACGGCUACCUUCACUGCCACCC
AGGCAGUGUGGUUAGCUGGUUG
s                        a   m   p  l  e 1

As you can see, it writes each character of the list's first element into its own cell and then terminates with an error. I am a beginner in Python, so if anyone could help solve this issue that would be awesome.
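The behaviour described above can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 and an in-memory buffer in place of output.csv (the buffer and variable names are illustrative, not from the question). It shows that writerows treats each element as a whole row, so a bare string is split into characters and a bare integer is rejected; the exact wording of the error message varies between Python versions.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

# Each element passed to writerows() is treated as one row, i.e. a sequence of cells.
# A plain string is itself a sequence of characters, so every character becomes a cell.
writer.writerows(["sample1"])
split_row = buf.getvalue()

# An element that is not iterable, such as an int, cannot be turned into a row at all.
try:
    writer.writerows([3])
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

# Wrapping each value in a one-element tuple gives one cell per row instead.
buf2 = io.StringIO()
csv.writer(buf2).writerows((value,) for value in ["sample1", 3])
wrapped = buf2.getvalue()
```

Here split_row holds the single line s,a,m,p,l,e,1 and wrapped holds two one-cell rows, which matches the row-wise output shown earlier in the question.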

The master_lst is created using the following code:

 infile= open(sys.argv[1], "r")
 lines = infile.readlines()[1:]
 master_lst = ["read"]
 for line in lines:
  line= line.strip().split(',')
  fourth_field = line [3]
  master_lst.append(fourth_field)

The ind_lst is created using the following code:

for file in files:
 ind_lst = []   
 if file.endswith('.fa'):
  first = file.split(".")
  first_field = first [0]
  ind_lst.append(first_field)
  fasta= open(file)
  individual_dict= {}
  for line in fasta:
   line= line.strip()
   if line == '':
    continue
   if line.startswith('>'):
    header = line.lstrip('>')
    individual_dict[header]= ''
   else:
    individual_dict[header] += line
 for key in master_lst[1:]:
   a = 0
   if key in individual_dict.keys():
     a = individual_dict[key]
   else:
     a = 0
   ind_lst.append(a)

Recommended answer

You're actually trying to append several columns to the existing file, even though the data for these new columns is all stored in a single list. It might be better to arrange the data in ind_lst differently, but since you haven't shown how that's done, the code below works with the format in your question.
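As an illustration of one such arrangement, here is a minimal sketch in Python 3. It assumes ind_lst holds exactly len(master_lst) entries per sample (a header followed by its values), as in the question's sample data; the names columns, new_cols and rows are illustrative only.

```python
master_lst = [
    "read",
    "ACACCUGGGCUCUCCGGGUACC",
    "ACGGCUACCUUCACUGCCACCC",
    "AGGCAGUGUGGUUAGCUGGUUG",
]
ind_lst = ["sample1", "3", "3", "1", "sample2", "4", "4", "1"]

n = len(master_lst)

# Slice the flat list into one sub-list per sample (header followed by its values).
columns = [ind_lst[i:i + n] for i in range(0, len(ind_lst), n)]

# Transpose: the k-th tuple holds the k-th value of every sample column.
new_cols = list(zip(*columns))

# Put the matching master_lst entry in front of each tuple to get complete rows.
rows = [(read,) + extra for read, extra in zip(master_lst, new_cols)]
```

With the data in this shape, each element of rows is one complete line of the desired output, starting with ('read', 'sample1', 'sample2').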

Since modifying CSV files in place is tricky (they're really just text files), it would be much easier to simply create a new file with the merged data, then delete the original and rename the new file to match it (you've now been warned).

import csv
from itertools import izip  # Python 2
import os
import tempfile

master_lst = [
    'read',
    'ACACCUGGGCUCUCCGGGUACC',
    'ACGGCUACCUUCACUGCCACCC',
    'AGGCAGUGUGGUUAGCUGGUUG'
]

ind_lst = [
    'sample1',
    '3',
    '3',
    '1',
    'sample2',
    '4',
    '4',
    '1'
]

csv_filename = 'output.csv'

def grouper(n, iterable):
    's -> (s0,s1,...sn-1), (sn,sn+1,...s2n-1), (s2n,s2n+1,...s3n-1), ...'
    return izip(*[iter(iterable)]*n)

# first create file to update
with open(csv_filename, 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(((row,) for row in master_lst))

# Rearrange ind_lst so it's a list of pairs of values.
# The number of resulting pairs should be equal to length of the master_lst.
# Result for example data:  [('sample1', 'sample2'), ('3', '4'), ('3', '4'), ('1', '1')]
new_cols = (zip(*grouper(len(master_lst), ind_lst)))
assert len(new_cols) == len(master_lst)

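# delete=False: otherwise closing the temp file would try to remove it under its old name after the rename
with open(csv_filename, 'rb') as fin, tempfile.NamedTemporaryFile('r+b', delete=False) as temp_file: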
    reader = csv.reader(fin)
    writer = csv.writer(temp_file)
    nc = iter(new_cols)
    for row in reader:
        row.extend(next(nc))  # add new columns to each row
        writer.writerow(row)
    else:  # for loop completed, replace original file with temp file
        fin.close()
        os.remove(csv_filename)
        temp_file.flush()  # flush the internal file buffer
        os.fsync(temp_file.fileno())  # force writing of all data in temp file to disk
        os.rename(temp_file.name, csv_filename)

print('done')

Contents of the file after creation followed by the update:

read,sample1,sample2
ACACCUGGGCUCUCCGGGUACC,3,4
ACGGCUACCUUCACUGCCACCC,3,4
AGGCAGUGUGGUUAGCUGGUUG,1,1
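
The code above targets Python 2 (izip, binary file modes). A rough Python 3 sketch of the same "write a merged copy, then swap it in" idea might look like the following. The helper name append_columns is hypothetical, and the sketch assumes new_cols contains one tuple of new values per existing row, in the shape shown in the comment in the code above. It reads the existing rows into memory first, which is a simplification rather than part of the original answer.

```python
import csv
import os
import tempfile


def append_columns(csv_path, new_cols):
    """Rewrite csv_path with the tuples in new_cols appended to its rows, in order."""
    # Text mode with newline="" is what the csv module expects in Python 3.
    with open(csv_path, newline="") as fin:
        rows = list(csv.reader(fin))

    if len(rows) != len(new_cols):
        raise ValueError("expected one tuple of new values per existing row")

    # Create the temp file next to the target so the final swap stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(csv_path))
    with tempfile.NamedTemporaryFile(
        "w", newline="", dir=directory, delete=False
    ) as tmp:
        writer = csv.writer(tmp)
        for row, extra in zip(rows, new_cols):
            writer.writerow(row + list(extra))

    # Both files are closed at this point; replace the original with the merged copy.
    os.replace(tmp.name, csv_path)
```

Calling append_columns('output.csv', new_cols) with the new_cols value from the example data should produce the same file contents as shown above. os.replace overwrites the destination in one step, so the original does not have to be deleted separately.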
