How to add a new column to the beginning of a CSV file?

Problem description

I have a CSV file with 6 to 8 columns.
For example:

ID Test Description file-name module view path1 path2 

I want to add a new column (Node) at the beginning.
For example:

Node ID Test Description file-name module view path1 path2 

Solution

This is fairly easy to do with the csv module's DictReader and DictWriter classes. Here's an example that reads the old file and writes the new one in a single pass.

A DictReader instance returns each logical line or row of the file as a dictionary whose keys are the field names. You can explicitly specify the field names or they can be read from the first line of the file, as shown in the example.
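As a minimal sketch of that behavior, the snippet below reads from an in-memory string instead of a real file. The sample data and the variable names are made up for illustration; only csv.DictReader and its fieldnames attribute come from the answer.

```python
import csv
import io

# In-memory stand-in for a CSV file; the sample data is made up for illustration.
sample = io.StringIO("ID,module,view\nid 1,test1module,N\nid 2,test2module,Y\n")

reader = csv.DictReader(sample)    # no fieldnames given, so the first line is used
print(reader.fieldnames)           # column names taken from the header line
rows = list(reader)
for row in rows:
    print(row["ID"], row["view"])  # each row is a dict keyed by field name

# Passing fieldnames explicitly treats every line, including the first, as data.
explicit = csv.DictReader(io.StringIO("id 3,test3module,Y\n"),
                          fieldnames=["ID", "module", "view"])
explicit_rows = list(explicit)
print(explicit_rows[0]["module"])
```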

You must specify the desired field names when creating a DictWriter instance, and the order of the field names defines the order in which they appear on each line of the output file. In this case the new field name is simply added to the beginning of the list of names read from the input file, whatever they may be.
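To illustrate the ordering rule in isolation, here is a small sketch that writes to an in-memory buffer. The field names and values are invented for the example, and lineterminator="\n" is set only to keep the printed output simple.

```python
import csv
import io

out = io.StringIO()
# The order of this list, not the order of keys in each dict, fixes the column order.
writer = csv.DictWriter(out, fieldnames=["Node", "ID", "module"], lineterminator="\n")
writer.writeheader()
# Keys are deliberately given in a different order than the fieldnames list.
writer.writerow({"module": "test1module", "ID": "id 1", "Node": "node 1"})
result = out.getvalue()
print(result)
```

Even though the dictionary lists module first, the row is written with Node first because that is the order given in fieldnames.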

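import csv

# Python 3: open both files in text mode with newline='' as the csv module expects
# (the original Python 2 version used the 'rb' and 'wb' modes instead).
with open('testdata.txt', newline='') as inf, open('testdata2.txt', 'w', newline='') as outf:
    csvreader = csv.DictReader(inf)
    fieldnames = ['Node'] + csvreader.fieldnames  # add column name to beginning
    csvwriter = csv.DictWriter(outf, fieldnames)
    csvwriter.writeheader()
    # number the rows from 1 and copy each row with the extra Node value added
    for node, row in enumerate(csvreader, 1):
        csvwriter.writerow(dict(row, Node='node %s' % node))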

If these were the contents of the input file:

ID,Test Description,file-name,module,view,path1,path2
id 1,test 1 desc,test1file.txt,test1module,N,test1path1,test1path2
id 2,test 2 desc,test2file.txt,test2module,Y,test2path1,test2path2
id 3,test 3 desc,test3file.txt,test3module,Y,test3path1,test3path2
id 4,test 4 desc,test4file.txt,test4module,N,test4path1,test4path2
id 5,test 5 desc,test5file.txt,test5module,Y,test5path1,test5path2

This would be the contents of the resulting output file after running the script:

Node,ID,Test Description,file-name,module,view,path1,path2
node 1,id 1,test 1 desc,test1file.txt,test1module,N,test1path1,test1path2
node 2,id 2,test 2 desc,test2file.txt,test2module,Y,test2path1,test2path2
node 3,id 3,test 3 desc,test3file.txt,test3module,Y,test3path1,test3path2
node 4,id 4,test 4 desc,test4file.txt,test4module,N,test4path1,test4path2
node 5,id 5,test 5 desc,test5file.txt,test5module,Y,test5path1,test5path2

Note that adding the data for a field to each row with dict(row, Node='node %s' % node), as shown, only works when the field name can be written as a keyword argument (i.e. it is a valid Python identifier), like Node.

Valid identifiers consist only of letters, digits, and underscores, cannot start with a digit, and cannot be a language keyword such as class, for, return, global, pass, print (in Python 2), or raise.
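One way to check a column name before relying on the keyword-argument form is sketched below for Python 3. The helper name usable_as_keyword_argument is hypothetical; str.isidentifier and keyword.iskeyword are standard library calls.

```python
import keyword

def usable_as_keyword_argument(name):
    """Return True if name can be written as a keyword argument, e.g. dict(row, name=...)."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Column names from the example file plus a few problem cases.
for candidate in ["Node", "path1", "file-name", "Test Description", "2nd", "class"]:
    print(candidate, usable_as_keyword_argument(candidate))
```

Names such as file-name and Test Description from the sample header fail this check, so a new column with a name like those needs a different way of being added to each row.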

To get around this limitation, assign the new field in a separate statement (the same name must also be the one placed at the front of fieldnames):

    for node, row in enumerate(csvreader, 1):
        row['Invalid Keyword'] = 'node %s' % node  # add new field and value
        csvwriter.writerow(row)
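For completeness, here is a self-contained sketch of that variant using in-memory buffers instead of files. The column name "Node Name" and the sample rows are assumptions chosen for illustration; the structure mirrors the loop above.

```python
import csv
import io

NEW_FIELD = "Node Name"  # hypothetical column name; the space makes it an invalid identifier

# In-memory stand-ins for the input and output files (sample data is made up).
source = io.StringIO("ID,module\nid 1,test1module\nid 2,test2module\n")
target = io.StringIO()

reader = csv.DictReader(source)
# Put the new name first so it becomes the first column of the output.
writer = csv.DictWriter(target, [NEW_FIELD] + reader.fieldnames, lineterminator="\n")
writer.writeheader()
for node, row in enumerate(reader, 1):
    row[NEW_FIELD] = "node %s" % node  # plain item assignment accepts any string key
    writer.writerow(row)

output = target.getvalue()
print(output)
```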
