CSV content filtering by list elements in Python
Problem description
I am stuck trying to get the right result from a simple piece of Python code (I am a Python beginner). Given a CSV input file (ListInput.csv):
pKT,pET,pUT,
and another CSV file that contains features of many of these elements (Table.csv):
pBR,156,AATGGT,673,HHHTTTT,
pUT,54,CCATGTACCTAT,187,PRPTP,
pHTM,164,GGTATAG,971,WYT,
pKT,12,GCATACAGGAC,349,,
pET,87,GTGACGGTA,506,PPMK,
... and so on
I want to select the rows of Table.csv whose first field appears in the first CSV file and write them to a CSV output file (WorkingList.txt). In this case the expected result would be:
pKT,12,GCATACAGGAC,349,,
pET,87,GTGACGGTA,506,PPMK,
pUT,54,CCATGTACCTAT,187,PRPTP,
I wrote the following script, which does not raise any errors but ends up with an empty output file. I have been trying to understand why for a couple of days with no success. Any help is greatly appreciated.
    #!/usr/bin/python
    import csv

    v = open('ListInput.csv', 'rt')
    csv_v = csv.reader(v)
    vt = open('Table.csv', 'rt')
    csv_vt = csv.reader(vt)
    with open("WorkingList.txt", "a+t") as myfile:
        pass
    for el in csv_v:
        for var in csv_vt:
            if el == var[0]:
                myfile.write(var)
    myfile.close()
Recommended answer
First problem:
You consume your input CSV iterator csv_vt during the first iteration of the outer loop. You need to call:
vt.seek(0)
to rewind the file before each run of the inner loop. This leaves an O(n^2) search algorithm, but at least it works.
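The exhaustion and the rewind can be seen in a small sketch. It uses an in-memory io.StringIO as a stand-in for Table.csv (an assumption made only to keep the example self-contained); the two rows are taken from the question.

```python
import csv
import io

# In-memory stand-in for Table.csv (assumption: a real file behaves the same way).
table = io.StringIO(
    "pBR,156,AATGGT,673,HHHTTTT,\n"
    "pUT,54,CCATGTACCTAT,187,PRPTP,\n"
)
reader = csv.reader(table)

first_pass = [row[0] for row in reader]   # consumes every row
second_pass = [row[0] for row in reader]  # nothing left to read

table.seek(0)                             # rewind the underlying file object
third_pass = [row[0] for row in reader]   # the rows are available again

print(first_pass)
print(second_pass)
print(third_pass)
```

Note that seek() is called on the file object, not on the csv.reader, which simply pulls lines from whatever the file yields next.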
Second problem:
You are opening and closing myfile in the with block. When you reach your for loop, myfile is already closed because you have left the with block (that is the guarantee of the with block).
If you had not hit the first problem, you would have run into "operation on closed file" when trying to write the output.
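A minimal sketch of that behaviour, writing to a temporary directory (the path is a hypothetical stand-in so the example does not touch your real WorkingList.txt):

```python
import os
import tempfile

# Hypothetical output path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "WorkingList.txt")

with open(path, "w") as myfile:
    pass  # the block ends here, so the file is closed right after this line

closed_after_with = myfile.closed

try:
    myfile.write("pKT,12,GCATACAGGAC,349,,\n")
    error = None
except ValueError as exc:
    # Writing to a closed file raises ValueError.
    error = str(exc)

print(closed_after_with)
print(error)
```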
I would move the last part inside the with block and remove the close() call.
Third problem:
You cannot write a list to a file with write(); you have to create a csv.writer object first.
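As a rough illustration, the sketch below first tries to write() a list and then writes the same row through csv.writer. It uses io.StringIO in place of the output file (an assumption for self-containment); the row is the pKT line from the question as csv.reader would parse it.

```python
import csv
import io

# The pKT row from Table.csv, split into fields.
row = ["pKT", "12", "GCATACAGGAC", "349", "", ""]
out = io.StringIO()  # stand-in for the opened output file

try:
    out.write(row)  # write() expects a string, not a list
    write_error = None
except TypeError as exc:
    write_error = type(exc).__name__

# csv.writer joins the fields with commas and adds the line terminator.
csv.writer(out).writerow(row)
result = out.getvalue()

print(write_error)
print(repr(result))
```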
To sum up, you could solve all three problems, plus the performance problem, with the following code:
    #!/usr/bin/python
    import csv

    v = open('ListInput.csv', 'rt')
    csv_v = csv.reader(v)

    with open('Table.csv', 'rt') as vt:
        csv_vt = csv.reader(vt)
        # create a dictionary to speed up lookup
        # read the table only once
        vdict = {var[0]: var for var in csv_vt}

    with open("WorkingList.txt", "w", newline="") as myfile:  # for Python 3.x
    ## with open("WorkingList.txt", "wb") as myfile:  # for Python 2
        cw = csv.writer(myfile)
        # assumes ListInput.csv holds one key per row, in its first field
        for el in csv_v:
            if el[0] in vdict:
                cw.writerow(vdict[el[0]])
    v.close()
vdict is the lookup table that replaces your inner loop (this only works if the "keys" are unique, which seems to be the case given your input samples).
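If the keys were not unique, a plain dictionary would keep only the last row for each key. One possible variant, not part of the original answer, is to group rows per key with collections.defaultdict. The table below is hypothetical: the second pKT row is invented purely to show the duplicate case, and the names rows_by_key and wanted are illustrative.

```python
import csv
import io
from collections import defaultdict

# Hypothetical table in which pKT appears twice (the last row is invented).
table = io.StringIO(
    "pKT,12,GCATACAGGAC,349,,\n"
    "pET,87,GTGACGGTA,506,PPMK,\n"
    "pKT,99,TTTT,1,,\n"
)
wanted = ["pKT", "pUT"]  # keys to select; pUT is absent from this table

# Map each key to the list of all rows that start with it.
rows_by_key = defaultdict(list)
for row in csv.reader(table):
    if row:  # skip blank lines
        rows_by_key[row[0]].append(row)

# Collect every matching row, in the order of the wanted keys.
selected = [row for key in wanted for row in rows_by_key.get(key, [])]

for row in selected:
    print(row)
```

Using get() with a default avoids inserting empty entries into the defaultdict for keys that are missing from the table.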