csv content filtering by list elements in Python

Question

I am stuck trying to get the right result from a simple piece of Python code (I am a Python beginner). Given a csv input file (ListInput.csv): pKT, pET, pUT,

and another csv file that contains features of many of these elements (Table.csv):

pBR,156,AATGGT,673,HHHTTTT,
pUT,54,CCATGTACCTAT,187,PRPTP,
pHTM,164,GGTATAG,971,WYT,
pKT,12,GCATACAGGAC,349,,
pET,87,GTGACGGTA,506,PPMK,

... and so on

I want to select rows based on the elements of the first csv file and write them to a csv file (WorkingList.txt). In this case the expected result would be:

pKT,12,GCATACAGGAC,349,,
pET,87,GTGACGGTA,506,PPMK,
pUT,54,CCATGTACCTAT,187,PRPTP,

I wrote the following script, which gives no errors but ends up with an empty file as output. I have been trying to understand why for a couple of days with no success. Any help is greatly appreciated.

#!/usr/bin/python
import csv

v = open('ListInput.csv', 'rt')
csv_v = csv.reader(v)

vt = open('Table.csv', 'rt')
csv_vt = csv.reader(vt)

with open("WorkingList.txt", "a+t") as myfile:
    pass


for el in csv_v:
    for var in csv_vt:
        if el == var[0]:
            myfile.write(var)

myfile.close()

Recommended answer

First problem:

You consume your input csv iterator csv_vt during the first iteration of the outer loop. You need to call:

vt.seek(0)

to rewind the file before each run of the inner loop. This leaves an O(n^2) search algorithm, but at least it works.
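The exhaustion described above can be reproduced in isolation. This is a minimal sketch that uses an in-memory io.StringIO as a stand-in for Table.csv (an assumption made to keep it self-contained; the variable names are illustrative only):

```python
import csv
import io

# In-memory stand-in for Table.csv, using two rows from the question.
table = io.StringIO(
    "pUT,54,CCATGTACCTAT,187,PRPTP,\n"
    "pKT,12,GCATACAGGAC,349,,\n"
)
reader = csv.reader(table)

first_pass = [row[0] for row in reader]   # consumes every row
second_pass = [row[0] for row in reader]  # nothing left to read

table.seek(0)                             # rewind the underlying file object
third_pass = [row[0] for row in reader]   # rows are available again

print(first_pass, second_pass, third_pass)
```

The second pass comes back empty, which is why the inner loop in the question only ever does work for the first element.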

Second problem:

You are opening and closing myfile in the with block. By the time you reach the for loop, myfile is already closed because you have left the with block (that is the guarantee of the with block).

If you had not had the first problem, you would have run into "I/O operation on closed file" when trying to write the output.

I'd move the last part inside the with block and remove the close().
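A short sketch of that behaviour, assuming a throwaway temporary directory instead of the real WorkingList.txt location:

```python
import os
import tempfile

# Temporary location so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "WorkingList.txt")

with open(path, "w") as myfile:
    pass                              # nothing is written inside the block

closed_after_with = myfile.closed     # the with block has closed the file

try:
    myfile.write("pKT,12\n")          # write attempted after the block ended
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Writing inside the block works, and no explicit close() is needed.
with open(path, "w") as myfile:
    myfile.write("pKT,12\n")

with open(path) as check:
    content = check.read()
```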

Third problem:

You cannot write a list to a file with write(); you have to create a csv.writer object first.
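To illustrate the difference, here is a small sketch that writes one row from the question to in-memory buffers (io.StringIO is used here only as an assumption to avoid touching the file system):

```python
import csv
import io

# One parsed row; the trailing comma in the data yields a final empty field.
row = ["pKT", "12", "GCATACAGGAC", "349", "", ""]

plain = io.StringIO()
try:
    plain.write(row)                  # write() only accepts str
    write_error = None
except TypeError as exc:
    write_error = type(exc).__name__

out = io.StringIO(newline="")
csv.writer(out).writerow(row)         # joins fields, adds the line terminator
line = out.getvalue()
```

By default csv.writer terminates each row with "\r\n", which is why the csv module documentation recommends opening real output files with newline="".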

So to sum it up, you could solve all of these problems, plus the performance problem, with the following code:

#!/usr/bin/python
import csv

v = open('ListInput.csv', 'rt')
csv_v = csv.reader(v)

with open('Table.csv', 'rt') as vt:
    csv_vt = csv.reader(vt)
    # create a dictionary to speed up lookup
    # read the table only once
    vdict = {var[0]:var for var in csv_vt}

with open("WorkingList.txt", "w", newline="") as myfile:  # for Python 3.x
## with open("WorkingList.txt", "wb") as myfile:  # for Python 2
    cw = csv.writer(myfile)
    for el in csv_v:
        # el is a list (one csv row); its first field is the lookup key
        if el[0] in vdict:
            cw.writerow(vdict[el[0]])

v.close()

vdict is the lookup table that replaces your inner loop (this only works if the "keys" are unique, which seems to be the case given your input samples).
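As a way to check the dictionary approach against the sample data from the question, here is a self-contained sketch. The function name filter_table and the temporary directory are hypothetical, and because the exact layout of ListInput.csv is not clear from the question, the sketch flattens the list file so that names may sit on one line or one per line:

```python
import csv
import os
import tempfile

def filter_table(list_path, table_path, out_path):
    """Copy the table rows whose first field is named in the list file."""
    with open(table_path, newline="") as vt:
        # Lookup table keyed on the first column; blank lines are skipped.
        vdict = {row[0]: row for row in csv.reader(vt) if row}
    with open(list_path, newline="") as v:
        # Flatten all rows, strip spaces, drop fields left by trailing commas.
        names = [f.strip() for row in csv.reader(v) for f in row if f.strip()]
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for name in names:
            if name in vdict:
                writer.writerow(vdict[name])

workdir = tempfile.mkdtemp()
list_path = os.path.join(workdir, "ListInput.csv")
table_path = os.path.join(workdir, "Table.csv")
out_path = os.path.join(workdir, "WorkingList.txt")

with open(list_path, "w") as f:
    f.write("pKT, pET, pUT,\n")
with open(table_path, "w") as f:
    f.write("pBR,156,AATGGT,673,HHHTTTT,\n"
            "pUT,54,CCATGTACCTAT,187,PRPTP,\n"
            "pHTM,164,GGTATAG,971,WYT,\n"
            "pKT,12,GCATACAGGAC,349,,\n"
            "pET,87,GTGACGGTA,506,PPMK,\n")

filter_table(list_path, table_path, out_path)
with open(out_path, newline="") as f:
    result = f.read().splitlines()
```

If the table contained the same key twice, the dictionary would keep only the last row for that key, which is the uniqueness caveat mentioned above.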
