CSV content filtering by list elements in Python


Problem description

I am stuck trying to get the right result from a simple piece of Python code (I am a Python beginner). Given a csv input file (ListInput.csv): pKT, pET, pUT,

and another csv file which contains features of many of these elements (Table.csv):

pBR,156,AATGGT,673,HHHTTTT,
pUT,54,CCATGTACCTAT,187,PRPTP,
pHTM,164,GGTATAG,971,WYT,
pKT,12,GCATACAGGAC,349,,
pET,87,GTGACGGTA,506,PPMK,

... and so on

I want to select the rows of the table whose first field appears in the first csv file and write them to an output csv file (WorkingList.txt). In this case the expected result would be:

pKT,12,GCATACAGGAC,349,,
pET,87,GTGACGGTA,506,PPMK,
pUT,54,CCATGTACCTAT,187,PRPTP,

I wrote the following script, which does not give any errors but ends up with an empty file as output. I have been trying to understand why for a couple of days with no success. Any help is greatly appreciated.

#!/usr/bin/python
import csv

v = open('ListInput.csv', 'rt')
csv_v = csv.reader(v)

vt = open('Table.csv', 'rt')
csv_vt = csv.reader(vt)

with open("WorkingList.txt", "a+t") as myfile:
    pass


for el in csv_v:
    for var in csv_vt:
        if el == var[0]:
            myfile.write(var)

myfile.close()

Recommended answer

First problem:

You consume your input csv iterator csv_vt during the first iteration of the outer loop, so every later pass of the inner loop finds nothing left to read. You need to do:

vt.seek(0)

to rewind the file before each run of the inner loop. This leaves an O(n^2) search algorithm, but at least it works.
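As a minimal sketch of the exhaustion issue, the following uses io.StringIO as an in-memory stand-in for Table.csv (an assumption made only so the example is self-contained). It shows that a second pass over the reader yields nothing until the underlying file is rewound:

```python
import csv
import io

# In-memory stand-in for Table.csv, using two rows from the question.
table = io.StringIO("pUT,54,CCATGTACCTAT,187,PRPTP,\npKT,12,GCATACAGGAC,349,,\n")
reader = csv.reader(table)

first_pass = [row[0] for row in reader]   # consumes the reader
second_pass = [row[0] for row in reader]  # nothing left to read

table.seek(0)                             # rewind the underlying file
third_pass = [row[0] for row in reader]   # rows are available again

print(first_pass)   # ['pUT', 'pKT']
print(second_pass)  # []
print(third_pass)   # ['pUT', 'pKT']
```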

Second problem:

You're opening and closing myfile in the with block. When you reach your for loop, myfile is already closed because you left the with block (that's the guarantee of the with block).

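If you hadn't had the first problem, you would have run into "operation on closed file" when trying to write the output. (In the script as posted, the write is also never reached because el is a whole row, i.e. a list, so el == var[0] is always false; the comparison should use el[0].)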

I'd move the last part inside the with block and remove the close().
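To illustrate the point about the with block, here is a small sketch that writes to a scratch file in a temporary directory (the path is hypothetical and chosen only to avoid touching real data). The first write happens after the block has ended and fails; the second happens inside the block and succeeds:

```python
import os
import tempfile

# Hypothetical scratch file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "WorkingList.txt")

with open(path, "w") as myfile:
    pass                      # the file is open only inside this block

closed_after_block = myfile.closed   # True: leaving the block closed it

try:
    myfile.write("pKT\n")     # writing after the block fails
    error = None
except ValueError as exc:
    error = str(exc)          # message mentions a closed file

# Doing the work inside the block keeps the file open while writing.
with open(path, "w") as myfile:
    myfile.write("pKT\n")

print(closed_after_block, error)
```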

Third problem:

You cannot write a list to a file with write(); you have to create a csv.writer object first and write the row through it.
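A short sketch of the difference, again using io.StringIO as an assumed stand-in for the output file: write() rejects a list, while csv.writer joins the fields and adds the line terminator.

```python
import csv
import io

# One row from the question, as csv.reader would return it.
row = ["pKT", "12", "GCATACAGGAC", "349", "", ""]

out = io.StringIO(newline="")   # in-memory stand-in for the output file

try:
    out.write(row)              # write() accepts only strings
    write_failed = False
except TypeError:
    write_failed = True

csv.writer(out).writerow(row)   # joins the fields and appends '\r\n'

print(write_failed)             # True
print(repr(out.getvalue()))     # 'pKT,12,GCATACAGGAC,349,,\r\n'
```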

So to sum up, you could solve all of these problems, plus the performance problem, with the following code:

#!/usr/bin/python
import csv

v = open('ListInput.csv', 'rt')
csv_v = csv.reader(v)

with open('Table.csv', 'rt') as vt:
    csv_vt = csv.reader(vt)
    # create a dictionary to speed up lookup
    # read the table only once
    vdict = {var[0]:var for var in csv_vt}

with open("WorkingList.txt", "w", newline="") as myfile:  # for Python 3.x
## with open("WorkingList.txt", "wb") as myfile:  # for Python 2
    cw = csv.writer(myfile)
    for el in csv_v:
        # el is a row (a list); its first field is the lookup key
        if el and el[0] in vdict:
            cw.writerow(vdict[el[0]])

v.close()

vdict is the lookup table which replaces your inner loop (this only works if the "keys" are unique, which seems to be the case given your input samples).
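If the keys were not unique, one possible variation (not part of the original answer) would be to map each key to a list of rows instead of a single row. The sketch below assumes in-memory data; the second pKT row and the wanted list are made up purely for illustration:

```python
import csv
import io
from collections import defaultdict

# Illustrative table; the second pKT row is invented to show a repeated key.
table = io.StringIO(
    "pKT,12,GCATACAGGAC,349,,\n"
    "pET,87,GTGACGGTA,506,PPMK,\n"
    "pKT,99,TTTT,1,,\n"
)

# Map each key to every row that carries it.
rows_by_key = defaultdict(list)
for row in csv.reader(table):
    if row:                       # skip blank lines
        rows_by_key[row[0]].append(row)

wanted = ["pKT", "pUT"]           # hypothetical selection list
selected = [row for key in wanted for row in rows_by_key.get(key, [])]

print(len(selected))              # 2 (both pKT rows; pUT is absent here)
```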
