Using Python csv module on updating file

Problem description

I am using Python's csv module to extract data from a CSV file that is constantly being updated by an external tool. When I reach the end of the file I get a StopIteration error, but I would like the script to keep looping, waiting for the external tool to add more lines.

What I came up with so far to do this is:

import csv

f = open('file.csv')
csvReader = csv.reader(f, delimiter=',')
while 1:
    try:
        doStuff(next(csvReader))
    except StopIteration:
        # End of file reached: remember the offset, then reopen and seek back
        depth = f.tell()
        f.close()
        f = open('file.csv')
        f.seek(depth)
        csvReader = csv.reader(f, delimiter=',')

This has the intended functionality, but it also seems terrible. Simply looping after catching the StopIteration is not possible, since once StopIteration has been thrown, every subsequent call to next() throws StopIteration again. Does anyone have suggestions on how to implement this so that I don't have to do this silly tell and seek? Or is there a different Python module that easily supports this functionality?

Recommended answer

Your problem is not with the CSV reader, but with the file object itself. You may still have to do the crazy gyrations you're doing in your snippet above, but it would be better to create a file object wrapper or subclass that does it for you, and use that with your CSV reader. That keeps the complexity isolated from your csv processing code.
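
A wrapper is enough here because csv.reader does not need a real file object: it accepts any iterator that yields one line of text per step. A minimal illustration, where the generator name fake_lines is made up for this sketch and simply stands in for the file:

```python
import csv


def fake_lines():
    # Hypothetical stand-in for a file: yields one line of text at a time
    yield "a,b,c\n"
    yield "1,2,3\n"


# csv.reader pulls lines from the iterator on demand, one row per line
rows = list(csv.reader(fake_lines()))
```

So whatever waiting or reopening logic is needed can live entirely in the object that produces the lines.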

For instance (warning: untested code):

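import time

class ReopeningFile(object):
    def __init__(self, filename):
        self.filename = filename
        self.f = open(self.filename)

    def next(self):
        # Loop rather than recurse so that a long wait cannot hit the recursion limit
        while True:
            try:
                return next(self.f)
            except StopIteration:
                # Remember the current offset, then reopen and seek back to it
                depth = self.f.tell()
                self.f.close()
                self.f = open(self.filename)
                self.f.seek(depth)
                # Sleep to allow more data to come in (the interval is an arbitrary choice)
                # Also may need a way to signal a real StopIteration
                time.sleep(0.1)

    __next__ = next  # Python 3 looks up __next__ rather than next

    def __iter__(self):
        return self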

Then your main code becomes simpler, as it is freed from having to manage the file reopening (note that you also don't have to restart your csv_reader whenever the file restarts):

import csv
csv_reader = csv.reader(ReopeningFile('data.csv'))
for each in csv_reader:
    process_csv_line(each)
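
The two open points in the wrapper's comments (sleeping while waiting, and signalling a real StopIteration) could be handled along the lines below. This is only a sketch under stated assumptions, not part of the original answer: the function name follow and the parameters poll_interval and idle_timeout are hypothetical, and it assumes that the external tool only appends to the file and that "no new line for idle_timeout seconds" is an acceptable stop condition. It keeps the file open and polls with readline() instead of reopening, and it holds back a partial line until its newline arrives.

```python
import csv
import time


def follow(path, poll_interval=0.1, idle_timeout=1.0):
    """Yield complete lines from path, waiting for more to be appended.

    The generator ends (a "real" StopIteration) once no complete line has
    arrived for roughly idle_timeout seconds. All names are hypothetical.
    """
    with open(path, newline="") as f:
        pending = ""  # partial line whose newline has not been written yet
        idle = 0.0    # approximate time spent waiting since the last line
        while idle < idle_timeout:
            chunk = f.readline()
            if chunk:
                pending += chunk
                if pending.endswith("\n"):
                    yield pending
                    pending = ""
                    idle = 0.0
                continue
            # Nothing new at the end of the file: wait and try again
            time.sleep(poll_interval)
            idle += poll_interval


def read_rows(path):
    # csv.reader consumes the generator just like it would a file object
    for row in csv.reader(follow(path)):
        yield row
```

With this, the main loop would look like "for row in read_rows('data.csv'): process_csv_line(row)", and it ends on its own once the file has been quiet for the chosen timeout.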
