How to read a csv file in tail -f manner using python?


Question

I want to read a csv file in a manner similar to tail -f, i.e. the way one follows an error log file.

I can perform this operation on a text file with this code:

    # Fragment of a method; assumes `import re` and `import time` at module level.
    while True:
        where = self.file.tell()
        line = self.file.readline()
        if not line:
            print("No line waiting, waiting for one second")
            time.sleep(1)
            self.file.seek(where)
            continue  # nothing new yet; retry instead of processing an empty line
        if re.search('[a-zA-Z]', line) is None:
            continue  # re.search returns None (never False) when nothing matches
        response = self.naturalLanguageProcessing(line)
        if response is not None:
            response["id"] = self.id
            self.id += 1
            response["tweet"] = line
            self.saveResults(response)

How do I perform the same task for a csv file? I have gone through a link that shows how to get the last 8 rows, but that is not what I need. The csv file will be updated while I am reading it, and I need to get the newly appended rows.

Recommended answer

Connecting A File Tailer To A csv.reader

To plug your code that looks for content newly appended to a file into a csv.reader, you need to put it into the form of an iterator.
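As a minimal illustration of why an iterator is enough: csv.reader accepts any iterable that yields one line of text per step, not only file objects. The generator name `lines` below is hypothetical and simply stands in for a tailer.

```python
import csv

def lines():
    # Hypothetical stand-in for a tailer: yields one CSV line per step.
    yield 'first,line\n'
    yield 'second,"quoted, field"\n'

# csv.reader pulls lines from the iterator lazily and parses each one.
rows = list(csv.reader(lines()))
print(rows)
```

Because csv.reader only asks for the next line when it needs one, an iterator that blocks until a new line arrives makes the reader block in the same way.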

I'm not intending to showcase correct code, but specifically to show how to adapt your existing code into this form, without making assertions about its correctness. In particular, the sleep() would be better replaced with a mechanism such as inotify, which lets the operating system notify you when the file has changed; and the seek() and tell() would be better replaced with storing partial lines in memory rather than backing up and rereading them over and over.

import csv
import time

class FileTailer(object):
    def __init__(self, file, delay=0.1):
        self.file = file
        self.delay = delay
    def __iter__(self):
        while True:
            where = self.file.tell()
            line = self.file.readline()
            if line and line.endswith('\n'): # only emit full lines
                yield line
            else:                            # for a partial line, pause and back up
                time.sleep(self.delay)       # ...not actually a recommended approach.
                self.file.seek(where)

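# newline='' (Python 3) leaves line endings untouched, as the csv module docs recommend
csv_reader = csv.reader(FileTailer(open('myfile.csv', newline='')))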
for row in csv_reader:
    print("Read row: %r" % (row,))

If you create an empty myfile.csv, start python csvtailer.py, and then run echo "first,line" >>myfile.csv from a different window, you'll see the output Read row: ['first', 'line'] appear immediately.
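The alternative mentioned earlier, keeping a partial line in memory instead of calling seek() and tell(), could look roughly like the sketch below. This is an illustration under stated assumptions, not a vetted implementation: the class name `BufferedTailer` and the `max_idle_polls` parameter are hypothetical, and it still polls with sleep() rather than using inotify.

```python
import time

class BufferedTailer:
    """Yield complete lines from a growing text file.

    A partial line (no trailing newline yet) is kept in memory and
    completed on a later read, so no seek()/tell() is needed.
    """

    def __init__(self, file, delay=0.1, max_idle_polls=None):
        self.file = file
        self.delay = delay
        # Hypothetical knob: stop after this many consecutive empty reads.
        # None means wait forever, like tail -f.
        self.max_idle_polls = max_idle_polls

    def __iter__(self):
        buffer = ''
        idle = 0
        while True:
            piece = self.file.readline()
            if piece:
                idle = 0
                buffer += piece
                if buffer.endswith('\n'):   # only emit full lines
                    yield buffer
                    buffer = ''
            else:
                idle += 1
                if self.max_idle_polls is not None and idle >= self.max_idle_polls:
                    return                  # an unfinished partial line is dropped
                time.sleep(self.delay)
```

It can be passed to csv.reader in the same way as FileTailer, for example csv.reader(BufferedTailer(open('myfile.csv', newline=''))).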

For a correctly implemented iterator that waits for new lines to become available, consider referring to one of the existing StackOverflow questions on the topic:

  • How to implement a pythonic equivalent of tail -F?
  • Reading infinite stream - tail
  • Reading updated files on the fly in Python
