How to read a CSV file in reverse order in Python?

Question

I know how to do this for a TXT file, but I am having some trouble doing it for a CSV file.

How can I read a CSV file from the bottom in Python?

Recommended answer

Pretty much the same way as for a text file: read the whole thing into a list and then go backwards:

import csv
with open('test.csv', 'r', newline='') as textfile:  # newline='' as the csv module recommends (Python 3)
    for row in reversed(list(csv.reader(textfile))):
        print(', '.join(row))

If you want to get fancy, you could write a lot of code that reads blocks starting at the end of the file and works backwards, emitting a line at a time, and then feed that to csv.reader. That only works with a file that supports seeking, i.e. disk files but not standard input.

Some of us have files that do not fit into memory. Could anyone come up with a solution that does not require storing the entire file in memory?

That's a bit trickier. Luckily, all csv.reader expects is an iterator-like object that returns a string (line) per call to next(). So we grab the technique Darius Bacon presented in "Most efficient way to search the last x lines of a file in python" to read the lines of a file backwards, without having to pull in the whole file:

import os

def reversed_lines(file):
    "Generate the lines of file in reverse order."
    part = ''
    for block in reversed_blocks(file):
        for c in reversed(block):
            if c == '\n' and part:
                yield part[::-1]
                part = ''
            part += c
    if part: yield part[::-1]

def reversed_blocks(file, blocksize=4096):
    "Generate blocks of file's contents in reverse order."
    file.seek(0, os.SEEK_END)
    here = file.tell()
    while 0 < here:
        delta = min(blocksize, here)
        here -= delta
        file.seek(here, os.SEEK_SET)
        yield file.read(delta)

Then feed reversed_lines into the code so the lines are reversed before they reach csv.reader, which removes the need for reversed and list:

import csv
with open('test.csv', 'r', newline='') as textfile:  # newline='' keeps line endings untranslated (Python 3)
    for row in csv.reader(reversed_lines(textfile)):
        print(', '.join(row))

There is a more Pythonic solution possible, which doesn't require a character-by-character reversal of the block in memory (hint: get a list of the indices where line ends occur in the block, reverse it, and use it to slice the block), and which uses chain from itertools to glue the line clusters from successive blocks together, but that's left as an exercise for the reader.
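
One way to approach that exercise is sketched below for Python 3. It is a variation on the hint, not the answer's own code: it lets bytes.splitlines slice each block at its line ends instead of collecting indices by hand, and it uses a plain loop rather than itertools.chain. The file is assumed to be opened in binary mode so that seek offsets are byte counts, and the blocksize and encoding parameters on reversed_lines are additions made for illustration. Like the character-by-character version, it assumes fields contain no newlines.

```python
import csv
import os

def reversed_blocks(file, blocksize=4096):
    """Yield blocks of a binary file's contents, last block first."""
    file.seek(0, os.SEEK_END)
    here = file.tell()
    while here > 0:
        delta = min(blocksize, here)
        here -= delta
        file.seek(here, os.SEEK_SET)
        yield file.read(delta)

def reversed_lines(file, blocksize=4096, encoding='utf-8'):
    """Yield the lines of a binary file as str, last line first."""
    tail = b''  # first piece of the later block; its start may lie in this block
    for block in reversed_blocks(file, blocksize):
        # bytes.splitlines breaks on \n, \r\n and \r; keepends keeps the terminators
        lines = (block + tail).splitlines(keepends=True)
        # The first piece may be incomplete, so hold it back for the next block
        tail = lines[0]
        for line in reversed(lines[1:]):
            # Only whole lines are decoded, so multi-byte characters stay intact
            yield line.decode(encoding)
    if tail:
        yield tail.decode(encoding)
```

To use it, open the file with open('test.csv', 'rb') and iterate over csv.reader(reversed_lines(f)). A single line longer than blocksize is accumulated in tail until its start is found, so memory use is bounded by the longest line, not by the file size.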

It's worth noting that the reversed_lines() idiom above only works if the columns in the CSV file don't contain newlines.

Aargh! There's always something. Luckily, it's not too bad to fix this:

def reversed_lines(file):
    "Generate the lines of file in reverse order."
    part = ''
    quoting = False
    for block in reversed_blocks(file):
        for c in reversed(block):
            if c == '"':
                quoting = not quoting
            elif c == '\n' and part and not quoting:
                yield part[::-1]
                part = ''
            part += c
    if part: yield part[::-1]

Of course, you'll need to change the quote character if your CSV dialect doesn't use ".
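
As a rough sketch of that change, the quote character can be passed in as a parameter instead of being hard-coded. The quotechar parameter on reversed_lines is an assumption added here for illustration (it mirrors the csv.reader argument of the same name), and the rest follows the quote-aware version above. The code is written for Python 3 and works on text blocks.

```python
import csv
import os

def reversed_blocks(file, blocksize=4096):
    """Yield blocks of the file's contents, last block first."""
    file.seek(0, os.SEEK_END)
    here = file.tell()
    while here > 0:
        delta = min(blocksize, here)
        here -= delta
        file.seek(here, os.SEEK_SET)
        yield file.read(delta)

def reversed_lines(file, quotechar='"'):
    """Yield the lines of file in reverse order, keeping quoted newlines."""
    part = ''          # current line, built up back to front
    quoting = False    # True while scanning inside a quoted field
    for block in reversed_blocks(file):
        for c in reversed(block):
            if c == quotechar:
                # A doubled quote toggles twice, so the state is unchanged
                quoting = not quoting
            elif c == '\n' and part and not quoting:
                yield part[::-1]
                part = ''
            part += c
    if part:
        yield part[::-1]
```

For a dialect that quotes with single quotes, the same character goes to both calls: csv.reader(reversed_lines(f, quotechar="'"), quotechar="'").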
