Reading a CSV file in Python with a different line terminator


Question

I have a file in CSV format where the delimiter is the ASCII unit separator ^_ and the line terminator is the ASCII record separator ^^ (obviously, since these are nonprinting characters, I've just used one of the standard ways of writing them here). I've written plenty of code that reads and writes CSV files, so my issue isn't with Python's csv module per se. The problem is that the csv module doesn't support reading (but it does support writing) line terminators other than a carriage return or line feed, at least as of Python 2.6 where I just tested it. The documentation says that this is because it's hard coded, which I take to mean it's done in the C code that underlies the module, since I didn't see anything in the csv.py file that I could change.

Does anyone know a way around this limitation (patch, another CSV module, etc.)? I really need to read in a file where I can't use carriage returns or new lines as the line terminator because those characters will appear in some of the fields, and I'd like to avoid writing my own custom reader code if possible, even though that would be rather simple to meet my needs.
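The asymmetry described in the question can be reproduced with a short script. This is only a sketch, assuming Python 3 (the question was tested on Python 2.6) and using an in-memory buffer in place of a real file; the names `UNIT_SEP`, `RECORD_SEP`, `written`, and `rows` are illustrative and not part of the original post.

```python
import csv
import io

UNIT_SEP = "\x1f"    # ASCII unit separator (^_ in caret notation)
RECORD_SEP = "\x1e"  # ASCII record separator (^^ in caret notation)

# Writing: the writer honors the lineterminator argument.
buf = io.StringIO()
writer = csv.writer(buf, delimiter=UNIT_SEP, lineterminator=RECORD_SEP)
writer.writerow(["a", "b"])
writer.writerow(["c", "d"])
written = buf.getvalue()

# Reading: the reader accepts lineterminator but ignores it, and only
# breaks records on carriage return or line feed. The two records
# therefore come back merged into a single row.
rows = list(csv.reader(io.StringIO(written),
                       delimiter=UNIT_SEP,
                       lineterminator=RECORD_SEP))

print(repr(written))
print(rows)
```

The reader returns one row instead of two, with the record separator left inside the field text, which is the limitation the question asks about.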

Recommended answer

Why not supply a custom iterable to the csv.reader function? Here is a naive implementation which reads the entire contents of the CSV file into memory at once (which may or may not be desirable, depending on the size of the file):

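import csv

def records(path):
    # Read the whole file into memory, then split on the ASCII record
    # separator (0x1E, written ^^ in caret notation).
    with open(path) as f:
        contents = f.read()
    return contents.split('\x1e')

# Fields are delimited by the ASCII unit separator (0x1F, written ^_).
reader = csv.reader(records('input.csv'), delimiter='\x1f')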

I think that should work.
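If the file is too large to read in one go, the same idea can be applied in chunks. The following is a minimal sketch, assuming Python 3 and assuming the fields are never quoted, so that splitting on the unit separator is enough. It bypasses csv.reader entirely, because as far as I can tell csv.reader still treats a bare carriage return or line feed inside an unquoted field as a line break, which is exactly what the question needs to avoid. The names `iter_records`, `iter_rows`, and `chunk_size` are hypothetical helpers, not part of the csv module.

```python
import io

UNIT_SEP = "\x1f"    # ASCII unit separator (^_ in caret notation)
RECORD_SEP = "\x1e"  # ASCII record separator (^^ in caret notation)

def iter_records(stream, chunk_size=64 * 1024):
    """Yield one record at a time from a text stream."""
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last separator is complete; the tail
        # may be a partial record, so keep it for the next chunk.
        *complete, buffer = buffer.split(RECORD_SEP)
        for record in complete:
            yield record
    if buffer:
        # Final record when the file has no trailing separator.
        yield buffer

def iter_rows(stream, chunk_size=64 * 1024):
    """Yield each record as a list of fields (assumes no quoting)."""
    for record in iter_records(stream, chunk_size):
        yield record.split(UNIT_SEP)

# Usage with an in-memory stream; a real file would be opened with
# open(path, newline="") so that embedded newlines are left untouched.
sample = io.StringIO("a\x1fb\nc\x1ed\x1fe\x1e")
for row in iter_rows(sample, chunk_size=3):
    print(row)
```

The small `chunk_size` in the usage example is only there to exercise records that straddle chunk boundaries. Embedded line feeds stay inside their fields because nothing in this sketch treats them specially.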
