csv reader behavior with None and empty string
Question
I'd like to distinguish None and empty strings when going back and forth between a Python data structure and its csv representation using Python's csv module.
My issue is that when I run:
import csv, cStringIO
data = [['NULL/None value',None],
['empty string','']]
f = cStringIO.StringIO()
csv.writer(f).writerows(data)
f = cStringIO.StringIO(f.getvalue())
data2 = [e for e in csv.reader(f)]
print "input : ", data
print "output: ", data2
I get the following output:
input : [['NULL/None value', None], ['empty string', '']]
output: [['NULL/None value', ''], ['empty string', '']]
Of course, I could play with data and data2 to distinguish None and empty strings with things like:
data = [[v if v is not None else 'None' for v in row] for row in data]
data2 = [[v if v != 'None' else None for v in row] for row in data2]
But that would partly defeat my interest in the csv module (quick deserialization/serialization implemented in C, especially when you are dealing with large lists).
Is there a csv.Dialect, or are there parameters to csv.writer and csv.reader, that would enable them to distinguish between '' and None in this use case?
If not, would there be interest in implementing a patch to csv.writer to enable this kind of back and forth? (Possibly a Dialect.None_translate_to parameter defaulting to '' to ensure backward compatibility.)
Recommended answer
The documentation suggests that what you want is not possible:
To make it as easy as possible to interface with modules which implement the DB API, the value None is written as the empty string.
This is in the documentation for the writer class, suggesting it is true for all dialects and is an intrinsic limitation of the csv module.
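One way to check this on a given interpreter is to write the same row once with None and once with '' under every registered dialect and compare the raw text. This is a minimal sketch, assuming Python 3 (io.StringIO stands in for cStringIO); write_row is a helper name introduced here for illustration only, not part of the csv module.

```python
import csv
import io

def write_row(row, **fmtparams):
    """Serialize a single row with csv.writer and return the raw text."""
    buf = io.StringIO()
    csv.writer(buf, **fmtparams).writerow(row)
    return buf.getvalue()

# Compare the serialized form of None and '' under each registered dialect.
for name in csv.list_dialects():
    none_line = write_row(["x", None], dialect=name)
    empty_line = write_row(["x", ""], dialect=name)
    print(name, repr(none_line), repr(empty_line), none_line == empty_line)
```

If the two serialized lines are identical for a dialect, no reader setting can recover the difference afterwards, because the information is already gone from the file.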
I for one would support changing this (along with various other limitations of the csv module), but it may be that people would want to offload this sort of work into a different library, and keep the csv module simple (or at least as simple as it is).
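Until something like that exists, the translation has to happen outside the csv module. The following is a rough sketch of such a wrapper, assuming Python 3 and assuming a sentinel string that never occurs in the real data; NONE_MARKER, dump_rows and load_rows are hypothetical names chosen for this example.

```python
import csv
import io

# Hypothetical sentinel (an assumption): it must never appear as a real value.
NONE_MARKER = r"\N"

def dump_rows(rows):
    """Write rows to CSV text, replacing None with the sentinel."""
    buf = io.StringIO()
    csv.writer(buf).writerows(
        [NONE_MARKER if v is None else v for v in row] for row in rows
    )
    return buf.getvalue()

def load_rows(text):
    """Read CSV text back, turning the sentinel into None again."""
    reader = csv.reader(io.StringIO(text))
    return [[None if v == NONE_MARKER else v for v in row] for row in reader]

data = [["NULL/None value", None], ["empty string", ""]]
data2 = load_rows(dump_rows(data))
print(data2 == data)
```

The per-cell substitution runs in Python rather than in C, so it gives up part of the speed advantage the question cares about; it is a workaround, not a replacement for dialect-level support.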
If you need more powerful file-reading capabilities, you might want to look at the CSV reading functions in numpy, scipy, and pandas, which as I recall have more options.