CSV reader behavior with None and empty string

Question

I'd like to distinguish between None and empty strings ('') when going back and forth between Python data structures and their CSV representation using Python's csv module.

My issue is that when I run:

import csv
import io

data = [['NULL/None value', None],
        ['empty string', '']]

# Write the rows to an in-memory buffer.
f = io.StringIO(newline='')
csv.writer(f).writerows(data)

# Read the same text back.
f = io.StringIO(f.getvalue(), newline='')
data2 = [e for e in csv.reader(f)]

print("input : ", data)
print("output: ", data2)

I get the following output:

input :  [['NULL/None value', None], ['empty string', '']]
output:  [['NULL/None value', ''], ['empty string', '']]

Of course, I could play with data and data2 to distinguish None and empty strings with things like:

data = [[d if d is not None else 'None' for d in row] for row in data]
data2 = [[d if d != 'None' else None for d in row] for row in data2]

But that would partly defeat the purpose of using the csv module (quick deserialization/serialization implemented in C, especially when dealing with large lists).

Is there a csv.Dialect or parameters to csv.writer and csv.reader that would enable them to distinguish between '' and None in this use-case?

If not, would there be an interest in implementing a patch to csv.writer to enable this kind of back and forth? (Possibly a Dialect.None_translate_to parameter defaulting to '' to ensure backward compatibility.)

Recommended answer

The documentation suggests that what you want is not possible:

To make it as easy as possible to interface with modules which implement the DB API, the value None is written as the empty string.

This is in the documentation for the writer class, suggesting it is true for all dialects and is an intrinsic limitation of the csv module.
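
Given that limitation, one workaround is the sentinel substitution mentioned in the question, wrapped in a pair of helpers. The following is only a minimal sketch: write_rows, read_rows, and NULL_MARKER are hypothetical names chosen for illustration, not part of the csv module, and the marker is assumed never to occur as a real field value. The substitution still runs cell by cell in Python, so it does not avoid the overhead the question is concerned about.

```python
import csv
import io

# Hypothetical marker standing in for None; assumed never to appear
# as a genuine field value in the data.
NULL_MARKER = "\\N"


def write_rows(f, rows, marker=NULL_MARKER):
    """Write rows with csv.writer, replacing each None with the marker."""
    writer = csv.writer(f)
    writer.writerows(
        [marker if cell is None else cell for cell in row] for row in rows
    )


def read_rows(f, marker=NULL_MARKER):
    """Read rows with csv.reader, turning the marker back into None."""
    return [
        [None if cell == marker else cell for cell in row]
        for row in csv.reader(f)
    ]


data = [['NULL/None value', None],
        ['empty string', '']]

buf = io.StringIO(newline='')
write_rows(buf, data)

buf.seek(0)
data2 = read_rows(buf)
```

With these helpers, data2 compares equal to data: the None in the first row is restored and the empty string in the second row stays an empty string. Note that csv.reader returns every field as a string, so only None and str values round-trip unchanged this way.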

I for one would support changing this (along with various other limitations of the csv module), but it may be that people would want to offload this sort of work into a different library, and keep the CSV module simple (or at least as simple as it is).

If you need more powerful file-reading capabilities, you might want to look at the CSV reading functions in numpy, scipy, and pandas, which as I recall have more options.
