Python CSV DictReader with UTF-8 data
Problem description
AFAIK, the Python (v2.6) csv module can't handle unicode data by default, correct? The Python docs include an example of how to read from a UTF-8 encoded file, but that example only returns each CSV row as a list.
I'd like to access the row columns by name, as csv.DictReader does, but with a UTF-8 encoded CSV input file.
Can anyone tell me how to do this efficiently? I will have to process CSV files that are hundreds of megabytes in size.
Recommended answer
Actually, I came up with an answer myself (sorry for replying to my own question):
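import csv

def UnicodeDictReader(utf8_data, **kwargs):
    csv_reader = csv.DictReader(utf8_data, **kwargs)
    for row in csv_reader:
        # Decode lazily, row by row; dict() over a generator also works on Python 2.6, which lacks dict comprehensions.
        yield dict((unicode(key, 'utf-8'), unicode(value, 'utf-8')) for key, value in row.iteritems())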
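For anyone reading this on Python 3 rather than the 2.x versions discussed above: there the csv module works on str directly, so the manual decoding step should not be needed. A minimal sketch, assuming the file is opened in text mode with encoding='utf-8'; the helper name iter_utf8_rows is made up for illustration and is not part of the csv module. It stays a generator, so a file of several hundred megabytes is still read one row at a time.

```python
import csv


def iter_utf8_rows(path, **kwargs):
    """Yield each CSV row as a dict of str, reading the file lazily.

    Hypothetical helper for illustration; kwargs are passed to csv.DictReader.
    """
    # newline='' leaves line-ending handling to the csv module.
    with open(path, 'r', encoding='utf-8', newline='') as handle:
        for row in csv.DictReader(handle, **kwargs):
            yield row
```

Usage would look like `for row in iter_utf8_rows('data.csv'): print(row['name'])`, where the file name and the column name are placeholders.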