Create a UTF-8 CSV file in Python
Question
I can't create a UTF-8 CSV file in Python.
I'm reading the csv module's docs, and the examples section says:
For all other encodings the following UnicodeReader and UnicodeWriter classes can be used. They take an additional encoding parameter in their constructor and make sure that the data passes the real reader or writer encoded as UTF-8:
OK. So I have this code:
values = (unicode("Ñ", "utf-8"), unicode("é", "utf-8"))
f = codecs.open('eggs.csv', 'w', encoding="utf-8")
writer = UnicodeWriter(f)
writer.writerow(values)
I keep getting this error:
line 159, in writerow
self.stream.write(data)
File "/usr/lib/python2.6/codecs.py", line 686, in write
return self.writer.write(data)
File "/usr/lib/python2.6/codecs.py", line 351, in write
data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 22: ordinal not in range(128)
Can someone please shed some light on what I'm doing wrong? I set the encoding everywhere before calling the UnicodeWriter class.
class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)
Recommended answer
You don't have to use codecs.open; UnicodeWriter takes Unicode input and takes care of encoding everything into UTF-8. By the time UnicodeWriter writes to the file handle you passed to it, everything is already UTF-8 encoded, so it works with a normal file opened with open.
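A minimal sketch of the same idea on Python 3, which is an assumption, since the question targets Python 2.6. There the csv module accepts str directly, so the built-in open with an encoding argument is enough and UnicodeWriter is not needed. The file name eggs.csv comes from the question; the temporary directory only keeps the sketch self-contained.

```python
import csv
import os
import tempfile

# On Python 3, str is already Unicode, so no unicode(...) call is needed.
values = ("Ñ", "é")

# Hypothetical location: a fresh temporary directory for the example file.
path = os.path.join(tempfile.mkdtemp(), "eggs.csv")

# newline="" lets the csv module control line endings itself.
with open(path, "w", encoding="utf-8", newline="") as f:
    csv.writer(f).writerow(values)

# Read the raw bytes back to confirm the file really is UTF-8.
with open(path, "rb") as f:
    raw = f.read()

print(raw.decode("utf-8"))
```

On Python 2, the corresponding change would presumably be to pass a plain open('eggs.csv', 'wb') handle to UnicodeWriter instead of the codecs.open one.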
With codecs.open, your Unicode objects are first converted to UTF-8 byte strings inside UnicodeWriter, and the codecs stream then tries to encode those byte strings into UTF-8 again as if they were Unicode strings, which fails.
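The failure can be reproduced in isolation. This sketch runs on Python 3 and spells out, as an assumption about what Python 2 does implicitly, the ASCII decode that happens when a byte string is handed to a writer expecting Unicode. The helper name reencode_like_py2 is hypothetical and exists only for illustration.

```python
# What UnicodeWriter hands to the stream: text already encoded as UTF-8 bytes.
data = "Ñ,é\r\n".encode("utf-8")


def reencode_like_py2(data):
    # A Python 2 codecs stream expects unicode. Given a byte string, it
    # first coerces it to unicode with the default ASCII codec, then encodes.
    text = data.decode("ascii")
    return text.encode("utf-8")


try:
    reencode_like_py2(data)
    error = None
except UnicodeDecodeError as exc:
    # The first byte of "Ñ" in UTF-8 is 0xc3, which is outside ASCII.
    error = exc

print(error)
```

The codec name and the offending byte 0xc3 match the traceback in the question: the bytes were valid UTF-8, but they were decoded as ASCII before being encoded a second time.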