How do you read a file inside a zip file as text, not bytes?

Question

A simple program for reading a CSV file inside a zip file works in Python 2.7, but not in Python 3.2:

$ cat test_zip_file_py3k.py 
import csv, sys, zipfile

zip_file    = zipfile.ZipFile(sys.argv[1])
items_file  = zip_file.open('items.csv', 'rU')

for row in csv.DictReader(items_file):
    pass

$ python2.7 test_zip_file_py3k.py ~/data.zip

$ python3.2 test_zip_file_py3k.py ~/data.zip
Traceback (most recent call last):
  File "test_zip_file_py3k.py", line 8, in <module>
    for row in csv.DictReader(items_file):
  File "/home/msabramo/run/lib/python3.2/csv.py", line 109, in __next__
    self.fieldnames
  File "/home/msabramo/run/lib/python3.2/csv.py", line 96, in fieldnames
    self._fieldnames = next(self.reader)
_csv.Error: iterator should return strings, not bytes (did you open the file 
in text mode?)

So the csv module in Python 3 wants to see a text file, but zipfile.ZipFile.open returns a zipfile.ZipExtFile that is always treated as binary data.
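The bytes-versus-text mismatch can be reproduced without a file on disk. The following is a minimal sketch, assuming an in-memory archive whose member name and contents ('items.csv', two short lines) are made up for illustration:

```python
import io
import zipfile

# Build a small archive in memory; the member name and contents are illustrative.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("items.csv", "id,name\n1,apple\n")

# ZipFile.open returns a binary file object, so each line is a bytes object.
with zipfile.ZipFile(buffer) as archive:
    with archive.open("items.csv") as member:
        first_line = member.readline()

print(type(first_line).__name__)
```

Because the lines come back as bytes, csv.DictReader in Python 3 rejects them with the error shown in the traceback above.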

How does one make this work in Python 3?

Recommended answer

I just noticed that Lennart's answer didn't work with Python 3.1, but it does work with Python 3.2. They've enhanced zipfile.ZipExtFile in Python 3.2 (see release notes). These changes appear to make zipfile.ZipExtFile work nicely with io.TextIOWrapper.

Incidentally, the script also works in Python 3.1 if you uncomment the hacky lines below, which monkey-patch zipfile.ZipExtFile. I would not recommend this sort of hackery; I include it only to illustrate the essence of what was done in Python 3.2 to make things work nicely.

$ cat test_zip_file_py3k.py 
import csv, io, sys, zipfile

zip_file    = zipfile.ZipFile(sys.argv[1])
items_file  = zip_file.open('items.csv', 'rU')
# items_file.readable = lambda: True
# items_file.writable = lambda: False
# items_file.seekable = lambda: False
# items_file.read1 = items_file.read
items_file  = io.TextIOWrapper(items_file)

for idx, row in enumerate(csv.DictReader(items_file)):
    print('Processing row {0} -- row = {1}'.format(idx, row))

If I had to support py3k < 3.2, then I would go with the solution in my other answer.
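For readers on a current Python 3, the same idea can be written without the 'rU' mode, which later versions of ZipFile.open no longer accept. This is only a sketch under stated assumptions: the helper name read_csv_rows, the UTF-8 default encoding, and the in-memory demo archive are all hypothetical and not part of the original answer.

```python
import csv
import io
import zipfile


def read_csv_rows(zip_source, member_name, encoding="utf-8"):
    """Return the rows of a CSV member of a zip archive as a list of dicts.

    Hypothetical helper; the encoding default is an assumption.
    """
    with zipfile.ZipFile(zip_source) as archive:
        # Wrap the binary member in a text layer. newline="" leaves line-ending
        # handling to the csv module, as the csv documentation recommends.
        with io.TextIOWrapper(
            archive.open(member_name), encoding=encoding, newline=""
        ) as text_file:
            return list(csv.DictReader(text_file))


# Demo with an in-memory archive; the data is made up for illustration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("items.csv", "id,name\r\n1,apple\r\n2,pear\r\n")

rows = read_csv_rows(buffer, "items.csv")
print(rows)
```

Passing the encoding explicitly avoids depending on the platform's default locale, which is what io.TextIOWrapper falls back to when no encoding is given.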
