How to use Python csv.DictReader with a binary file? (For a Babel custom extraction method)


Question

I'm trying to write a custom extraction method for Babel that extracts strings from a specific column in a CSV file. I followed the documentation on this.

Here is my extraction method code:

def extract_csv(fileobj, keywords, comment_tags, options):
    import csv
    reader = csv.DictReader(fileobj, delimiter=',')
    for row in reader:
        if row and row['caption'] != '':
            yield (reader.line_num, '', row['caption'], '')

When I try to run the extraction, I get this error:

  File "/Users/tiagosilva/repos/naltio/csv_extractor.py", line 18, in extract_csv
    for row in reader:
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/csv.py", line 111, in __next__
    self.fieldnames
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/csv.py", line 98, in fieldnames
    self._fieldnames = next(self.reader)
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

It seems the fileobj passed to the function was opened in binary mode.
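The error can be reproduced outside Babel. The following is a minimal sketch, assuming an in-memory io.BytesIO object stands in for the binary file object; the sample data is made up for illustration. The exact wording of the message may differ slightly between Python versions.

```python
import csv
import io

# Hypothetical sample data standing in for the real CSV file.
binary_file = io.BytesIO(b"id,caption\n1,Hello\n")

try:
    # DictReader reads the header row on first iteration and receives bytes.
    rows = list(csv.DictReader(binary_file, delimiter=','))
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

print(error_message)
```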

How can I make this work? I can think of two possible solutions, but I don't know how to code them:

1) Is there a way to use a binary file with DictReader?

2) Is there a way to signal Babel to open the file in text mode?

I'm open to other solutions not listed here.

Recommended answer

I actually found a way to do it!

It's solution 1, a way to handle a binary file. The solution is to wrap the binary file object in an io.TextIOWrapper, which decodes the bytes to text, and pass the wrapper to the DictReader.

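import csv
import io


def extract_csv(fileobj, keywords, comment_tags, options):
    # fileobj arrives in binary mode; wrap it so csv receives decoded text.
    with io.TextIOWrapper(fileobj, encoding='utf-8') as text_file:
        reader = csv.DictReader(text_file, delimiter=',')

        for row in reader:
            if row and 'caption' in row:
                # Tuple layout: (lineno, funcname, message, comments)
                yield (reader.line_num, '', row['caption'], '')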
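To try the approach without running Babel, here is a minimal sketch that feeds an in-memory binary stream through io.TextIOWrapper. The helper name extract_captions, its column and encoding parameters, and the sample data are all hypothetical. It also differs from the answer in two ways that are assumptions rather than requirements: it skips empty captions, and it detaches the wrapper instead of closing it so that the caller's file object stays open.

```python
import csv
import io


def extract_captions(fileobj, column='caption', encoding='utf-8'):
    """Yield (line number, text) for each non-empty value in `column`."""
    # newline='' lets the csv module handle line endings itself.
    text_file = io.TextIOWrapper(fileobj, encoding=encoding, newline='')
    try:
        reader = csv.DictReader(text_file, delimiter=',')
        for row in reader:
            if row.get(column):
                yield reader.line_num, row[column]
    finally:
        # Detach instead of closing, so the caller's fileobj stays open.
        text_file.detach()


# Hypothetical sample data: a binary stream holding UTF-8 encoded CSV.
sample = io.BytesIO('id,caption\n1,Olá\n2,\n3,Mundo\n'.encode('utf-8'))
captions = list(extract_captions(sample))
print(captions)
```

Closing a TextIOWrapper also closes the underlying binary stream, which is what the with statement does. Whether that matters depends on what the caller does with the file afterwards; detach() is a precaution for that case.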
