Python, Docker - 'ascii' codec can't encode character

Question

I wrote a Python 3 script that does some web scraping and stores the results in a CSV file. The script works fine on my computer; the problem appears when I run it in a Docker container. The error seems to come from this part of my code (simplified further for the purposes of this question).

# default CSV module
import csv

# this is what an actual row looks like in my program, included in case it is important
row = {'title': 'Electrochemical sensor for the determination of dopamine in presence of high concentration of ascorbic acid using a Fullerene-C60 coated gold electrode', 'url': 'https://onlinelibrary.wiley.com/doi/abs/10.1002/elan.200704073', 'author': 'Goyal, Rajendra Nath and Gupta, Vinod Kumar and Bachheti, Neeta and Sharma, Ram Avatar', 'abstract': 'A fullerene‐C60‐modified gold electrode is employed for the determination of dopamine in the excess of ascorbic acid using square‐wave voltammetry. Based on its strong catalytic function towards the oxidation of dopamine and ascorbic acid, the overlapping voltammetric …', 'eprint': 'http://www.academia.edu/download/3909892/Dopamene.pdf', 'publisher': 'Wiley Online Library', 'year': '2008', 'pages': '757--764', 'number': '7', 'volume': '20', 'journal': 'Electroanalysis: An International Journal Devoted to Fundamental and Practical Aspects of Electroanalysis', 'ENTRYTYPE': 'article', 'ID': 'goyal2008electrochemical'}

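# the CSV writer object; DictWriter needs an open file object, not a file name
# ('toMYSQL' is a custom dialect registered elsewhere in my program, not shown here)
csv_file = open("file.csv", "w", newline="")
writer = csv.DictWriter(csv_file, fieldnames=list(row.keys()), dialect='toMYSQL')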

# this is the source of the problem!
writer.writerow(row)

I understand that containers ship with only the bare bones, which may mean the encoding my script uses is not supported. So I added this at the start of my script (below the usual shebang):

# coding=utf-8

These are the locales in my Docker container:

$ locale -a

C
C.UTF-8
POSIX
en_US.utf8
es_CR.utf8

I have many more on my PC, but that shouldn't matter much, since en_US.utf8 covers everything in English and es_CR.utf8 covers everything in Spanish (most, if not all, of my results are in English).

I'm using Python 3, so I know all strings are Unicode. Maybe that's related to the problem?

$ python3 --version
Python 3.6.5

Despite all that, when I run my program I get the following error message as soon as the script tries to print the row on the console:

Exception in thread Thread-6:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/Systematic-Mapping-Engine/sysmapengine/scraper.py", line 100, in build_csv
    writer.writerow(clean_row)
  File "/usr/lib/python3.6/csv.py", line 155, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
UnicodeEncodeError: 'ascii' codec can't encode character '\u2010' in position 262: ordinal not in range(128)

Recommended answer

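Most containers start with LANG=C set, which can be really annoying if you're dealing with UTF-8. Under that locale, Python 3.6 falls back to ASCII as its default text encoding, so any non-ASCII character, such as the '\u2010' hyphen in your traceback, raises UnicodeEncodeError when it is written out.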
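
One way to check this from inside the container is to ask Python which encodings it picks by default. The snippet below is only an illustrative sketch and not part of the original answer; the helper name describe_encodings is made up for this example.

```python
import locale
import sys


def describe_encodings():
    """Return the encodings Python uses by default for files and stdout."""
    return {
        # Encoding open() falls back to when no encoding= argument is given
        "preferred": locale.getpreferredencoding(False),
        # Encoding print() uses when writing to the console
        "stdout": sys.stdout.encoding,
    }


print(describe_encodings())

# U+2010 (HYPHEN) is the character from the traceback; ASCII cannot represent it
try:
    "fullerene\u2010C60".encode("ascii")
except UnicodeEncodeError as exc:
    print("ascii failed:", exc)

# UTF-8 encodes the same string without error
print("fullerene\u2010C60".encode("utf-8"))
```

With LANG=C on Python 3.6, the first line will typically report an ASCII encoding (often spelled ANSI_X3.4-1968). The `# coding=utf-8` line does not change these values: it only declares the encoding of the source file itself.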

To make sure your container starts with the right locale, add -e LANG=C.UTF-8 when calling docker run.
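
Independently of the locale, the script can also name the encoding when it opens the file. This is a different technique from the LANG setting above, shown here as a minimal sketch: it assumes the CSV file is opened with the built-in open(), and the file path, the shortened row, and the default dialect are placeholders rather than the asker's real values.

```python
import csv
import os
import tempfile

# Shortened, hypothetical row containing U+2010, the character from the traceback
row = {"title": "A fullerene\u2010C60\u2010modified gold electrode", "year": "2008"}

# Placeholder path in the temp directory (not the asker's real file)
path = os.path.join(tempfile.gettempdir(), "example_rows.csv")

# encoding="utf-8" makes the output independent of LANG;
# newline="" is what the csv module documentation recommends for csv files
with open(path, "w", encoding="utf-8", newline="") as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=list(row.keys()))
    writer.writeheader()
    writer.writerow(row)

# Read the file back with the same encoding to confirm the round trip
with open(path, encoding="utf-8", newline="") as csv_file:
    rows = list(csv.DictReader(csv_file))

print(rows[0]["title"] == row["title"])
```

Setting the encoding explicitly keeps the file output stable even if the container is later started without the LANG variable, although print() calls would still depend on the locale.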
