Python3: UnicodeEncodeError: 'ascii' codec can't encode character '\xfc'
Question
I'm trying to get a very simple example running on OS X with Python 3.5.1, but I'm really stuck. I have read many articles that deal with similar problems, but I cannot fix this myself. Do you have any hints on how to resolve this issue?
I would like the output to be correctly encoded as latin-1, as defined in mylist, without any errors.
My code:
# coding=<latin-1>
mylist = [u'Glück', u'Spaß', u'Ähre',]
print(mylist)
Error:
Traceback (most recent call last):
File "/Users/abc/test.py", line 4, in <module>
print(mylist)
UnicodeEncodeError: 'ascii' codec can't encode character '\xfc' in position 4: ordinal not in range(128)
This is how I can avoid the error, although it still does not solve the problem with stdout (print):
mylist = [u'Glück', u'Spaß', u'Ähre',]
for w in mylist:
    print(w.encode("latin-1"))
The output I get is:
b'Gl\xfcck'
b'Spa\xdf'
b'\xc4hre'
What 'locale' shows me:
LANG="de_AT.UTF-8"
LC_COLLATE="de_AT.UTF-8"
LC_CTYPE="de_AT.UTF-8"
LC_MESSAGES="de_AT.UTF-8"
LC_MONETARY="de_AT.UTF-8"
LC_NUMERIC="de_AT.UTF-8"
LC_TIME="de_AT.UTF-8"
LC_ALL=
What 'python3' shows me:
Python 3.5.1 (default, Jan 22 2016, 08:54:32)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
Recommended answer
Remove the characters < and >:
# coding=latin-1
Those characters are often used in examples to indicate where the encoding name goes, but the literal characters < and > should not be included in your file.
For that to work, your file must actually be saved with latin-1 encoding. If your file is saved as utf-8 instead, the encoding line should be
# coding=utf-8
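If you are not sure how the file was saved, a rough check is to read its raw bytes and see whether they decode as utf-8. The sketch below is only an illustration: the helper name guess_coding_line is hypothetical, and the fallback to latin-1 is an assumption, since latin-1 accepts every byte value and so can never be confirmed this way. A file containing only ASCII characters will be reported as utf-8, which is harmless.

```python
def guess_coding_line(raw: bytes) -> str:
    """Suggest a coding line for source bytes (hypothetical helper).

    Assumption: the file is either utf-8 or latin-1. latin-1 cannot be
    verified, so it is used only as the fallback.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes such as a lone 0xfc are not valid utf-8.
        return "# coding=latin-1"
    return "# coding=utf-8"


# The same word stored in the two encodings discussed here.
# "\u00fc" is the character ü, written as an escape so this snippet
# does not depend on how it is saved.
print(guess_coding_line(b"Gl\xfcck"))
print(guess_coding_line("Gl\u00fcck".encode("utf-8")))
```

To apply it to a real file, read the file in binary mode, for example with open(path, "rb"), and pass the resulting bytes to the helper.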
For example, when I run this script (saved as a file with latin-1 encoding):
# coding=latin-1
mylist = [u'Glück', u'Spaß', u'Ähre',]
print(mylist)
for w in mylist:
    print(w.encode("latin-1"))
I get this output (with no errors):
['Glück', 'Spaß', 'Ähre']
b'Gl\xfcck'
b'Spa\xdf'
b'\xc4hre'
That output looks correct. For example, the latin-1 encoding of ü is '\xfc'.
I used my editor to save the file with latin-1 encoding. The contents of the file in hexadecimal are:
$ hexdump -C codec-question.py
00000000 23 20 63 6f 64 69 6e 67 3d 6c 61 74 69 6e 2d 31 |# coding=latin-1|
00000010 0a 0a 6d 79 6c 69 73 74 20 3d 20 5b 75 27 47 6c |..mylist = [u'Gl|
00000020 fc 63 6b 27 2c 20 75 27 53 70 61 df 27 2c 20 75 |.ck', u'Spa.', u|
00000030 27 c4 68 72 65 27 2c 5d 0a 70 72 69 6e 74 28 6d |'.hre',].print(m|
00000040 79 6c 69 73 74 29 0a 0a 66 6f 72 20 77 20 69 6e |ylist)..for w in|
00000050 20 6d 79 6c 69 73 74 3a 0a 20 20 20 20 70 72 69 | mylist:. pri|
00000060 6e 74 28 77 2e 65 6e 63 6f 64 65 28 22 6c 61 74 |nt(w.encode("lat|
00000070 69 6e 2d 31 22 29 29 0a |in-1")).|
00000078
Note that the first byte in the third line of the dump (i.e. the byte at offset 0x20) is fc. That is the latin-1 encoding of ü. If the file were encoded using utf-8, the character ü would be represented by two bytes, c3 bc.
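The byte values mentioned above can be reproduced from within Python. This is a minimal sketch; the variable names are only illustrative, and the word is written with a \u escape so the result does not depend on the encoding of the snippet itself.

```python
word = "Gl\u00fcck"  # "Glück"; \u00fc is the character ü

latin1_bytes = word.encode("latin-1")
utf8_bytes = word.encode("utf-8")

print(latin1_bytes)  # b'Gl\xfcck'      (ü is the single byte fc)
print(utf8_bytes)    # b'Gl\xc3\xbcck'  (ü is the two bytes c3 bc)

# Decoding utf-8 bytes as latin-1 does not raise an error; it silently
# produces two wrong characters in place of ü.
misread = utf8_bytes.decode("latin-1")
print(ascii(misread))  # 'Gl\xc3\xbcck'
```

The last step shows why the coding line has to match how the file is really saved: a mismatch in that direction gives garbled text rather than an exception.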