UTF-8 error with Python and gettext

Question

My editor saves files as UTF-8, so every string shown here is UTF-8 encoded in the file.

I have a Python script like this:

# -*- coding: utf-8 -*-
...
parser = optparse.OptionParser(
  description=_('automates the dice rolling in the classic game "risk"'), 
  usage=_("usage: %prog attacking defending"))

I then ran xgettext to extract the strings and got a .pot file, which boils down to:

"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"

#: auto_dice.py:16
msgid "automates the dice rolling in the classic game \"risk\""
msgstr ""

After that, I used msginit to create a de.po, which I filled in like this:

"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: auto_dice.py:16
msgid "automates the dice rolling in the classic game \"risk\""
msgstr "automatisiert das Würfeln bei \"Risiko\""

When I run the script, I get the following error:

  File "/usr/lib/python2.6/optparse.py", line 1664, in print_help
    file.write(self.format_help().encode(encoding, "replace"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 60: ordinal not in range(128)

How can I fix this?

Recommended answer

That error means encode was called on a byte string. Python 2 first implicitly decodes the byte string to Unicode using the system default encoding (ascii on Python 2), and only then re-encodes it with the encoding you specified. The implicit decode fails on the first non-ASCII byte. Here that byte is 0xc3, which is the first byte of "ü" in UTF-8, so it most likely comes from "Würfeln" in the translation.
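The implicit step can be written out by hand. The following is only a sketch of that behaviour, not optparse's or gettext's actual code: the helper encode_like_python2 is a hypothetical name, and the byte string assumes the translation arrives as raw UTF-8 bytes, as the traceback suggests.

```python
# -*- coding: utf-8 -*-
# Assumption: the translated message reaches optparse as raw UTF-8 bytes.
# \u00fc is the letter "ü".
translated = u'automatisiert das W\u00fcrfeln bei "Risiko"'.encode('utf-8')


def encode_like_python2(byte_string, encoding):
    """Hypothetical helper: what str.encode() does implicitly on Python 2."""
    # Step 1: decode with the default codec (ascii on Python 2).
    text = byte_string.decode('ascii')
    # Step 2: perform the encode the caller actually asked for.
    return text.encode(encoding, 'replace')


try:
    encode_like_python2(translated, 'utf-8')
    error = None
except UnicodeDecodeError as exc:
    # Step 1 fails before the requested encoding is ever used.
    error = exc

print(error)
```

The exception is raised in step 1, which is why the message names the 'ascii' codec even though the calling code never mentions ascii.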

The usual fix is to call s.decode('utf-8') (or whichever encoding the strings are actually in) before using the strings, so that everything downstream works with unicode objects. Using unicode literals, such as u'automates...', might also work, but that depends on how the strings from the .po file are substituted, which I don't know about.
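As a minimal sketch of the decode-first approach, assuming the translated string is UTF-8 (which is what the de.po header declares), the bytes are decoded once and the resulting unicode object can then be encoded for any output without error. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# UTF-8 bytes of the German translation; \xc3\xbc is the letter "ü".
raw = b'automatisiert das W\xc3\xbcrfeln bei "Risiko"'

# Decode explicitly with the encoding the bytes are really in.
text = raw.decode('utf-8')

# Encoding a unicode object never triggers an implicit ascii decode.
# With "replace", characters the target codec lacks become "?".
encoded = text.encode('ascii', 'replace')

print(encoded)
```

The call with "replace" mirrors the one optparse makes in print_help, so once the help text is unicode the worst case is a "?" in the output rather than a traceback.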

This sort of confusing behaviour is improved in Python 3, which won't try to convert bytes to unicode unless you explicitly tell it to.
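A short illustration of that difference, assuming the code runs under Python 3: bytes objects have no encode method there, and mixing bytes with text raises a TypeError immediately instead of triggering a hidden ascii decode.

```python
# Assumes Python 3. \xc3\xbc is the UTF-8 encoding of "ü".
raw = b'W\xc3\xbcrfeln'

# bytes has no encode() in Python 3, so the mistake from the
# traceback cannot happen silently.
has_encode = hasattr(raw, 'encode')

try:
    'prefix: ' + raw  # mixing text and bytes
    mixed_ok = True
except TypeError:
    # No implicit conversion is attempted.
    mixed_ok = False

# The conversion only happens when requested explicitly.
text = raw.decode('utf-8')

print(has_encode, mixed_ok, text)
```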
