How do I tell Python that sys.argv is in Unicode?
Question
Here is a small program:
import sys
f = sys.argv[1]
print type(f)
print u"f=%s" % (f)
Here is what happens when I run it:
$ python x.py 'Recent/רשימת משתתפים.LNK'
<type 'str'>
Traceback (most recent call last):
  File "x.py", line 5, in <module>
    print u"f=%s" % (f)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 7: ordinal not in range(128)
$
The problem is that Python treats sys.argv[1] as an ASCII string, which it cannot convert to Unicode. But I'm using a Mac with a fully Unicode-aware Terminal, so x.py is actually receiving a Unicode string. How do I tell Python that sys.argv[] is Unicode and not ASCII? Failing that, how do I convert ASCII (that has Unicode inside it) into Unicode? The obvious conversions don't work.
Recommended answer
The UnicodeDecodeError you see occurs because you are mixing the Unicode string u"f=%s" with the bytestring sys.argv[1]. Use the same type for both:
Both bytestrings:
$ python -c'import sys; print "f=%s" % (sys.argv[1],)' 'Recent/רשימת משתתפים'
This passes the bytes through unchanged from and to your terminal. It works for any encoding.
Both Unicode:
$ python -c'import sys; print u"f=%s" % (sys.argv[1].decode("utf-8"),)' 'Rec..
Here you should replace 'utf-8' with the encoding your terminal uses. You might use sys.getfilesystemencoding() here if the terminal is not Unicode-aware.
Both commands produce the same output:
f=Recent/רשימת משתתפים
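To make the cause of the original error concrete, here is a minimal sketch. It assumes the terminal sends UTF-8, so the argument arrives as "Recent/" followed by the bytes 0xd7 0xa8 (the first Hebrew letter of the file name). The variable names are illustrative only. When Python 2 formats a bytestring into a Unicode string it decodes the bytes with the default ASCII codec; the sketch performs that decode explicitly so the same failure can be observed, then decodes with the correct encoding.

```python
# Bytes as a UTF-8 terminal would pass them (assumption): "Recent/" + U+05E8.
raw = b"Recent/\xd7\xa8"

# Python 2 does this ASCII decode implicitly when a bytestring is formatted
# into a Unicode string; doing it explicitly raises the same kind of error.
try:
    raw.decode("ascii")
    implicit_ok = True
except UnicodeDecodeError as exc:
    implicit_ok = False
    failing_position = exc.start  # index of the first non-ASCII byte

# Decoding explicitly with the encoding the bytes were written in succeeds,
# and the result can be formatted into a Unicode string safely.
text = raw.decode("utf-8")
message = u"f=%s" % (text,)
```

The failing position matches the "position 7" reported in the traceback, because "Recent/" occupies indexes 0 through 6.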
In general, you should convert bytestrings that you consider to be text to Unicode as early as possible.
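One way to apply that advice is to decode the arguments once, at the start of the program, and work only with Unicode afterwards. The sketch below is an assumption about how this could be organized, not part of the original answer: to_text and decode_args are hypothetical helper names, and the fallback to "utf-8" when no encoding is reported is an arbitrary choice.

```python
import sys


def to_text(value, encoding=None):
    """Return value as Unicode text, decoding it only if it is a bytestring."""
    if isinstance(value, bytes):
        # Hypothetical fallback chain: explicit encoding, then the file system
        # encoding, then UTF-8 if the interpreter reports none.
        return value.decode(encoding or sys.getfilesystemencoding() or "utf-8")
    return value  # already text, leave it untouched


def decode_args(args, encoding=None):
    """Decode every command-line argument in one place."""
    return [to_text(arg, encoding) for arg in args]


if __name__ == "__main__":
    # Decode once at the boundary; the rest of the program sees only text.
    for arg in decode_args(sys.argv[1:]):
        sys.stdout.write(u"f=%s\n" % (arg,))
```

Under Python 3, sys.argv already contains text, so decode_args returns the arguments unchanged.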