Python 3: How to specify stdin encoding
Problem description
While porting code from Python 2 to Python 3, I run into this problem when reading UTF-8 text from standard input. In Python 2, this works fine:
    for line in sys.stdin:
        ...
But Python 3 expects ASCII from sys.stdin, and if there are non-ASCII characters in the input, I get the error:
UnicodeDecodeError: 'ascii' codec can't decode byte .. in position ..: ordinal not in range(128)
For a regular file, I would specify the encoding when opening the file:
    with open('filename', 'r', encoding='utf-8') as file:
        for line in file:
            ...
But how can I specify the encoding for standard input? Other SO posts have suggested using
    input_stream = codecs.getreader('utf-8')(sys.stdin)
    for line in input_stream:
        ...
However, this doesn't work in Python 3. I still get the same error message. I'm using Ubuntu 12.04.2 and my locale is set to en_US.UTF-8.
Recommended answer
Python 3 does not expect ASCII from sys.stdin. It opens stdin in text mode and makes an educated guess as to which encoding is used. That guess may come down to ASCII, but that is not a given. See the sys.stdin documentation on how the codec is selected.
Like other file objects opened in text mode, the sys.stdin object derives from the io.TextIOBase base class; it has a .buffer attribute pointing to the underlying buffered IO instance (which in turn has a .raw attribute).
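The layering described above can be reproduced by hand. The following is only a sketch: it builds a text stream over an in-memory byte source (an assumption made so that it runs without piped input) and checks that .buffer and .raw point at the layers underneath.

```python
import io

# Build the three layers by hand over an in-memory byte source.
# io.BytesIO stands in for the raw layer here; this is an illustrative
# assumption, not how the interpreter constructs sys.stdin.
raw = io.BytesIO("h\u00e9llo\n".encode("utf-8"))
buffered = io.BufferedReader(raw)                      # buffered binary layer
text = io.TextIOWrapper(buffered, encoding="utf-8")    # decoding text layer

# The text layer exposes the buffered layer, which exposes the raw layer.
assert text.buffer is buffered
assert buffered.raw is raw

first_line = text.readline()
print(repr(first_line), text.encoding)
```

On a typical CPython build sys.stdin has the same shape, although the documentation notes that the standard streams can be replaced by objects that lack a .buffer attribute.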
Wrap the sys.stdin.buffer attribute in a new io.TextIOWrapper() instance to specify a different encoding:
    import io
    import sys
    input_stream = io.TextIOWrapper(sys.stdin.buffer, encoding='utf-8')
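As a minimal sketch of how the wrapper might be used in a line loop, the helper below (decode_lines is a hypothetical name, not part of the answer) takes any binary stream and yields decoded lines. It is shown on an in-memory io.BytesIO so that it runs without piped input; in a real script the argument would be sys.stdin.buffer.

```python
import io


def decode_lines(binary_stream, encoding="utf-8"):
    """Yield text lines decoded from a binary stream.

    decode_lines is a hypothetical helper used for illustration only.
    """
    # Put a text layer with an explicit codec on top of the byte stream.
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding)
    try:
        for line in text_stream:
            yield line.rstrip("\n")
    finally:
        # Detach so that discarding the wrapper does not close the
        # underlying binary stream.
        text_stream.detach()


# Stand-in for sys.stdin.buffer: UTF-8 bytes with non-ASCII characters.
sample = io.BytesIO("na\u00efve caf\u00e9\nsecond line\n".encode("utf-8"))
lines = list(decode_lines(sample))
print(lines)
```

The detach() call is a design choice in this sketch: without it, the wrapper would close the wrapped stream when it is garbage collected, which may be unwanted if the binary stream is used again later.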
Alternatively, set the PYTHONIOENCODING environment variable to the desired codec when running python.
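To check what the environment variable does, the sketch below starts a child interpreter with PYTHONIOENCODING set and asks it to report sys.stdin.encoding. The helper name child_stdin_encoding is hypothetical; from a shell, the equivalent would be to set the variable in front of the python command.

```python
import os
import subprocess
import sys


def child_stdin_encoding(io_encoding):
    """Run a child Python with PYTHONIOENCODING set and return the
    encoding name it reports for sys.stdin.

    child_stdin_encoding is a hypothetical helper used for illustration.
    """
    # Copy the current environment and add the override for the child only.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdin.encoding)"],
        env=env,
        input=b"",                 # give the child an empty piped stdin
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


print(child_stdin_encoding("utf-8"))
```

Because the variable is read at interpreter startup, it has to be set before the process that reads stdin is launched; changing os.environ inside an already running script does not re-open sys.stdin.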