How to make Python 3 print() output UTF-8


Question

How can I make Python 3 (3.1) print("Some text") to stdout in UTF-8, or how can I output raw bytes?

import sys

TestText = "Test - āĀēĒčČ..šŠūŪžŽ" # this is UTF-8
TestText2 = b"Test2 - \xc4\x81\xc4\x80\xc4\x93\xc4\x92\xc4\x8d\xc4\x8c..\xc5\xa1\xc5\xa0\xc5\xab\xc5\xaa\xc5\xbe\xc5\xbd" # just bytes
print(sys.getdefaultencoding())
print(sys.stdout.encoding)
print(TestText)
print(TestText.encode("utf8"))
print(TestText.encode("cp1252","replace"))
print(TestText2)

Output (the console uses CP1257; I replaced the characters with their byte values, written as [xNN]):

utf-8
cp1257
Test - [xE2][xC2][xE7][xC7][xE8][xC8]..[xF0][xD0][xFB][xDB][xFE][xDE]
b'Test - \xc4\x81\xc4\x80\xc4\x93\xc4\x92\xc4\x8d\xc4\x8c..\xc5\xa1\xc5\xa0\xc5\xab\xc5\xaa\xc5\xbe\xc5\xbd'
b'Test - ??????..\x9a\x8a??\x9e\x8e'
b'Test2 - \xc4\x81\xc4\x80\xc4\x93\xc4\x92\xc4\x8d\xc4\x8c..\xc5\xa1\xc5\xa0\xc5\xab\xc5\xaa\xc5\xbe\xc5\xbd'

print is just too smart... :D There's no point in passing encoded text to print (it only shows the representation of the bytes, not the real bytes), and it's impossible to output bytes at all, because print always encodes its output in sys.stdout.encoding.

For example, print(chr(255)) raises an error:


Traceback (most recent call last):
  File "Test.py", line 1, in <module>
    print(chr(255));
  File "H:\Python31\lib\encodings\cp1257.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xff' in position 0: character maps to <undefined>


By the way, print(TestText == TestText2.decode("utf8")) returns False, although the printed output is the same.

How does Python 3 determine sys.stdout.encoding, and how can I change it?

I made a printRAW() function that works fine (it actually encodes the output as UTF-8, so it isn't really raw...):

 def printRAW(*Text):
     RAWOut = open(1, 'w', encoding='utf8', closefd=False)
     print(*Text, file=RAWOut)
     RAWOut.flush()
     RAWOut.close()

 printRAW("Cool", TestText)

Output (now printed in UTF-8):


Cool Test - āĀēĒčČ..šŠūŪžŽ


printRAW(chr(252)) also nicely prints ü (in UTF-8, [xC3][xBC]), without errors :)

Now I'm looking for a better solution, if there is one...

Recommended answer

First, a correction:

TestText = "Test - āĀēĒčČ..šŠūŪžŽ" # this is NOT UTF-8; it is a Unicode string in Python 3.x.
TestText2 = TestText.encode('utf8') # THIS is "just bytes" in UTF-8.
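
To make the distinction concrete, here is a minimal sketch using a shortened two-character sample. The variable names are illustrative and do not come from the original post; the escapes stand for "āĀ".

```python
# str holds Unicode code points; bytes holds one particular encoding of them.
text = "\u0101\u0100"            # the two characters "āĀ" as a str
data = text.encode("utf-8")      # the same text encoded as UTF-8 bytes

char_count = len(text)           # counts characters
byte_count = len(data)           # counts bytes (each of these characters takes two)
round_trip = data.decode("utf-8")

print(char_count, byte_count, round_trip == text)
```

Decoding the bytes with the same codec gives back a string equal to the original, which is the comparison the question attempted with TestText2.decode("utf8").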

Now, to send UTF-8 to stdout regardless of the console's encoding, use the right tool for the job:

import sys
sys.stdout.buffer.write(TestText2)

sys.stdout.buffer is the binary stream underlying stdout: it accepts bytes and writes them as-is, without re-encoding them to sys.stdout.encoding.

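If print()-style convenience is wanted on top of the binary stream, one possible approach is sketched below. The helper name print_utf8 and the in-memory streams are hypothetical; the BytesIO objects stand in for sys.stdout.buffer so that the resulting bytes can be inspected.

```python
import io

def print_utf8(*values, stream, sep=" ", end="\n"):
    """Hypothetical helper: join values like print() and write them as UTF-8 bytes."""
    text = sep.join(str(v) for v in values) + end
    stream.write(text.encode("utf-8"))
    stream.flush()

# Option 1: encode by hand and write to a binary stream.
fake_stdout = io.BytesIO()          # stand-in for sys.stdout.buffer
print_utf8("Cool", "Test - \u0101\u0100", stream=fake_stdout)
raw_output = fake_stdout.getvalue()

# Option 2: wrap the binary stream so the regular print() encodes as UTF-8.
# newline="\n" disables newline translation, keeping the bytes identical on every platform.
wrapped_target = io.BytesIO()       # stand-in for sys.stdout.buffer
utf8_writer = io.TextIOWrapper(wrapped_target, encoding="utf-8", newline="\n")
print("Cool", "Test - \u0101\u0100", file=utf8_writer)
utf8_writer.flush()
wrapped_output = wrapped_target.getvalue()
```

With the real stream, sys.stdout.buffer would be passed in place of the BytesIO objects. On the question of changing sys.stdout.encoding, which the answer above does not cover: the PYTHONIOENCODING environment variable can set it before the interpreter starts, and Python 3.7 and later also provide sys.stdout.reconfigure(encoding="utf-8"). Neither is part of the original answer.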