How to write utf8 to standard output in a way that works with python2 and python3

Problem description

I want to write a non-ASCII character, let's say →, to standard output. The tricky part seems to be that some of the data I want to concatenate to that string is read from JSON. Consider the following simple JSON document:

{"foo":"bar"}

I include this because if I just want to print → then it seems enough to simply write:

print("→")

and it will do the right thing in python2 and python3.

So I want to print the value of foo together with my non-ASCII character →. The only way I found to do this so that it works in both python2 and python3 is:

getattr(sys.stdout, 'buffer', sys.stdout).write(data["foo"].encode("utf8")+u"→".encode("utf8"))

or

getattr(sys.stdout, 'buffer', sys.stdout).write((data["foo"]+u"→").encode("utf8"))

It is important not to miss the u in front of "→", because otherwise python2 will throw a UnicodeDecodeError.
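
As a rough illustration of why the u prefix matters: without it, python2 treats "→" as a byte string and implicitly decodes it as ASCII before encoding or concatenating it with text. The sketch below writes that implicit step out explicitly so that it also runs on python3; the variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# In python2, "→" without the u prefix is a byte string holding UTF-8 bytes.
arrow_bytes = b'\xe2\x86\x92'

# python2 decodes such a byte string as ASCII before it can encode it or
# join it with unicode text. That hidden step is spelled out here.
try:
    arrow_bytes.decode('ascii')
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

# With the u prefix the literal is already text, so encoding simply works.
encoded = u'\u2192'.encode('utf8')

print(error_name)
print(encoded == arrow_bytes)
```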

Using the print function like this:

print((data["foo"]+u"→").encode("utf8"), file=(getattr(sys.stdout, 'buffer', sys.stdout)))

doesn't seem to work, because python3 will complain with TypeError: 'str' does not support the buffer interface.
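
A minimal sketch of what appears to go wrong here, for python3 only: print() converts its arguments to text and hands a str to file.write(), which a binary stream rejects. An io.BytesIO object stands in for sys.stdout.buffer so the effect can be observed, and wrapping it in io.TextIOWrapper is one possible way to let print() do the encoding. Newer python3 releases word the TypeError message differently from the one quoted above.

```python
import io

binary_stream = io.BytesIO()  # stand-in for sys.stdout.buffer

# print() calls file.write() with a str, even when it is given bytes,
# so writing to a binary stream fails on python3.
try:
    print(u'bar\u2192'.encode('utf8'), file=binary_stream)
    failed = False
except TypeError:
    failed = True

# A text layer on top of the binary stream accepts str and encodes it.
text_stream = io.TextIOWrapper(binary_stream, encoding='utf-8')
print(u'bar\u2192', file=text_stream)
text_stream.flush()
written = binary_stream.getvalue()

print(failed)
```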

Did I find the best way or is there a better option? Can I make the print function work?

Solution

The most concise approach I could come up with is the following. You may be able to shorten it further with a few convenience functions (or even by replacing/overriding the print function):

# -*- coding=utf-8 -*-
import codecs
import os
import sys

# if you include the -*- coding line, you can use this
output = 'bar' + u'→'
# otherwise, use this
output = 'bar' + b'\xe2\x86\x92'.decode('utf-8')

if sys.stdout.encoding == 'UTF-8':
    print(output)
else:
    output += os.linesep
    if sys.version_info[0] >= 3:
        sys.stdout.buffer.write(output.encode('utf-8'))
    else:
        codecs.getwriter('utf-8')(sys.stdout).write(output)

The best option is to use the -*- coding line, which allows you to use the actual character in the file. If for some reason you can't use the coding line, it's still possible to accomplish this without it.
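
One way such a convenience function might look is sketched below. The name write_utf8 and its stream parameter are hypothetical, not part of any library. It reuses the getattr(sys.stdout, 'buffer', sys.stdout) idea from the question and always emits UTF-8 bytes, regardless of the terminal's encoding.

```python
import io
import sys


def write_utf8(text, stream=None):
    """Write text plus a newline to stream as UTF-8 bytes (hypothetical helper)."""
    if stream is None:
        stream = sys.stdout
    # python3 text streams expose their binary layer as .buffer;
    # python2's sys.stdout accepts bytes directly.
    target = getattr(stream, 'buffer', stream)
    target.write((text + u'\n').encode('utf-8'))
    target.flush()


# Demonstration against an in-memory binary stream instead of the terminal.
demo = io.BytesIO()
write_utf8(u'bar' + u'\u2192', demo)
result = demo.getvalue()
```

Calling write_utf8(u'bar\u2192') without a stream writes to sys.stdout. Because the bytes bypass the text layer on python3, mixing this helper with ordinary print() calls may reorder output unless the text stream is flushed first.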

This (both with and without the encoding line) works on Linux (Arch) with python 2.7.7 and 3.4.1. It also works if the terminal's encoding is not UTF-8. (On Arch Linux, I just change the encoding by using a different LANG environment variable.)

LANG=zh_CN python test.py

It also sort of works on Windows, which I tried with 2.6, 2.7, 3.3, and 3.4. By sort of, I mean I could get the '→' character to display only on a mintty terminal. On a cmd terminal, that character would display as 'ΓåÆ'. (There may be something simple I'm missing there.)
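
The 'ΓåÆ' output is consistent with the console reading the three UTF-8 bytes of '→' as three separate characters of a legacy code page. The sketch below reproduces that, assuming the cmd window was using code page 437 (an assumption; the active code page is not stated above).

```python
# The UTF-8 encoding of U+2192 (the arrow) is the three bytes E2 86 92.
utf8_bytes = u'\u2192'.encode('utf-8')

# Assumption: the console decodes output bytes as code page 437.
misread = utf8_bytes.decode('cp437')

# U+0393, U+00E5, U+00C6 are the three characters seen in the cmd window.
print(misread == u'\u0393\u00e5\u00c6')
```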
