__repr __()函数的最佳输出类型和编码实践? [英] Best output type and encoding practices for __repr__() functions?

查看:127
本文介绍了__repr __()函数的最佳输出类型和编码实践?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

最近,我已经遇到了很多麻烦: __ repr __() format()和编码。 如果 __ repr __()的输出被编码或是unicode字符串?的结果是否有最佳编码? __repr __()在Python中?我想输出的内容有非ASCII字符。



我使用Python 2.x,并希望编写可以轻松适应Python 3的代码。程序因此使用<_ p>

 # -  *  - 编码:utf-8  -  *  -  
从__future__导入unicode_literals,print_function# 'Hello'literal表示一个Unicode对象

这里有一些额外的问题,一直困扰着我,而我'寻找一个解决他们的解决方案:


  1. 打印到UTF-8终端应该工作(我有 sys.stdout.encoding 设置为 UTF-8 ,但如果其他情况也是最好的)。

  2. 将文件输出到一个文件(以UTF-8编码)应该工作(在这种情况下, sys.stdout.encoding )。

  3. 我的代码许多 __ repr __()函数目前有很多返回... .encode('utf-8'),那很重。有没有什么可靠的和更轻的?

  4. 在某些情况下,我甚至有丑的野兽,如 return('< {}>'。format(repr ).decode('utf-8')))。encode('utf-8'),即对象的表示被解码,放入格式化字符串,然后重新编码。我想避免这种复杂的变革。

为了写简单的 __ repr__,你建议做什么()在这些编码问题上表现得很好的函数?

解决方案

在Python2中, __ repr __ (和 __ str __ )必须返回一个字符串对象,而不是
unicode对象。在Python3中,情况相反, __ repr __ __ str __
必须返回unicode对象,而不是字节(née字符串)对象:

  class Foo(object):
def __repr __(self):
return u '\N {WHITE SMILING FACE}'

class Bar(object):
def __repr __(self):
return u'\N {WHITE SMILING FACE} $($)

repr(Bar())
#☺
repr(Foo())
#UnicodeEncodeError:'ascii'在位置0中编码字符u'\\\☺':序号不在范围内(128)

Python2,你真的没有选择。您必须为
返回值 __ repr __ 选择一个编码。



顺便说一句,您阅读了 PrintFails wiki ?它可能不会直接回答
您的其他问题,但我确实发现有助于了解为什么某些
错误发生。






当从__future__导入unicode_literals 使用时,

  ''{}>'。format(repr(x).decode('utf-8')))。encode('utf-8')

可以更简单地写成

  str('< { }>')。format(repr(x))

假设 str 编码到您的系统上的 utf-8



没有<_ c $ c>从__future__导入unicode_literals ,表达式可以写成:

 '< {}>'。format(repr(x))


Lately, I've had lots of trouble with __repr__(), format(), and encodings. Should the output of __repr__() be encoded or be a unicode string? Is there a best encoding for the result of __repr__() in Python? What I want to output does have non-ASCII characters.

I use Python 2.x, and want to write code that can easily be adapted to Python 3. The program thus uses

# -*- coding: utf-8 -*-
from __future__ import unicode_literals, print_function  # The 'Hello' literal represents a Unicode object

Here are some additional problems that have been bothering me, and I'm looking for a solution that solves them:

  1. Printing to an UTF-8 terminal should work (I have sys.stdout.encoding set to UTF-8, but it would be best if other cases worked too).
  2. Piping the output to a file (encoded in UTF-8) should work (in this case, sys.stdout.encoding is None).
  3. My code for many __repr__() functions currently has many return ….encode('utf-8'), and that's heavy. Is there anything robust and lighter?
  4. In some cases, I even have ugly beasts like return ('<{}>'.format(repr(x).decode('utf-8'))).encode('utf-8'), i.e., the representation of objects is decoded, put into a formatting string, and then re-encoded. I would like to avoid such convoluted transformations.

What would you recommend to do in order to write simple __repr__() functions that behave nicely with respect to these encoding questions?

解决方案

In Python2, __repr__ (and __str__) must return a string object, not a unicode object. In Python3, the situation is reversed, __repr__ and __str__ must return unicode objects, not byte (née string) objects:

class Foo(object):
    def __repr__(self):
        return u'\N{WHITE SMILING FACE}' 

class Bar(object):
    def __repr__(self):
        return u'\N{WHITE SMILING FACE}'.encode('utf8')

repr(Bar())
# ☺
repr(Foo())
# UnicodeEncodeError: 'ascii' codec can't encode character u'\u263a' in position 0: ordinal not in range(128)

In Python2, you don't really have a choice. You have to pick an encoding for the return value of __repr__.

By the way, have you read the PrintFails wiki? It may not directly answer your other questions, but I did find it helpful in illuminating why certain errors occur.


When using from __future__ import unicode_literals,

'<{}>'.format(repr(x).decode('utf-8'))).encode('utf-8')

can be more simply written as

str('<{}>').format(repr(x))

assuming str encodes to utf-8 on your system.

Without from __future__ import unicode_literals, the expression can be written as:

'<{}>'.format(repr(x))

这篇关于__repr __()函数的最佳输出类型和编码实践?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆