__repr __()函数的最佳输出类型和编码实践? [英] Best output type and encoding practices for __repr__() functions?
问题描述
__ repr __()
, format()
和编码。 如果 __ repr __()
的输出被编码或是unicode字符串?对的结果是否有最佳编码? __repr __()
在Python中?我想输出的内容有非ASCII字符。 我使用Python 2.x,并希望编写可以轻松适应Python 3的代码。程序因此使用<_ p>
# - * - 编码:utf-8 - * -
从__future__导入unicode_literals,print_function# 'Hello'literal表示一个Unicode对象
这里有一些额外的问题,一直困扰着我,而我'寻找一个解决他们的解决方案:
- 打印到UTF-8终端应该工作(我有
sys.stdout.encoding
设置为UTF-8
,但如果其他情况也是最好的)。 - 将文件输出到一个文件(以UTF-8编码)应该工作(在这种情况下,
sys.stdout.encoding
是无
)。 - 我的代码许多
__ repr __()
函数目前有很多返回... .encode('utf-8')
,那很重。有没有什么可靠的和更轻的? - 在某些情况下,我甚至有丑的野兽,如
return('< {}>'。format(repr ).decode('utf-8')))。encode('utf-8')
,即对象的表示被解码,放入格式化字符串,然后重新编码。我想避免这种复杂的变革。
为了写简单的 __ repr__,你建议做什么()
在这些编码问题上表现得很好的函数?
在Python2中, __ repr __
(和 __ str __
)必须返回一个字符串对象,而不是
unicode对象。在Python3中,情况相反, __ repr __
和 __ str __
必须返回unicode对象,而不是字节(née字符串)对象:
class Foo(object):
def __repr __(self):
return u '\N {WHITE SMILING FACE}'
class Bar(object):
def __repr __(self):
return u'\N {WHITE SMILING FACE} $($)
repr(Bar())
#☺
repr(Foo())
#UnicodeEncodeError:'ascii'在位置0中编码字符u'\\\☺':序号不在范围内(128)
Python2,你真的没有选择。您必须为
返回值 __ repr __
选择一个编码。
顺便说一句,您阅读了 PrintFails wiki ?它可能不会直接回答
您的其他问题,但我确实发现有助于了解为什么某些
错误发生。
当从__future__导入unicode_literals 使用时,
''{}>'。format(repr(x).decode('utf-8')))。encode('utf-8')
可以更简单地写成
str('< { }>')。format(repr(x))
假设 str
编码到您的系统上的 utf-8
。
没有<_ c $ c>从__future__导入unicode_literals ,表达式可以写成:
'< {}>'。format(repr(x))
Lately, I've had lots of trouble with __repr__()
, format()
, and encodings. Should the output of __repr__()
be encoded or be a unicode string? Is there a best encoding for the result of __repr__()
in Python? What I want to output does have non-ASCII characters.
I use Python 2.x, and want to write code that can easily be adapted to Python 3. The program thus uses
# -*- coding: utf-8 -*-
from __future__ import unicode_literals, print_function # The 'Hello' literal represents a Unicode object
Here are some additional problems that have been bothering me, and I'm looking for a solution that solves them:
- Printing to an UTF-8 terminal should work (I have
sys.stdout.encoding
set toUTF-8
, but it would be best if other cases worked too). - Piping the output to a file (encoded in UTF-8) should work (in this case,
sys.stdout.encoding
isNone
). - My code for many
__repr__()
functions currently has manyreturn ….encode('utf-8')
, and that's heavy. Is there anything robust and lighter? - In some cases, I even have ugly beasts like
return ('<{}>'.format(repr(x).decode('utf-8'))).encode('utf-8')
, i.e., the representation of objects is decoded, put into a formatting string, and then re-encoded. I would like to avoid such convoluted transformations.
What would you recommend to do in order to write simple __repr__()
functions that behave nicely with respect to these encoding questions?
In Python2, __repr__
(and __str__
) must return a string object, not a
unicode object. In Python3, the situation is reversed, __repr__
and __str__
must return unicode objects, not byte (née string) objects:
class Foo(object):
def __repr__(self):
return u'\N{WHITE SMILING FACE}'
class Bar(object):
def __repr__(self):
return u'\N{WHITE SMILING FACE}'.encode('utf8')
repr(Bar())
# ☺
repr(Foo())
# UnicodeEncodeError: 'ascii' codec can't encode character u'\u263a' in position 0: ordinal not in range(128)
In Python2, you don't really have a choice. You have to pick an encoding for the
return value of __repr__
.
By the way, have you read the PrintFails wiki? It may not directly answer your other questions, but I did find it helpful in illuminating why certain errors occur.
When using from __future__ import unicode_literals
,
'<{}>'.format(repr(x).decode('utf-8'))).encode('utf-8')
can be more simply written as
str('<{}>').format(repr(x))
assuming str
encodes to utf-8
on your system.
Without from __future__ import unicode_literals
, the expression can be written as:
'<{}>'.format(repr(x))
这篇关于__repr __()函数的最佳输出类型和编码实践?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!