Python overload primitives
Question
I'm trying to overload some methods of the string builtin. I know there is no really legitimate use-case for this, but the behavior still bugs me so I would like to get an explanation of what is happening here:
Using Python 2 and the forbiddenfruit module.
>>> from forbiddenfruit import curse
>>> curse(str, '__repr__', lambda self:'bar')
>>> 'foo'
'foo'
>>> 'foo'.__repr__()
'bar'
As you can see, the __repr__ function has been successfully overloaded, but it isn't actually called when we ask for a representation. Why is that?
Then, how would you get the expected behaviour:
>>> 'foo'
'bar'
There is no constraint about setting up a custom environment; if rebuilding Python is what it takes, so be it. But I really don't know where to start, and I still hope there is an easier way :)
Recommended answer
The first thing to note is that whatever forbiddenfruit is doing, it's not affecting repr at all. This isn't a special case for str; it just doesn't work like that:
import forbiddenfruit
class X:
    repr = None
repr(X())
#>>> '<X object at 0x7f907acf4c18>'
forbiddenfruit.curse(X, "__repr__", lambda self: "I am X")
repr(X())
#>>> '<X object at 0x7f907acf4c50>'
X().__repr__()
#>>> 'I am X'
X.__repr__ = X.__repr__
repr(X())
#>>> 'I am X'
I recently found a much simpler way of doing what forbiddenfruit does, thanks to a post by HYRY:
import gc
underlying_dict = gc.get_referents(str.__dict__)[0]
underlying_dict["__repr__"] = lambda self: print("I am a str!")
"hello".__repr__()
#>>> I am a str!
repr("hello")
#>>> "'hello'"
So we know, somewhat anticlimactically, that something else is going on.
Here's the source for builtin_repr:
builtin_repr(PyModuleDef *module, PyObject *obj)
/*[clinic end generated code: output=988980120f39e2fa input=a2bca0f38a5a924d]*/
{
    return PyObject_Repr(obj);
}
And for PyObject_Repr (sections elided):
PyObject *
PyObject_Repr(PyObject *v)
{
    PyObject *res;
    res = (*v->ob_type->tp_repr)(v);
    if (res == NULL)
        return NULL;
}
The important point is that instead of looking up in a dict, it looks up the "cached" tp_repr attribute.
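A minimal sketch of what this means in practice, using a hypothetical class Point that is not part of the original post: repr() goes through the type's slot, so a __repr__ stored on the instance is ignored, even though an explicit method call finds it.

```python
class Point:
    """Hypothetical class used only for illustration."""

    def __repr__(self):
        return "type-level repr"


p = Point()

# Store a __repr__ on the instance itself; this only touches p.__dict__.
p.__repr__ = lambda: "instance-level repr"

# Ordinary attribute lookup finds the instance attribute first.
explicit = p.__repr__()

# repr() consults the slot on type(p), so the instance attribute is skipped.
implicit = repr(p)
```

Here explicit and implicit end up different, which mirrors the 'foo' versus 'foo'.__repr__() discrepancy in the question.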
This is what happens when you set an attribute with something like TYPE.__repr__ = new_repr:
static int
type_setattro(PyTypeObject *type, PyObject *name, PyObject *value)
{
    if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE)) {
        PyErr_Format(
            PyExc_TypeError,
            "can't set attributes of built-in/extension type '%s'",
            type->tp_name);
        return -1;
    }
    if (PyObject_GenericSetAttr((PyObject *)type, name, value) < 0)
        return -1;
    return update_slot(type, name);
}
The first part is the thing preventing you from modifying built-in types. Then it sets the attribute generically (PyObject_GenericSetAttr) and, crucially, updates the slots.
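Both branches can be observed from Python. The following is a sketch under the assumption of a stock CPython 3 interpreter; the class Heap is a hypothetical stand-in for any user-defined type, and the exact wording of the error message varies between versions, so only the exception type is relied on.

```python
class Heap:
    """Hypothetical user-defined (heap) type."""


# Built-in types are rejected before the attribute is ever stored.
try:
    str.__repr__ = lambda self: "bar"
    builtin_error = None
except TypeError as exc:
    builtin_error = exc

# For a heap type the assignment goes through and the slot is refreshed,
# so repr() sees the new function immediately.
before = repr(Heap())
Heap.__repr__ = lambda self: "I am Heap"
after = repr(Heap())
```

This matches the earlier X example, where the plain assignment X.__repr__ = X.__repr__ was what finally made repr() pick up the cursed method.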
If you're interested in how that works, it's available here. The crucial points are:
- It's not an exported function, and
- it modifies the PyTypeObject instance itself,
so replicating it would require hacking into the PyTypeObject type itself.
If you wish to do so, probably the easiest thing to try would be to (temporarily?) set the Py_TPFLAGS_HEAPTYPE bit in type->tp_flags on the str class. This would allow setting the attribute normally. Of course, there are no guarantees that this won't crash your interpreter.
This is not what I want to do (especially not through ctypes) unless I really have to, so I offer you a shortcut.
You write:
Then, how would you get the expected behaviour:
>>> 'foo'
'bar'
This is actually quite easy using sys.displayhook:
sys.displayhook is called on the result of evaluating an expression entered in an interactive Python session. The display of these values can be customized by assigning another one-argument function to sys.displayhook.
Here's an example:
import sys
old_displayhook = sys.displayhook
def displayhook(object):
    if type(object) is str:
        old_displayhook('bar')
    else:
        old_displayhook(object)
sys.displayhook = displayhook
And then... (!)
'foo'
#>>> 'bar'
123
#>>> 123
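The same kind of hook can also be checked outside an interactive session by calling it directly and capturing what it prints. This is only a sketch: bar_displayhook and rendered are hypothetical names introduced here, and the hook delegates to sys.__displayhook__, the interpreter's original display hook.

```python
import contextlib
import io
import sys


def bar_displayhook(value):
    """Show every str as 'bar'; defer to the default hook otherwise."""
    # Note: the default hook also stores the displayed value in builtins._
    sys.__displayhook__("bar" if type(value) is str else value)


def rendered(hook, value):
    """Return the text that `hook` writes to stdout for `value`."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        hook(value)
    return buffer.getvalue()


shown_str = rendered(bar_displayhook, "foo")
shown_int = rendered(bar_displayhook, 123)
```

Assigning bar_displayhook to sys.displayhook would then have the same visible effect at the prompt as the example above, and assigning sys.__displayhook__ back restores the default.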
On the philosophical point of why repr would be cached like this, first consider:
1 + 1
It would be a pain if this had to look up __add__ in a dictionary before calling. CPython is slow as it is, so CPython decided to cache lookups to standard dunder (double underscore) methods. __repr__ is one of those, even if it is less common to need the lookup optimized. This is still useful to keep formatting ('%s'%s) fast.