Python overload primitives


Question

I'm trying to overload some methods of the string builtin. I know there is no really legitimate use-case for this, but the behavior still bugs me so I would like to get an explanation of what is happening here:

Using Python2, and the forbiddenfruit module.

>>> from forbiddenfruit import curse
>>> curse(str, '__repr__', lambda self:'bar')
>>> 'foo'
'foo'
>>> 'foo'.__repr__()
'bar'

As you can see, the __repr__ function has been successfully overloaded, but it isn't actually called when we ask for a representation. Why is that?

Then, how would you get the expected behaviour:

>>> 'foo'
'bar'

There is no constraint about setting up a custom environment. If rebuilding Python is what it takes, so be it, but I really don't know where to start, and I still hope there is an easier way :)

Recommended answer

The first thing to note is that whatever forbiddenfruit is doing, it's not affecting repr at all. This isn't a special case for str, it just doesn't work like that:

import forbiddenfruit

class X:
    repr = None

repr(X())
#>>> '<X object at 0x7f907acf4c18>'

forbiddenfruit.curse(X, "__repr__", lambda self: "I am X")

repr(X())
#>>> '<X object at 0x7f907acf4c50>'

X().__repr__()
#>>> 'I am X'

# A normal attribute assignment on the class makes CPython update the type's slots
X.__repr__ = X.__repr__

repr(X())
#>>> 'I am X'

I recently found a much simpler way of doing what forbiddenfruit does thanks to a post by HYRY:

import gc

underlying_dict = gc.get_referents(str.__dict__)[0]
underlying_dict["__repr__"] = lambda self: print("I am a str!")

"hello".__repr__()
#>>> I am a str!

repr("hello")
#>>> "'hello'"

So we know, somewhat anticlimactically, that something else is going on.

Here's the source for builtin_repr:

static PyObject *
builtin_repr(PyModuleDef *module, PyObject *obj)
/*[clinic end generated code: output=988980120f39e2fa input=a2bca0f38a5a924d]*/
{
    return PyObject_Repr(obj);
}

And for PyObject_Repr (sections elided):

PyObject *
PyObject_Repr(PyObject *v)
{
    PyObject *res;

    /* ... */

    res = (*v->ob_type->tp_repr)(v);
    if (res == NULL)
        return NULL;

    /* ... */
}

The important point is that instead of looking up in a dict, it looks up the "cached" tp_repr attribute.
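The same lookup rule can be observed from pure Python, without touching any built-in type. The following is only a minimal sketch (the Point class and the strings it returns are made up for illustration): an attribute stored on the instance changes an explicit p.__repr__() call but not repr(p), because repr() asks the type rather than doing an ordinary attribute lookup.

```python
class Point:
    def __repr__(self):
        return "Point()"


p = Point()
# Stored in the instance's own dict; the type is untouched.
p.__repr__ = lambda: "patched on the instance"

explicit = p.__repr__()  # ordinary attribute lookup finds the instance attribute
implicit = repr(p)       # repr() goes through type(p), so the class method runs

# A normal assignment on the class is seen by repr() right away.
Point.__repr__ = lambda self: "patched on the class"
after_class_patch = repr(p)

print(explicit, implicit, after_class_patch, sep="\n")
```

This is an analogy rather than the exact forbiddenfruit situation, but it shows the same split between "what attribute lookup finds" and "what repr() calls".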

This is the function that runs when setting an attribute with something like TYPE.__repr__ = new_repr:

static int
type_setattro(PyTypeObject *type, PyObject *name, PyObject *value)
{
    if (!(type->tp_flags & Py_TPFLAGS_HEAPTYPE)) {
        PyErr_Format(
            PyExc_TypeError,
            "can't set attributes of built-in/extension type '%s'",
            type->tp_name);
        return -1;
    }
    if (PyObject_GenericSetAttr((PyObject *)type, name, value) < 0)
        return -1;
    return update_slot(type, name);
}

The first part is the thing preventing you from modifying built-in types. Then it sets the attribute generically (PyObject_GenericSetAttr) and, crucially, updates the slots.
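Both halves of type_setattro are visible from Python. A minimal sketch, assuming CPython 3 (MyStr and the helper can_patch_repr are hypothetical names chosen for this example): assigning to a built-in type is refused with a TypeError, while a heap type (any class created with a class statement) accepts the assignment and repr() picks it up immediately.

```python
def can_patch_repr(cls):
    """Try to assign __repr__ on cls; report whether the type allowed it."""
    try:
        cls.__repr__ = lambda self: "bar"
    except TypeError:
        # Built-in/extension types reject attribute assignment.
        return False
    return True


class MyStr(str):
    """A heap type: created at runtime by a class statement."""


builtin_patched = can_patch_repr(str)
heap_patched = can_patch_repr(MyStr)

plain = repr("foo")            # str itself is unchanged
wrapped = repr(MyStr("foo"))   # the subclass uses the new __repr__

print(builtin_patched, heap_patched, plain, wrapped)
```

If a str subclass is acceptable for your use case, this is the supported route; it does not change how string literals are displayed.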

If you're interested in how that works, see update_slot in CPython's Objects/typeobject.c. The crucial points are:

  • It's not an exported function and

  • It modifies the PyTypeObject instance itself

so replicating it would require hacking into the PyTypeObject type itself.

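If you want to do that, probably the easiest thing to try is to (temporarily?) set the Py_TPFLAGS_HEAPTYPE bit in type->tp_flags on the str class. That would allow the attribute to be set normally. Of course, there is no guarantee that this won't crash your interpreter.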

This is not what I want to do (especially not through ctypes) unless I really have to, so I offer you a shortcut.

You write:

Then, how would you get the expected behaviour:

>>> 'foo'
'bar'

This is actually quite easy using sys.displayhook:

sys.displayhook is called on the result of evaluating an expression entered in an interactive Python session. The display of these values can be customized by assigning another one-argument function to sys.displayhook.

Here's an example:

import sys

old_displayhook = sys.displayhook
def displayhook(object):
    if type(object) is str:
        old_displayhook('bar')
    else:
        old_displayhook(object)

sys.displayhook = displayhook

And then... (!)

'foo'
#>>> 'bar'

123
#>>> 123
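The hook above stays installed for the rest of the session. If that is not wanted, one option is to scope it. The following is a rough sketch, not part of the original answer: str_display_as is a hypothetical helper name, and the hook is called by hand with stdout captured so the effect can be checked outside an interactive prompt.

```python
import contextlib
import io
import sys


@contextlib.contextmanager
def str_display_as(replacement):
    """Temporarily display `replacement` in place of any str result."""
    previous = sys.displayhook

    def hook(value):
        # Only exact str instances are swapped; everything else is untouched.
        previous(replacement if type(value) is str else value)

    sys.displayhook = hook
    try:
        yield
    finally:
        # Always put the earlier hook back, even if an error occurred.
        sys.displayhook = previous


hook_before = sys.displayhook
buffer = io.StringIO()
with str_display_as("bar"), contextlib.redirect_stdout(buffer):
    sys.displayhook("foo")  # what the REPL would do for the expression 'foo'
    sys.displayhook(123)
hook_after = sys.displayhook
shown = buffer.getvalue()
```

Like the original snippet, this only changes what the interactive prompt echoes; repr("foo") and print(repr("foo")) are unaffected.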


On the philosophical point of why repr would be cached like this, first consider:

1 + 1

It would be a pain if this had to look up __add__ in a dictionary before every call (CPython is slow as it is), so CPython caches lookups for the standard dunder (double underscore) methods. __repr__ is one of those, even if it less commonly needs the lookup optimized. The cache still helps keep formatting ('%s' % s) fast.
