Why does PyCXX handle new-style classes in the way it does?

Question

I'm picking apart some C++ Python wrapper code that allows the consumer to construct custom old style and new style Python classes from C++.

The original code comes from PyCXX, with old and new style classes here and here. I have however rewritten the code substantially, and in this question I will reference my own code, as it allows me to present the situation in the greatest clarity that I am able. I think there would be very few individuals capable of understanding the original code without several days of scrutiny... For me it has taken weeks and I'm still not clear on it.

The old style simply derives from PyObject,

template<typename FinalClass>
class ExtObj_old : public ExtObjBase<FinalClass>
   // ^ which : ExtObjBase_noTemplate : PyObject    
{
public:
    // forwarding function to mitigate awkwardness retrieving static method 
    // from base type that is incomplete due to templating
    static TypeObject& typeobject() { return ExtObjBase<FinalClass>::typeobject(); }

    static void one_time_setup()
    {
        typeobject().set_tp_dealloc( [](PyObject* t) { delete (FinalClass*)(t); } );

        typeobject().supportGetattr(); // every object must support getattr

        FinalClass::setup();

        typeobject().readyType();
    }

    // every object needs getattr implemented to support methods
    Object getattr( const char* name ) override { return getattr_methods(name); }
    // ^ MARKER1

protected:
    explicit ExtObj_old()
    {
        PyObject_Init( this, typeobject().type_object() ); // MARKER2
    }



When one_time_setup() is called, it forces (by accessing base class typeobject()) creation of the associated PyTypeObject for this new type.

Later, when an instance is constructed, it uses PyObject_Init.

So far so good.

But the new style class uses much more complicated machinery. I suspect this is related to the fact that new style classes allow derivation.

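And this is my question: why is the new style class handling implemented in the way that it is? Why is it having to create this extra PythonClassInstance structure? Why can't it do things the same way the old-style class handling does, i.e. just typecast from the PyObject base type? And seeing as it doesn't do that, does this mean it is making no use of its PyObject base type?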

This is a huge question, and I will keep amending the post until I'm satisfied it represents the issue well. It isn't a good fit for SO's format, I'm sorry about that. However, some world-class engineers frequent this site (one of my previous questions was answered by the lead developer of GCC for example), and I value the opportunity to appeal to their expertise. So please don't be too hasty to vote to close.

The new style class's one-time setup looks like this:

template<typename FinalClass>
class ExtObj_new : public ExtObjBase<FinalClass>
{
private:
    PythonClassInstance* m_class_instance;
public:
    static void one_time_setup()
    {
        TypeObject& typeobject{ ExtObjBase<FinalClass>::typeobject() };

        // these three functions are listed below
        typeobject.set_tp_new(      extension_object_new );
        typeobject.set_tp_init(     extension_object_init );
        typeobject.set_tp_dealloc(  extension_object_deallocator );

        // this should be named supportInheritance, or supportUseAsBaseType
        // old style class does not allow this
        typeobject.supportClass(); // does: table->tp_flags |= Py_TPFLAGS_BASETYPE

        typeobject.supportGetattro(); // always support get and set attr
        typeobject.supportSetattro();

        FinalClass::setup();

        // add our methods to the extension type's method table
        { ... typeobject.set_methods( /* ... */); }

        typeobject.readyType();
    }

protected:
    explicit ExtObj_new( PythonClassInstance* self, Object& args, Object& kwds )
      : m_class_instance{self}
    { }

So the new style uses a custom PythonClassInstance structure:

struct PythonClassInstance
{
    PyObject_HEAD
    ExtObjBase_noTemplate* m_pycxx_object;
};



PyObject_HEAD, if I dig into Python's object.h, is just a macro for PyObject ob_base; -- no further complications, like #if #else. So I don't see why it can't simply be:

struct PythonClassInstance
{
    PyObject ob_base;
    ExtObjBase_noTemplate* m_pycxx_object;
};

or even:

struct PythonClassInstance : PyObject
{
    ExtObjBase_noTemplate* m_pycxx_object;
};

Anyway, it seems that its purpose is to tag a pointer onto the end of a PyObject. This will be because Python runtime will often trigger functions we have placed in its function table, and the first parameter will be the PyObject responsible for the call. So this allows us to retrieve the associated C++ object.

But we also need to do that for the old-style class.

Here is the function responsible for doing that:

ExtObjBase_noTemplate* getExtObjBase( PyObject* pyob )
{
    if( pyob->ob_type->tp_flags & Py_TPFLAGS_BASETYPE )
    {
        /* 
        New style class uses a PythonClassInstance to tag on an additional 
           pointer onto the end of the PyObject
        The old style class just seems to typecast the pointer back up
           to ExtObjBase_noTemplate

        ExtObjBase_noTemplate does indeed derive from PyObject
        So it should be possible to perform this typecast
        Which begs the question, why on earth does the new style class feel 
          the need to do something different?
        This looks like a really nice way to solve the problem
        */
        PythonClassInstance* instance = reinterpret_cast<PythonClassInstance*>(pyob);
        return instance->m_pycxx_object;
    }
    else
        return static_cast<ExtObjBase_noTemplate*>( pyob );
}

My comment articulates my confusion.
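To make the two recovery paths concrete, here is a minimal sketch of what getExtObjBase is choosing between. It does not use the real Python headers: FakePyObject, FakeExtBase, FakeInstance, recover_old, recover_new and demo_recover are all hypothetical stand-ins invented for this illustration, not CPython or PyCXX names.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the PyObject header. NOT the real CPython definition.
struct FakePyObject {
    std::ptrdiff_t ob_refcnt;
    void*          ob_type;
};

// Stand-in for ExtObjBase_noTemplate: the C++ object itself contains the header.
struct FakeExtBase : FakePyObject {
    virtual ~FakeExtBase() = default;
};

// Stand-in for PythonClassInstance: header first, then a pointer to the C++ object.
struct FakeInstance {
    FakePyObject ob_base;
    FakeExtBase* cxx_object;
};

// Old-style path: the header is a base subobject, so static_cast walks back
// to the enclosing C++ object (adjusting the address if the ABI requires it).
inline FakeExtBase* recover_old(FakePyObject* p) {
    return static_cast<FakeExtBase*>(p);
}

// New-style path: the header is the first member of a standard-layout struct,
// so the pointer can be reinterpreted and the trailing pointer read back.
inline FakeExtBase* recover_new(FakePyObject* p) {
    return reinterpret_cast<FakeInstance*>(p)->cxx_object;
}

// Exercises both paths against the same C++ object.
inline bool demo_recover() {
    FakeExtBase object;
    FakePyObject* seen_by_runtime = &object;   // implicit upcast to the header
    assert(recover_old(seen_by_runtime) == &object);

    FakeInstance instance{ {1, nullptr}, &object };
    assert(recover_new(&instance.ob_base) == &object);
    return true;
}
```

In the first path the C++ object and the header share one allocation; in the second they are two separate allocations linked by the trailing pointer.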

And here, for completeness, is how we insert a lambda trampoline into the PyTypeObject's function pointer table, so that the Python runtime can trigger it:

table->tp_setattro = [] (PyObject* self, PyObject* name, PyObject* val) -> int
{
    try {
        ExtObjBase_noTemplate* p = getExtObjBase( self );

        return p->setattro( Object{name}, Object{val} );
    }
    catch( Py::Exception& ) {
        // indicate error
        return -1;
    }
};

(In this demonstration I'm using tp_setattro; note that there are about 30 other slots, which you can see if you look at the documentation for PyTypeObject.)

(In fact, the major reason for working this way is that we can put try{}catch{} around every trampoline. This saves the consumer from having to code repetitive error trapping.)
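As a rough illustration of that pattern, stripped of anything Python-specific: a captureless lambda converts to a plain function pointer, so it can sit in a C-style slot table while still forwarding to a virtual method and turning exceptions into an error code. FakeHandler, SetattrSlot, make_setattr_slot and demo_trampoline are hypothetical names made up for this sketch.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the C++ base class whose virtuals the slots call.
struct FakeHandler {
    virtual ~FakeHandler() = default;
    virtual int setattro(int value) {
        if (value < 0) throw std::runtime_error("negative value rejected");
        return 0;
    }
};

// A C-style slot: a plain function pointer, like the entries of a type table.
using SetattrSlot = int (*)(FakeHandler*, int);

// The trampoline: forwards to the virtual method and maps any exception
// derived from std::exception to the C-style error return value -1.
inline SetattrSlot make_setattr_slot() {
    return [](FakeHandler* self, int value) -> int {
        try {
            return self->setattro(value);
        }
        catch (const std::exception&) {
            return -1;
        }
    };
}

// Calls the slot once on the success path and once on the throwing path.
inline bool demo_trampoline() {
    FakeHandler handler;
    SetattrSlot slot = make_setattr_slot();
    assert(slot(&handler, 1) == 0);
    assert(slot(&handler, -1) == -1);
    return true;
}
```

The handler's override never has to think about the C calling convention; the single catch in the trampoline covers every derived class.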

So, we pull out the "base type for the associated C++ object" and call its virtual setattro (just using setattro as an example here). A derived class will have overridden setattro, and this override will get called.

The old-style class provides such an override, which I've labelled MARKER1 -- it is in the top listing for this question.

The only thing I can think of is that maybe different maintainers have used different techniques. But is there some more compelling reason why old and new style classes require different architecture?

PS for reference, I should include the following methods from new style class:

    static PyObject* extension_object_new( PyTypeObject* subtype, PyObject* args, PyObject* kwds )
    {
        PyObject* pyob = subtype->tp_alloc(subtype,0);
        PythonClassInstance* o = reinterpret_cast<PythonClassInstance *>( pyob );
        o->m_pycxx_object = nullptr;
        return pyob;
    }

^ To me, this looks absolutely wrong. It appears to be allocating memory, re-casting to some structure that might exceed the amount allocated, and then nulling right at the end of this. I'm surprised it hasn't caused any crashes. I can't see any indication anywhere in the source code that these 4 bytes are owned.

    static int extension_object_init( PyObject* _self, PyObject* _args, PyObject* _kwds )
    {
        try
        {
            Object args{_args};
            Object kwds{_kwds};

            PythonClassInstance* self{ reinterpret_cast<PythonClassInstance*>(_self) };

            if( self->m_pycxx_object )
                self->m_pycxx_object->reinit( args, kwds );
            else
                // NOTE: observe this is where we invoke the constructor, but indirectly (i.e. through final)
                self->m_pycxx_object = new FinalClass{ self, args, kwds };
        }
        catch( Exception & )
        {
            return -1;
        }
        return 0;
    }

^ Note that there is no implementation for reinit, other than the default:

virtual void    reinit ( Object& args  , Object& kwds    ) { 
    throw RuntimeError( "Must not call __init__ twice on this class" ); 
}


    static void extension_object_deallocator( PyObject* _self )
    {
        PythonClassInstance* self{ reinterpret_cast< PythonClassInstance* >(_self) };
        delete self->m_pycxx_object;
        _self->ob_type->tp_free( _self );
    }






I will hazard a guess, thanks to insight from Yhg1s on the IRC channel.

Maybe it is because when you create a new instance of an old-style class, it is guaranteed to overlay a PyObject structure exactly.

Hence it is safe to derive from PyObject, and pass a pointer to the underlying PyObject into Python, which is what the old-style class does (MARKER2)

On the other hand, a new-style class creates a {PyObject + maybe something else} object, i.e. it wouldn't be safe to do the same trick, as the Python runtime would end up writing past the end of the base class allocation (which is only a PyObject).

Because of this, we need to get Python to allocate for the class, and return us a pointer which we store.

Because we are no longer making use of the PyObject base class for this storage, we cannot use the convenient trick of typecasting back to retrieve the associated C++ object. This means we need to tag an extra sizeof(void*) bytes onto the end of the PyObject that actually does get allocated, and use them to point to our associated C++ object instance.

However, there is some contradiction here.

struct PythonClassInstance
{
    PyObject_HEAD
    ExtObjBase_noTemplate* m_pycxx_object;
};

^ If this is indeed the structure that accomplishes the above, then it is saying that the new-style class instance fits exactly over a PyObject, i.e. it does not overlap into m_pycxx_object.

And if this is the case, then surely this whole process is unnecessary.

EDIT: here are some links that are helping me learn the necessary groundwork:

http://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence
http://realmike.org/blog/2010/07/18/introduction-to-new-style-classes-in-python
Create an object using Python's C API: http://stackoverflow.com/questions/4163018/create-an-object-using-pythons-c-api

Answer


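"To me, this looks absolutely wrong. It appears to be allocating memory, re-casting to some structure that might exceed the amount allocated, and then nulling right at the end of this. I'm surprised it hasn't caused any crashes. I can't see any indication anywhere in the source code that these 4 bytes are owned."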

PyCXX does allocate enough memory, but it does so by accident. This appears to be a bug in PyCXX.

The amount of memory Python allocates for the object is determined by the first call to the following static member function of PythonClass<T>:

static PythonType &behaviors()
{
...
    p = new PythonType( sizeof( T ), 0, default_name );
...
}

The constructor of PythonType sets the tp_basicsize of the Python type object to sizeof(T). This way, when Python allocates an object, it knows to allocate at least sizeof(T) bytes. It works because sizeof(T) turns out to be larger than sizeof(PythonClassInstance) (T is derived from PythonClass<T>, which derives from PythonExtensionBase, which is large enough).

However, it misses the point. It should actually allocate only sizeof(PythonClassInstance). This appears to be a bug in PyCXX: it allocates too much, rather than too little, space for storing a PythonClassInstance object.
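The size condition can be sketched without the Python headers. In the sketch below, FakePyObject, FakeInstance, fake_alloc, fits_instance, fake_new and demo_sizes are hypothetical stand-ins; fake_alloc only imitates the behaviour of handing back a zeroed block of the type's basic size. The point is that casting the allocation to the instance struct is in bounds exactly when the basic size is at least sizeof(FakeInstance), and any larger value (such as sizeof(T)) is merely wasteful.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Stand-ins for the PyObject header and for PythonClassInstance.
struct FakePyObject { std::ptrdiff_t ob_refcnt; void* ob_type; };
struct FakeInstance { FakePyObject ob_base; void* cxx_object; };

static_assert(sizeof(FakeInstance) > sizeof(FakePyObject),
              "the trailing pointer needs room beyond the bare header");

// Hypothetical stand-in for the allocator slot: basicsize zeroed bytes.
inline FakePyObject* fake_alloc(std::size_t basicsize) {
    return static_cast<FakePyObject*>(std::calloc(1, basicsize));
}

// The cast in the "new" function is in bounds only if this holds.
constexpr bool fits_instance(std::size_t basicsize) {
    return basicsize >= sizeof(FakeInstance);
}

// Refuses to hand out an instance whose trailing pointer would be out of bounds.
inline FakeInstance* fake_new(std::size_t basicsize) {
    if (!fits_instance(basicsize)) return nullptr;
    FakePyObject* raw = fake_alloc(basicsize);
    if (!raw) return nullptr;
    FakeInstance* instance = reinterpret_cast<FakeInstance*>(raw);
    instance->cxx_object = nullptr;
    return instance;
}

// Over-allocating works; allocating only a bare header is rejected.
inline bool demo_sizes() {
    FakeInstance* roomy = fake_new(sizeof(FakeInstance) + 64);
    assert(roomy != nullptr && roomy->cxx_object == nullptr);
    std::free(roomy);
    assert(fake_new(sizeof(FakePyObject)) == nullptr);
    return true;
}
```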


"And this is my question, why is the new style class handling implemented in the way that it is? Why is it having to create this extra PythonClassInstance structure? Why can't it do things the same way the old-style class handling does?"

Here's my theory why new style classes are different from the old style classes in PyCXX.

Before Python 2.2, where new style classes were introduced, there was no tp_init member in the type object. Instead, you needed to write a factory function that would construct the object. This is how PythonExtension<T> is supposed to work: the factory function converts the Python arguments to C++ arguments, asks Python to allocate the memory and then calls the constructor using placement new.
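A minimal sketch of that factory pattern, with std::malloc standing in for the Python allocator and with invented names throughout (FakePyObject, OldStyleThing, make_thing, destroy_thing, demo_factory are hypothetical and not taken from PyCXX):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Stand-in for the PyObject header. NOT the real CPython definition.
struct FakePyObject { std::ptrdiff_t ob_refcnt; void* ob_type; };

// The C++ object contains the header as a base, old-style.
struct OldStyleThing : FakePyObject {
    int value;
    explicit OldStyleThing(int v) : value{v} {
        ob_refcnt = 1;        // the header is filled in by the constructor
        ob_type   = nullptr;  // a real extension would point this at its type
    }
};

// Factory: the arguments are already converted, the memory comes from the
// "runtime" (malloc here), and the object is built in place.
inline OldStyleThing* make_thing(int v) {
    void* raw = std::malloc(sizeof(OldStyleThing));
    if (!raw) return nullptr;
    return new (raw) OldStyleThing{v};
}

// Mirror image: run the destructor explicitly, then release the raw memory.
inline void destroy_thing(OldStyleThing* thing) {
    thing->~OldStyleThing();
    std::free(thing);
}

// Builds one object through the factory and tears it down again.
inline bool demo_factory() {
    OldStyleThing* thing = make_thing(5);
    assert(thing != nullptr);
    assert(thing->value == 5 && thing->ob_refcnt == 1);
    destroy_thing(thing);
    return true;
}
```

Here the arguments are available at the moment the memory is obtained, so a single constructor call is enough.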

Python 2.2 added the new style classes and the tp_init member. Python first creates the object and then calls the tp_init method. Keeping the old way would have required that the objects would first have a dummy constructor that creates an "empty" object (e.g. initializes all members to null) and then when tp_init is called, would have had an additional initialization stage. This makes the code uglier.

It seems that the author of PyCXX wanted to avoid that. PyCXX works by first creating a dummy PythonClassInstance object and then when tp_init is called, creates the actual PythonClass<T> object using its constructor.
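That two-phase life cycle can be sketched as follows. This is an illustration under stated assumptions rather than PyCXX's code: Widget, FakeInstance and the fake_tp_* functions are hypothetical stand-ins, std::calloc replaces the Python allocator, and a second init call is simply rejected, mirroring the default reinit shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Stand-in for the PyObject header. NOT the real CPython definition.
struct FakePyObject { std::ptrdiff_t ob_refcnt; void* ob_type; };

// Hypothetical user class; it needs its argument at construction time.
struct Widget {
    int value;
    explicit Widget(int v) : value{v} {}
};

// Stand-in for PythonClassInstance: header plus pointer to the C++ object.
struct FakeInstance { FakePyObject ob_base; Widget* cxx_object; };

// Phase 1 ("new"): allocate the dummy shell; no C++ object exists yet.
inline FakeInstance* fake_tp_new() {
    auto* self = static_cast<FakeInstance*>(std::calloc(1, sizeof(FakeInstance)));
    if (self) self->cxx_object = nullptr;
    return self;
}

// Phase 2 ("init"): the arguments are now known, so run the real constructor.
inline int fake_tp_init(FakeInstance* self, int arg) {
    if (self->cxx_object) return -1;   // a second init is rejected
    try {
        self->cxx_object = new Widget{arg};
    }
    catch (...) {
        return -1;
    }
    return 0;
}

// Teardown: delete the C++ object first, then free the shell.
inline void fake_tp_dealloc(FakeInstance* self) {
    delete self->cxx_object;
    std::free(self);
}

// Walks one instance through new, init, a rejected re-init, and dealloc.
inline bool demo_two_phase() {
    FakeInstance* self = fake_tp_new();
    assert(self != nullptr && self->cxx_object == nullptr);
    assert(fake_tp_init(self, 7) == 0);
    assert(self->cxx_object->value == 7);
    assert(fake_tp_init(self, 8) == -1);
    fake_tp_dealloc(self);
    return true;
}
```

Widget never needs a default constructor or an "empty" state, which is the ugliness the separate shell avoids.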


"... it is making no use of its PyObject base type"

This appears to be correct: the PyObject base class does not seem to be used anywhere. All the interesting methods of PythonExtensionBase use the virtual self() method, which returns m_class_instance and completely ignores the PyObject base class.

My guess (only a guess, though) is that PythonClass<T> was added to an existing system, and it seemed easier to just derive from PythonExtensionBase instead of cleaning up the code.
