What are the differences between a cpdef and a cdef wrapped in a def?


Question

In the Cython docs there is an example that gives two ways of writing a C/Python hybrid method. An explicit one, with a cdef for fast C access and a wrapper def for access from Python:

cdef class Rectangle:
    cdef int x0, y0
    cdef int x1, y1
    def __init__(self, int x0, int y0, int x1, int y1):
        self.x0 = x0; self.y0 = y0; self.x1 = x1; self.y1 = y1
    cdef int _area(self):
        cdef int area
        area = (self.x1 - self.x0) * (self.y1 - self.y0)
        if area < 0:
            area = -area
        return area
    def area(self):
        return self._area()

And one using cpdef:

cdef class Rectangle:
    cdef int x0, y0
    cdef int x1, y1
    def __init__(self, int x0, int y0, int x1, int y1):
        self.x0 = x0; self.y0 = y0; self.x1 = x1; self.y1 = y1
    cpdef int area(self):
        cdef int area
        area = (self.x1 - self.x0) * (self.y1 - self.y0)
        if area < 0:
            area = -area
        return area

I was wondering what the practical differences are.

For example, is either method faster/slower when called from C/Python?

Also, when subclassing/overriding, does cpdef offer anything that the other method lacks?

Answer

chrisb's answer gives you all you need to know, but if you are game for gory details...

But first, the takeaways from the lengthy analysis below in a nutshell:


  • For free functions, there is not much difference performance-wise between cpdef and rolling it out with cdef+def. The resulting C code is almost identical.

  • For bound methods, the cpdef approach can be slightly faster in the presence of inheritance hierarchies, but nothing to get too excited about.

  • Using the cpdef syntax has its advantages, as the resulting code is clearer (at least to me) and shorter.

Free functions:

When we define something silly like:

 cpdef do_nothing_cp():
   pass

the following happens:


  1. A fast C function is created (in this case it has the cryptic name __pyx_f_3foo_do_nothing_cp because my extension is called foo, but you only have to look for the f prefix).
  2. A Python function is also created (called __pyx_pf_3foo_2do_nothing_cp, prefix pf). It does not duplicate the code; it calls the fast function somewhere along the way.
  3. A Python wrapper is created, called __pyx_pw_3foo_3do_nothing_cp (prefix pw).
  4. The do_nothing_cp method definition is emitted. This is what the Python wrapper is needed for, and it is the place that stores which function should be called when foo.do_nothing_cp is invoked.

You can see it in the produced C code here:

 static PyMethodDef __pyx_methods[] = {
  {"do_nothing_cp", (PyCFunction)__pyx_pw_3foo_3do_nothing_cp, METH_NOARGS, 0},
  {0, 0, 0, 0}
};

For a cdef function, only the first step happens; for a def function, only steps 2-4.

Now when we load the module foo and invoke foo.do_nothing_cp(), the following happens:


  1. The function pointer bound to the name do_nothing_cp is found, in our case the Python wrapper, i.e. the pw-function.
  2. The pw-function is called via the function pointer and calls the pf-function (as a plain C function).
  3. The pf-function calls the fast f-function.
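The call chain above can be modelled in plain Python. This is only a sketch of the layering, not what Cython actually emits: the three functions and the method_table dict are illustrative stand-ins (assumptions) for the generated f/pf/pw symbols and the PyMethodDef table.

```python
# Pure-Python model of the three layers behind a cpdef function.
# All names here are illustrative stand-ins, not real generated symbols.

calls = []  # records the order in which the layers run


def f_do_nothing_cp():
    """Stand-in for the fast C function (prefix f): does the actual work."""
    calls.append("f")
    return None


def pf_do_nothing_cp():
    """Stand-in for the Python-level function (prefix pf): delegates to f."""
    calls.append("pf")
    return f_do_nothing_cp()


def pw_do_nothing_cp():
    """Stand-in for the Python wrapper (prefix pw): the entry found by name."""
    calls.append("pw")
    return pf_do_nothing_cp()


# Stand-in for the method table: maps the public name to the wrapper.
method_table = {"do_nothing_cp": pw_do_nothing_cp}

# A call from Python: look the name up, then go pw -> pf -> f.
method_table["do_nothing_cp"]()
print(calls)
```

In the real extension the three layers are C functions, and the C compiler is free to inline them into one another.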

What happens if we call do_nothing_cp inside the Cython module?

def call_do_nothing_cp():
    do_nothing_cp()

Clearly, Cython doesn't need the Python machinery to locate the function in this case. It can use the fast f-function directly via a C function call, bypassing the pw- and pf-functions.

What happens if we wrap a cdef function in a def function?

cdef _do_nothing():
   pass

def do_nothing():
  _do_nothing()

Cython does the following:


  1. A fast _do_nothing function is created, corresponding to the f-function above.
  2. A pf-function for do_nothing is created, which calls _do_nothing somewhere along the way.
  3. A Python wrapper, i.e. a pw-function, is created, which wraps the pf-function.
  4. The functionality is bound to foo.do_nothing via a function pointer to the Python wrapper, i.e. the pw-function.

As you can see, there is not much difference from the cpdef approach.

The cdef functions are just simple C functions, but def and cpdef functions are first-class Python functions, so you could do something like this:

foo.do_nothing=foo.do_nothing_cp
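As a rough illustration of that rebinding, here is a sketch that uses a hand-built module object as a stand-in for the compiled extension foo. The two pure-Python functions and their return values are assumptions made up for the demo; a cdef function would have no module attribute to rebind in the first place.

```python
import types

# Hypothetical stand-in for the compiled extension module `foo`.
foo = types.ModuleType("foo")


def do_nothing():
    """Stand-in for the def wrapper around a cdef function."""
    return "def wrapper"


def do_nothing_cp():
    """Stand-in for the cpdef function."""
    return "cpdef"


foo.do_nothing = do_nothing
foo.do_nothing_cp = do_nothing_cp

# Rebind the public name, as in the one-liner above.
foo.do_nothing = foo.do_nothing_cp
result = foo.do_nothing()
print(result)
```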

As to performance, we cannot expect much difference here:

>>> import foo
>>> %timeit foo.do_nothing_cp
51.6 ns ± 0.437 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

>>> %timeit foo.do_nothing
51.8 ns ± 0.369 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
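Outside IPython, a comparable measurement can be sketched with the standard timeit module. The function below is a pure-Python stand-in (an assumption, since the compiled foo is not available here), so the absolute numbers will differ from the ones above. Note that the timed statement needs the call parentheses; without them only the name lookup is measured.

```python
import timeit


def do_nothing():
    """Pure-Python stand-in for foo.do_nothing (assumption for this sketch)."""
    pass


n = 100_000
namespace = {"do_nothing": do_nothing}

# Times the call itself (note the parentheses).
call_s = timeit.timeit("do_nothing()", globals=namespace, number=n)
# Times only the name lookup; no function is invoked.
lookup_s = timeit.timeit("do_nothing", globals=namespace, number=n)

print(f"call:   {call_s / n * 1e9:.1f} ns per loop")
print(f"lookup: {lookup_s / n * 1e9:.1f} ns per loop")
```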

If we look at the resulting machine code (objdump -d foo.so), we can see that the C compiler has inlined all calls for the cpdef version do_nothing_cp:

 0000000000001340 <__pyx_pw_3foo_3do_nothing_cp>:
    1340:   48 8b 05 91 1c 20 00    mov    0x201c91(%rip),%rax      
    1347:   48 83 00 01             addq   $0x1,(%rax)
    134b:   c3                      retq   
    134c:   0f 1f 40 00             nopl   0x0(%rax)

but not for the rolled-out do_nothing (I must confess, I'm a little surprised and don't understand the reasons yet):

0000000000001380 <__pyx_pw_3foo_1do_nothing>:
    1380:   53                      push   %rbx
    1381:   48 8b 1d 50 1c 20 00    mov    0x201c50(%rip),%rbx        # 202fd8 <_DYNAMIC+0x208>
    1388:   48 8b 13                mov    (%rbx),%rdx
    138b:   48 85 d2                test   %rdx,%rdx
    138e:   75 0d                   jne    139d <__pyx_pw_3foo_1do_nothing+0x1d>
    1390:   48 8b 43 08             mov    0x8(%rbx),%rax
    1394:   48 89 df                mov    %rbx,%rdi
    1397:   ff 50 30                callq  *0x30(%rax)
    139a:   48 8b 13                mov    (%rbx),%rdx
    139d:   48 83 c2 01             add    $0x1,%rdx
    13a1:   48 89 d8                mov    %rbx,%rax
    13a4:   48 89 13                mov    %rdx,(%rbx)
    13a7:   5b                      pop    %rbx
    13a8:   c3                      retq   
    13a9:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)

This could explain why the cpdef version is slightly faster, but in any case the difference is nothing compared to the overhead of a Python function call.

Class methods:

The situation is a little more complicated for class methods because of possible polymorphism. Let's start with:

cdef class A:
   cpdef do_nothing_cp(self):
       pass

At first sight, there is not much difference from the case above:


  1. A fast, C-only version of the function (prefix f) is emitted.
  2. A Python version (prefix pf) is emitted, which calls the f-function.
  3. A Python wrapper (prefix pw) wraps the pf-version and is used for registration.
  4. do_nothing_cp is registered as a method of class A via the tp_methods pointer of the PyTypeObject.

As can be seen in the produced C file:

static PyMethodDef __pyx_methods_3foo_A[] = {
      {"do_nothing_cp", (PyCFunction)__pyx_pw_3foo_1A_1do_nothing_cp, METH_NOARGS, 0},
      ...
      {0, 0, 0, 0}
    }; 
.... 
static PyTypeObject __pyx_type_3foo_A = {
 ...
  __pyx_methods_3foo_A, /*tp_methods*/
 ...
};

Clearly, the bound version has to take the implicit parameter self as an additional argument, but there is more to it: the f-function performs a function dispatch if it is not called from the corresponding pf-function. This dispatch looks as follows (I keep only the important parts):

static PyObject *__pyx_f_3foo_1A_do_nothing_cp(CYTHON_UNUSED struct __pyx_obj_3foo_A *__pyx_v_self, int __pyx_skip_dispatch) {

  if (unlikely(__pyx_skip_dispatch)) ; /* __pyx_skip_dispatch=1 if called from the pf-version */
  /* Check if overridden in Python */
  else if (/* look up whether the function is overridden in the __dict__ of the object */) {
     /* use the overridden function */
  }
  /* do the work */
}

Why is it needed? Consider the following extension foo:

cdef class A:
    cpdef do_nothing_cp(self):
        pass

cdef class B(A):
    cpdef call_do_nothing(self):
        self.do_nothing_cp()

What happens when we call B().call_do_nothing()?


  1. B-pw-call_do_nothing is located and called,
  2. which calls B-pf-call_do_nothing,
  3. which calls B-f-call_do_nothing,
  4. which calls A-f-do_nothing_cp, bypassing the pw- and pf-versions.

What happens if we add the following class C, which overrides the do_nothing_cp function?

import foo
class C(foo.B):
    def do_nothing_cp(self):
        print("I do something!")

Now calling C().call_do_nothing() leads to:


  1. call_do_nothing of the C class is located and called, which means pw-call_do_nothing of the B class is located and called,
  2. which calls B-pf-call_do_nothing,
  3. which calls B-f-call_do_nothing,
  4. which calls A-f-do_nothing_cp (as we already know!), bypassing the pw- and pf-versions.

And now, in step 4, we need to dispatch the call in A-f-do_nothing_cp() in order to reach the right C.do_nothing_cp()! Luckily, we have this dispatch in the function at hand.
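The dispatch described above can be modelled in plain Python. This is a simplified sketch, not Cython's actual lookup: the method name _f_do_nothing_cp, the skip_dispatch keyword and the returned strings are assumptions chosen to mirror the f-function and its __pyx_skip_dispatch flag.

```python
class A:
    def _f_do_nothing_cp(self, skip_dispatch=False):
        """Stand-in for the f-function of A.do_nothing_cp."""
        if not skip_dispatch:
            # Check whether a subclass overrides the public method.
            override = getattr(type(self), "do_nothing_cp")
            if override is not A.do_nothing_cp:
                return override(self)
        # No override (or dispatch skipped): do the work.
        return "A does nothing"

    def do_nothing_cp(self):
        """Stand-in for the pw/pf pair: calls f with dispatch switched off."""
        return self._f_do_nothing_cp(skip_dispatch=True)


class B(A):
    def call_do_nothing(self):
        # Inside the module the f-function is called directly.
        return self._f_do_nothing_cp()


class C(B):
    def do_nothing_cp(self):
        return "I do something!"


class D(B):
    def do_nothing_cp(self):
        # Calling up to A must not dispatch back to D again.
        return "D then " + super().do_nothing_cp()


print(B().call_do_nothing())
print(C().call_do_nothing())
print(D().call_do_nothing())
```

The D class hints at why the skip flag exists: when the override calls back into the base implementation through the Python entry point, the dispatch is switched off, so the call does not loop back into the override.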

To make it more complicated: what if the class C were also a cdef class? The dispatch via __dict__ would not work, because cdef classes don't have a __dict__.

For cdef classes, polymorphism is implemented similarly to C++'s "virtual tables", so in B.call_do_nothing() the f-do_nothing_cp function is not called directly but via a pointer, which depends on the class of the object (one can see those "virtual tables" being set up in __pyx_pymod_exec_XXX, e.g. __pyx_vtable_3foo_B.__pyx_base). Thus the __dict__ dispatch in the A-f-do_nothing_cp() function is not needed in the case of a pure cdef hierarchy.
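A minimal Python sketch of the virtual-table idea, under the assumption that each class owns a table of function pointers which a subclass copies and then patches. The VTABLE_* dicts, the Obj class and the returned strings are hypothetical names for illustration, not Cython's generated structures.

```python
# Each "class" owns a table of function pointers, modelled here as a dict.

def a_do_nothing_cp(obj):
    """Stand-in for A's f-function."""
    return "A"


def c_do_nothing_cp(obj):
    """Stand-in for the f-function of a cdef class C that overrides it."""
    return "C"


VTABLE_A = {"do_nothing_cp": a_do_nothing_cp}
VTABLE_B = dict(VTABLE_A)  # B inherits A's slot unchanged
VTABLE_C = dict(VTABLE_B, do_nothing_cp=c_do_nothing_cp)  # C patches the slot


class Obj:
    """An instance carries a reference to the table of its class."""

    def __init__(self, vtab):
        self.vtab = vtab


def b_call_do_nothing(obj):
    # Indirect call through the object's table; no __dict__ lookup needed.
    return obj.vtab["do_nothing_cp"](obj)


print(b_call_do_nothing(Obj(VTABLE_B)))
print(b_call_do_nothing(Obj(VTABLE_C)))
```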

As to performance, comparing cpdef with cdef+def I get:

                          cpdef         def+cdef
 A.do_nothing              107ns         108ns 
 B.call_nothing            109ns         116ns

so the difference isn't that large, with cpdef being, if anything, slightly faster.
