How to type a generator function in Cython?

Question

If I have a generator function in Python, say:

def gen(x):
    for i in range(x):
        yield i ** 2

How do I declare in Cython that the output data type is int? Is it even worthwhile?

Thanks.

I have read in the changelog that (async) generators are implemented: http://cython.readthedocs.io/en/latest/src/changes.html?highlight=generators#id23

However, there is no documentation about how to use them. Is that because they are supported but there is no particular advantage in using them with Cython, or because no optimization is possible?

Recommended answer

No, there is no way to do this in Cython.

When you look at the C code Cython produces, you will see that gen (like any other generator function) returns a generator, which is basically a __pyx_CoroutineObject that looks as follows:

typedef PyObject *(*__pyx_coroutine_body_t)(PyObject *, PyThreadState *, PyObject *);
typedef struct {
    PyObject_HEAD
    __pyx_coroutine_body_t body;
    PyObject *closure;
    ...
    int resume_label;
    char is_running;
} __pyx_CoroutineObject;

The most important part is the body member: this is the function that does the actual calculation. As its signature shows, it returns a PyObject *, and there is no way (yet?) to adapt it to return int, double or similar.
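To make the role of body and resume_label more concrete, here is a loose pure-Python analogy of the struct above. It is only a sketch and not Cython's real implementation: the names CoroutineSketch and gen_body are made up for illustration, and the closure is modelled as a plain dict. The point it tries to show is that every resumption goes through one body function whose return value is a generic Python object.

```python
class CoroutineSketch:
    """Loose analogy of __pyx_CoroutineObject (hypothetical, for illustration)."""

    def __init__(self, body, closure):
        self.body = body            # function that does the actual calculation
        self.closure = closure      # saved local variables of the generator
        self.resume_label = 0       # where to continue on the next call
        self.is_running = False     # guards against re-entrant calls

    def __iter__(self):
        return self

    def __next__(self):
        if self.is_running:
            raise ValueError("generator already executing")
        self.is_running = True
        try:
            # The body always hands back a generic Python object,
            # never a raw C int or double.
            return self.body(self)
        finally:
            self.is_running = False


def gen_body(co):
    """Hand-written state machine for: for i in range(x): yield i ** 2"""
    if co.resume_label == 0:
        co.closure["i"] = 0         # first entry: initialise the loop variable
        co.resume_label = 1
    else:
        co.closure["i"] += 1        # later entries: continue after the yield
    if co.closure["i"] >= co.closure["x"]:
        raise StopIteration
    return co.closure["i"] ** 2


def gen(x):
    return CoroutineSketch(gen_body, {"x": x})
```

Iterating over gen(4) in this sketch goes through __next__ and gen_body once per value, which is the per-item overhead the rest of the answer is concerned with.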

As for why this is not done, I can only speculate, but there is probably more than one reason.

If you really care about performance, generators introduce too much overhead anyway (for example, yield is not possible in cdef functions), and the code should be refactored into something simpler.

To elaborate on possible refactorings: as a baseline, let's assume we would like to sum up all the generated values:

%%cython
def gen(int x):
    cdef int i
    for i in range(x):
        yield i ** 2

def sum_it(int n):
    cdef int i
    cdef int res = 0
    # each value arrives as a Python object and is converted back to a C int
    for i in gen(n):
        res += i
    return res

Timing it gives:

>>> %timeit sum_it(1000)
28.9 µs ± 1.06 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
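To put this number in context, the same two functions can be timed as plain Python with the standard timeit module. The sketch below is only one way to do that outside IPython; the names gen_py, sum_it_py and time_per_call are hypothetical helpers, and the result will vary by machine, so no expected timing is shown.

```python
import timeit


def gen_py(x):
    # pure-Python counterpart of gen
    for i in range(x):
        yield i ** 2


def sum_it_py(n):
    # pure-Python counterpart of sum_it
    res = 0
    for v in gen_py(n):
        res += v
    return res


def time_per_call(func, arg, number=1000):
    """Return the average wall-clock time of func(arg) in seconds."""
    return timeit.timeit(lambda: func(arg), number=number) / number


if __name__ == "__main__":
    print(f"{time_per_call(sum_it_py, 1000) * 1e6:.1f} µs per call")
```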

The good news: the Cython version is about 10 times faster than the pure Python version. But if we are really after speed:

%%cython
cdef int gen_fast(int i):
    # plain C function: takes and returns a C int, no Python objects involved
    return i ** 2

def sum_it_fast(int n):
    cdef int i
    cdef int res = 0
    for i in range(n):
        res += gen_fast(i)
    return res

This gives:

>>> %timeit sum_it_fast(1000)
661 ns ± 20.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

That is roughly 45 times faster (28.9 µs vs. 661 ns).

I understand that this is quite a change and might be pretty hard to do. I would do it only if it really is the bottleneck of my program, but then a speed-up of this size would be a real motivation to do it.

Obviously there are a lot of other approaches: using numpy arrays or array.array instead of generators, or writing a custom generator (a cdef class) that offers an additional fast path for getting the values as C ints rather than PyObjects. Which one fits depends on your scenario. I just wanted to show that there is potential to improve performance by ditching generators.
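As a rough illustration of the array.array alternative mentioned above, the values can be materialised once into a typed buffer instead of being yielded one by one. This is a minimal plain-Python sketch with hypothetical names (squares_array, sum_squares); the typecode "q" (signed 64-bit integer) is an assumption chosen for the example.

```python
from array import array


def squares_array(n):
    """Return the first n squares stored as C long long values."""
    return array("q", (i * i for i in range(n)))


def sum_squares(n):
    # In plain Python each element is boxed again when read,
    # so this shows the data layout, not the speed-up.
    return sum(squares_array(n))
```

The benefit would only show up on the Cython side: array.array supports the buffer protocol, so a function taking a typed memoryview argument such as long long[:] should be able to loop over the values as C integers without creating a Python object per item. The trade-off compared with a generator is that all n values are held in memory at once.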
