Why do Cython embedded plugins have higher performance in the CPython interpreter than Rust C-interface versions?

Question

I would like to ask about the underlying principles of the Python interpreter, because I didn't find much useful information in my own searching.

I've been using Rust to write Python plugins lately. This gives a significant speedup for Python's CPU-intensive tasks, and it's also faster to write than C. However, it has one disadvantage: compared to my old approach of accelerating with Cython, the call overhead of Rust (I'm using PyO3) seems to be greater than that of C (I'm using Cython).

For example, take this empty Python function:

def empty_function():
    return 0

Call it a million times in Python with a for loop and time it, and each call turns out to take about 70 nanoseconds (on my PC).
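For anyone who wants to reproduce this kind of measurement, here is a minimal sketch using the standard-library `timeit` module. The helper name `ns_per_call` and its defaults are my own choices for illustration, not the exact script used for the numbers above. Note that passing a callable to `timeit` adds one extra layer of call overhead, so absolute figures will differ somewhat from a hand-written for loop.

```python
import timeit


def empty_function():
    return 0


def ns_per_call(func, number=1_000_000, repeat=5):
    """Return the best observed time per call of func, in nanoseconds.

    ns_per_call is a hypothetical helper, not part of any library.
    """
    # timeit.repeat returns one total time (in seconds) per run;
    # taking the minimum reduces noise from other processes.
    best_total_seconds = min(timeit.repeat(func, number=number, repeat=repeat))
    return best_total_seconds / number * 1e9


if __name__ == "__main__":
    print(f"{ns_per_call(empty_function):.1f} ns per call")
```

The same helper can be pointed at the compiled versions by importing `empty_function` from the built extension module instead of defining it in Python.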

If we compile it as a Cython plugin, with the same source code:

# test.pyx
cpdef unsigned int empty_function():
    return 0

The execution time drops to 40 nanoseconds. This means we can use Cython for fine-grained embedding and expect it to always run faster than native Python.

However, when it comes to Rust (honestly speaking, I now prefer Rust to Cython for plugin development, because there's no need for weird syntax hacks), the call time increases to 140 nanoseconds, about twice as much as native Python. The source code is as follows:

use pyo3::prelude::*;
use pyo3::wrap_pyfunction;

#[pyfunction]
fn empty_function() -> usize {
    0
}

#[pymodule]
fn testlib(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(empty_function, m)?)?;
    Ok(())
}

This suggests that Rust is not well suited to fine-grained embedded replacement of Python code. If a task is called only a few times and each call takes a long time, Rust is a perfect fit. However, if a task is called very frequently, Rust seems unsuitable, because the type-conversion overhead will eat up most of the speedup.
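To make the trade-off concrete, here is a rough back-of-the-envelope model, not a measurement. It assumes the cost of one call is a fixed call overhead plus the time spent in the function body, and that the compiled body runs `speedup` times faster than the Python body. The 70 ns and 140 ns figures are the ones measured above; the 20x speedup and the function name `break_even_body_ns` are made up for illustration.

```python
def break_even_body_ns(py_call_ns, ext_call_ns, speedup):
    """Python body time (ns) above which the extension wins overall.

    Assumed model (a simplification):
        python cost    = py_call_ns  + body
        extension cost = ext_call_ns + body / speedup
    Setting the two equal and solving for body gives the threshold.
    """
    if speedup <= 1:
        raise ValueError("speedup must be greater than 1")
    return (ext_call_ns - py_call_ns) / (1 - 1 / speedup)


# 70 ns and 140 ns are the per-call figures from the question;
# the 20x body speedup is a hypothetical value.
threshold = break_even_body_ns(py_call_ns=70, ext_call_ns=140, speedup=20)
print(f"break-even body time: {threshold:.1f} ns")
```

Under these assumptions, a function whose Python body takes less than roughly 74 ns per call would not benefit from being moved to the Rust version, which is the fine-grained case described above.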

I want to know whether this can be solved and, more importantly, what the underlying reason for this discrepancy is. Does the CPython interpreter treat the two differently when calling them, similar to the difference between CPython and PyPy when calling C plugins? Where can I find further information? Thanks.

===

Update:

Sorry guys, I didn't anticipate that my question would be ambiguous. After all, the source code for all three versions has been given, and using timeit to measure function runtimes is almost a convention in Python development.

My test code is nearly identical to @Jmb's code in the comments, with one subtle difference: I build with python setup.py build_ext --inplace instead of calling gcc directly, but that should not make any difference. Anyway, thanks for the additional details.

Recommended answer

As suggested in the comments, this is a self-answer.

Since the discussion in the comments section did not lead to a clear conclusion, I raised an issue in PyO3's repo and got a response from its main maintainer.

In short, the conclusion is that there is no fundamental difference in how CPython calls plugins compiled by PyO3 and those compiled by Cython. The current speed difference comes from the different depth of optimization.

Here is the link to the issue: https://github.com/PyO3/pyo3/issues/1470
