Why is my Python/NumPy example faster than a pure C implementation?

Question

I have pretty much the same code in Python and C. Python example:

import numpy
nbr_values = 8192
n_iter = 100000

a = numpy.ones(nbr_values).astype(numpy.float32)
for i in range(n_iter):
    a = numpy.sin(a)

C example:

#include <stdio.h>
#include <math.h>
int main(void)
{
  int i, j;
  int nbr_values = 8192;
  int n_iter = 100000;
  double x;  
  for (j = 0; j < nbr_values; j++){
    x = 1;
    for (i=0; i<n_iter; i++)
    x = sin(x);
  }
  return 0;
}

Something strange happened when I ran both examples:

$ time python numpy_test.py 
real    0m5.967s
user    0m5.932s
sys     0m0.012s

$ g++ sin.c
$ time ./a.out 
real    0m13.371s
user    0m13.301s
sys     0m0.008s

It looks like Python/NumPy is twice as fast as C. Is there a mistake in the experiment above? How can you explain it?

P.S. I am running Ubuntu 12.04 with 8 GB of RAM and a Core i5, by the way.

Answer

First, turn on optimization. Secondly, subtleties matter. Your C code is definitely not 'basically the same'.

Here is equivalent C code:

sinary2.c:

#include <math.h>
#include <stdlib.h>

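/* Returns a newly allocated array holding sin() of each input element.
   The caller owns the result and must free() it. */
float *sin_array(const float *input, size_t elements)
{
    size_t i;  /* same type as elements, avoids a signed/unsigned comparison */
    float *output = malloc(sizeof(float) * elements);
    for (i = 0; i < elements; ++i) {
        output[i] = sin(input[i]);
    }
    return output;
}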

sinary.c:

#include <math.h>
#include <stdlib.h>

extern float *sin_array(const float *input, size_t elements);

int main(void)
{
    int i;
    int nbr_values = 8192;
    int n_iter = 100000;
    float *x = malloc(sizeof(float) * nbr_values);  
    for (i = 0; i < nbr_values; ++i) {
        x[i] = 1;
    }
    for (i=0; i<n_iter; i++) {
        float *newary = sin_array(x, nbr_values);
        free(x);
        x = newary;
    }
    return 0;
}

Results:

$ time python foo.py 

real    0m5.986s
user    0m5.783s
sys     0m0.050s
$ gcc -O3 -ffast-math sinary.c sinary2.c -lm
$ time ./a.out 

real    0m5.204s
user    0m4.995s
sys     0m0.208s

The reason the program has to be split in two is to fool the optimizer a bit. Otherwise it will realize that the whole loop has no effect at all and optimize it out. Putting things in two files doesn't give the compiler visibility into the possible side-effects of sin_array when it's compiling main and so it has to assume that it actually has some and repeatedly call it.

Your original program is not at all equivalent for several reasons. One is that you have nested loops in the C version and you don't in Python. Another is that you are working with arrays of values in the Python version and not in the C version. Another is that you are creating and discarding arrays in the Python version and not in the C version. And lastly you are using float in the Python version and double in the C version.
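
To make those structural differences concrete, here is a minimal pure-Python sketch of what the NumPy loop does on each iteration. It is only an illustration: the standard-library array module stands in for NumPy so that the sketch has no dependencies, sin_pass is a hypothetical helper name, and n_iter is scaled down so it finishes quickly.

```python
import math
from array import array


def sin_pass(values):
    """Return a NEW single-precision array holding sin() of every element."""
    # Type code 'f' is a C float (like numpy.float32); 'd' would be a C double.
    return array('f', (math.sin(v) for v in values))


nbr_values = 8192
n_iter = 10  # scaled down from 100000, for illustration only

a = array('f', [1.0] * nbr_values)
for _ in range(n_iter):
    # Every pass allocates a fresh array and drops the old one, mirroring
    # `a = numpy.sin(a)` and the malloc/free pair in sinary.c.
    a = sin_pass(a)

# Size of the buffer that is created and discarded on each pass.
bytes_per_array = a.itemsize * len(a)
print(bytes_per_array)
```

The loop in the question's C program has none of this: it keeps a single double in a local variable and never touches an array.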

Simply calling the sin function the appropriate number of times does not make for an equivalent test.
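
As a rough sanity check, the two original programs really do evaluate sin the same number of times. The short calculation below uses only the "real" timings quoted in the question; ns_per_call is a hypothetical helper name. It shows the average cost per evaluation, which is where the two programs differ.

```python
nbr_values = 8192
n_iter = 100000
total_calls = nbr_values * n_iter  # sin evaluations in either program

# Wall-clock ("real") times quoted in the question, in seconds.
timings = {
    "python/numpy": 5.967,
    "C, no optimization flags": 13.371,
}


def ns_per_call(seconds, calls):
    """Average cost of one sin evaluation, in nanoseconds."""
    return seconds / calls * 1e9


for label, seconds in timings.items():
    print(f"{label}: {ns_per_call(seconds, total_calls):.2f} ns per call")
```

Since the call count is identical, the gap has to come from how each call is made (data type, memory traffic, compiler flags), not from how many calls there are.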

Also, the optimizer is a really big deal for C. Comparing C code on which the optimizer hasn't been used to anything else when you're wondering about a speed comparison is the wrong thing to do. Of course, you also need to be mindful. The C optimizer is very sophisticated and if you're testing something that really doesn't do anything, the C optimizer might well notice this fact and simply not do anything at all, resulting in a program that's ridiculously fast.
