Why is my Python numpy code faster than C++?


Question

Can someone tell me why this Python numpy code:

import numpy as np
import time

k_max = 40000
N = 10000

data = np.zeros((2,N))
coefs = np.zeros((k_max,2),dtype=float)

t1 = time.time()
for k in xrange(1,k_max+1):
    cos_k = np.cos(k*data[0,:])
    sin_k = np.sin(k*data[0,:])
    coefs[k-1,0] = (data[1,-1]-data[1,0]) + np.sum(data[1,:-1]*(cos_k[:-1] - cos_k[1:]))
    coefs[k-1,1] = np.sum(data[1,:-1]*(sin_k[:-1] - sin_k[1:]))
t2 = time.time()

print('Time:')
print(t2-t1)

is faster than this C++ code:

#include <cstdio>
#include <iostream>
#include <cmath>
#include <time.h>

using namespace std;

// consts
const unsigned int k_max = 40000;
const unsigned int N = 10000;

int main()
{
    time_t start, stop;
    double diff;
    // table with data
    double data1[ N ];
    double data2[ N ];
    // table of results
    double coefs1[ k_max ];
    double coefs2[ k_max ];
    // main loop
    time( & start );
    for( unsigned int j = 1; j<N; j++ )
    {
        for( unsigned int i = 0; i<k_max; i++ )
        {
            coefs1[ i ] += data2[ j-1 ]*(cos((i+1)*data1[ j-1 ]) - cos((i+1)*data1[ j ]));
            coefs2[ i ] += data2[ j-1 ]*(sin((i+1)*data1[ j-1 ]) - sin((i+1)*data1[ j ]));
        }
    }
    // end of main loop
    time( & stop );
    // speed result
    diff = difftime( stop, start );
    cout << "Time: " << diff << " seconds";
    return 0;
}

The first one shows "Time: 8 seconds", while the second shows "Time: 11 seconds".

I know that numpy is written in C, but I would still expect the C++ example to be faster. Am I missing something? Is there a way to improve the C++ code (or the Python one)? Thank you in advance for your help!

I have changed the C++ code (dynamic tables to static tables) as suggested in one of the comments. The C++ code is faster now, but still much slower than the Python version.

I have changed from debug to release mode and increased 'k' from 4000 to 40000. Now numpy is just slightly faster (8 seconds versus 11 seconds).

Recommended answer

I found this question interesting, because every time I encountered a similar topic about the speed of numpy (compared to C/C++), there was always an answer like "it's a thin wrapper, its core is written in C, so it's fast". But this doesn't explain why C should be slower than C with an additional layer (even a thin one).

The answer is: your C++ code is not slower than your Python code when it is compiled properly.

I've done some benchmarks, and at first it seemed that numpy was surprisingly faster. But I had forgotten to turn on optimization when compiling with gcc.

I've computed everything again and also compared the results with a pure C version of your code. I am using gcc version 4.9.2 and python2.7.9 (compiled from source with the same gcc). To compile your C++ code I used g++ -O3 main.cpp -o main; to compile my C code I used gcc -O3 main.c -lm -o main. In all examples I filled the data variables with some numbers (0.1, 0.4), as it changes the results. I also changed the np.arrays to use doubles (dtype=np.float64), because there are doubles in the C++ example. My pure C version of your code (it's similar):

#include <math.h>
#include <stdio.h>
#include <time.h>

const int k_max = 100000;
const int N = 10000;

int main(void)
{
    clock_t t_start, t_end;
    double data1[N], data2[N], coefs1[k_max], coefs2[k_max], seconds;
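    int z;
    /* fill the input arrays with constant values */
    for( z = 0; z < N; z++ )
    {
        data1[z] = 0.1;
        data2[z] = 0.4;
    }
    /* zero the accumulators, since the main loop updates them with += */
    for( z = 0; z < k_max; z++ )
    {
        coefs1[z] = 0.0;
        coefs2[z] = 0.0;
    }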

    int i, j;
    t_start = clock();
    for( i = 0; i < k_max; i++ )
    {
        for( j = 0; j < N-1; j++ )
        {
            coefs1[i] += data2[j] * (cos((i+1) * data1[j]) - cos((i+1) * data1[j+1]));
            coefs2[i] += data2[j] * (sin((i+1) * data1[j]) - sin((i+1) * data1[j+1]));
        }
    }
    t_end = clock();

    seconds = (double)(t_end - t_start) / CLOCKS_PER_SEC;
    printf("Time: %f s\n", seconds);
    return coefs1[0];
}

For k_max = 100000, N = 10000 the results are as follows:

  • python 70.284362 s
  • c++ 69.133199 s
  • c 61.638186 s

Python and C++ take basically the same time here, but note that the Python version contains a Python-level loop of length k_max, which should be much slower than the equivalent C/C++ loop. And it is.

For k_max = 1000000, N = 1000 we have:

  • python 115.42766 s
  • c++ 70.781380 s

For k_max = 1000000, N = 100:

  • python 52.86826 s
  • c++ 7.050597 s

So the difference grows with the ratio k_max/N, but Python is not faster even when N is much bigger than k_max, e.g. k_max = 100, N = 100000:

  • python 0.651587 s
  • c++ 0.568518 s
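Since the gap grows with k_max/N, one way to shrink it on the Python side is to run fewer Python-level iterations. The following is only a sketch of that idea, not something from the benchmarks above: it assumes Python 3 (range instead of xrange) and a recent numpy, and the names coefs_loop, coefs_blocked and the block parameter are made up for illustration. It handles several values of k per iteration through broadcasting, at the cost of temporary arrays of shape (block, N).

```python
import numpy as np


def coefs_loop(data, k_max):
    """Reference version: one Python iteration per k, as in the question."""
    coefs = np.zeros((k_max, 2), dtype=np.float64)
    x, y = data[0], data[1]
    for k in range(1, k_max + 1):
        cos_k = np.cos(k * x)
        sin_k = np.sin(k * x)
        coefs[k - 1, 0] = (y[-1] - y[0]) + np.sum(y[:-1] * (cos_k[:-1] - cos_k[1:]))
        coefs[k - 1, 1] = np.sum(y[:-1] * (sin_k[:-1] - sin_k[1:]))
    return coefs


def coefs_blocked(data, k_max, block=64):
    """Hypothetical variant: process `block` values of k per Python iteration."""
    coefs = np.zeros((k_max, 2), dtype=np.float64)
    x, y = data[0], data[1]
    for start in range(1, k_max + 1, block):
        # k values handled in this iteration: start, start + 1, ...
        ks = np.arange(start, min(start + block, k_max + 1), dtype=np.float64)
        kx = ks[:, None] * x[None, :]  # shape (len(ks), N)
        cos_k = np.cos(kx)
        sin_k = np.sin(kx)
        rows = slice(start - 1, start - 1 + len(ks))
        # matrix-vector product replaces the per-k np.sum over N - 1 terms
        coefs[rows, 0] = (y[-1] - y[0]) + (cos_k[:, :-1] - cos_k[:, 1:]) @ y[:-1]
        coefs[rows, 1] = (sin_k[:, :-1] - sin_k[:, 1:]) @ y[:-1]
    return coefs
```

Both functions should agree up to floating-point rounding, which can be checked with np.allclose on random input. Whether the blocked version is actually faster depends on N and on the block size, so it needs to be timed on the target machine.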

Obviously, the main speed difference between C/C++ and Python is in the for loop. But I wanted to find out the difference between simple operations on arrays in numpy and in C. The advantages of using numpy in your code are: 1. multiplying the whole array by a number, 2. calculating sin/cos of the whole array, 3. summing all elements of the array, instead of doing those operations on every single item separately. So I prepared two scripts to compare only these operations.

Python script:

import numpy as np
from time import time

N = 10000
x_len = 100000

def main():
    x = np.ones(x_len, dtype=np.float64) * 1.2345

    start = time()
    for i in xrange(N):
        y1 = np.cos(x, dtype=np.float64)
    end = time()
    print('cos: {} s'.format(end-start))

    start = time()
    for i in xrange(N):
        y2 = x * 7.9463
    end = time()
    print('multi: {} s'.format(end-start))

    start = time()
    for i in xrange(N):
        res = np.sum(x, dtype=np.float64)
    end = time()
    print('sum: {} s'.format(end-start))

    return y1, y2, res

if __name__ == '__main__':
    main()

# results
# cos: 22.7199969292 s
# multi: 0.841291189194 s
# sum: 1.15971088409 s

C script:

#include <math.h>
#include <stdio.h>
#include <time.h>

const int N = 10000;
const int x_len = 100000;

int main()
{
    clock_t t_start, t_end;
    double x[x_len], y1[x_len], y2[x_len], res, time;
    int i, j;
    for( i = 0; i < x_len; i++ )
    {
        x[i] = 1.2345;
    }

    t_start = clock();
    for( j = 0; j < N; j++ )
    {
        for( i = 0; i < x_len; i++ )
        {
            y1[i] = cos(x[i]);
        }
    }
    t_end = clock();
    time = (double)(t_end - t_start) / CLOCKS_PER_SEC;
    printf("cos: %f s\n", time);

    t_start = clock();
    for( j = 0; j < N; j++ )
    {
        for( i = 0; i < x_len; i++ )
        {
            y2[i] = x[i] * 7.9463;
        }
    }
    t_end = clock();
    time = (double)(t_end - t_start) / CLOCKS_PER_SEC;
    printf("multi: %f s\n", time);

    t_start = clock();
    for( j = 0; j < N; j++ )
    {
        res = 0.0;
        for( i = 0; i < x_len; i++ )
        {
            res += x[i];
        }
    }
    t_end = clock();
    time = (double)(t_end - t_start) / CLOCKS_PER_SEC;
    printf("sum: %f s\n", time);

    return y1[0], y2[0], res;
}

// results
// cos: 20.910590 s
// multi: 0.633281 s
// sum: 1.153001 s

Python results:

  • cos: 22.7199969292 s
  • multi: 0.841291189194 s
  • sum: 1.15971088409 s

C results:

  • cos: 20.910590 s
  • multi: 0.633281 s
  • sum: 1.153001 s

As you can see, numpy is incredibly fast, but always a bit slower than pure C.
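To repeat the three numpy measurements on another machine, a small harness along the following lines may help. It is only a sketch under assumptions: it targets Python 3, uses the standard timeit module instead of manual time() calls, and the function name bench and its parameters are invented for this example. It reports the best of several repeats per call, which tends to be less noisy than a single run.

```python
import timeit

import numpy as np


def bench(x_len=100000, number=20, repeat=3):
    """Return the best-of-`repeat` time in seconds per call for each operation."""
    x = np.full(x_len, 1.2345, dtype=np.float64)
    ops = {
        "cos": lambda: np.cos(x),
        "multi": lambda: x * 7.9463,
        "sum": lambda: np.sum(x),
    }
    results = {}
    for name, op in ops.items():
        # each entry of `totals` is the time for `number` consecutive calls
        totals = timeit.repeat(op, repeat=repeat, number=number)
        results[name] = min(totals) / number
    return results


if __name__ == "__main__":
    for name, seconds in bench().items():
        print("{}: {:.6f} s per call".format(name, seconds))
```

The absolute numbers will differ from the ones quoted above, since they depend on the CPU, the numpy build and the math library it links against.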
