性能:Matlab 与 Python [英] Performance: Matlab vs Python

查看:21
本文介绍了性能:Matlab 与 Python的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我最近从 Matlab 切换到 Python.在转换我冗长的代码之一时,我惊讶地发现 Python 非常慢.我用一个函数占用了时间来分析和跟踪问题.这个函数是从我的代码中的不同地方调用的(作为递归调用的其他函数的一部分).Profiler 建议在 MatlabPython 中对该函数进行 300 次调用.

I recently switched from Matlab to Python. While converting one of my lengthy codes, I was surprised to find Python being very slow. I profiled and traced the problem with one function hogging up time. This function is being called from various places in my code (being part of other functions which are recursively called). Profiler suggests that 300 calls are made to this function in both Matlab and Python.

简而言之,以下代码总结了手头的问题:

In short, following codes summarizes the issue at hand:

MATLAB

包含函数的类:

classdef ExampleKernel1 < handle  
methods (Static)
    function [kernel] = kernel_2D(M,x,N,y) 
        kernel  = zeros(M,N);
        for i= 1 : M
            for j= 1 : N
                % Define the custom kernel function here
                kernel(i , j) = sqrt((x(i , 1) - y(j , 1)) .^ 2 + ...
                                (x(i , 2) - y(j , 2)) .^2 );             
            end
        end
    end
end
end

以及调用 test.m 的脚本:

and the script to call test.m:

xVec=[   
49.7030   78.9590
42.6730   11.1390
23.2790   89.6720
75.6050   25.5890
81.5820   53.2920
44.9680    2.7770
38.7890   78.9050
39.1570   33.6790
33.2640   54.7200
4.8060   44.3660
49.7030   78.9590
42.6730   11.1390
23.2790   89.6720
75.6050   25.5890
81.5820   53.2920
44.9680    2.7770
38.7890   78.9050
39.1570   33.6790
33.2640   54.7200
4.8060   44.3660
];
N=size(xVec,1);
kex1=ExampleKernel1;
tic
for i=1:300
    K=kex1.kernel_2D(N,xVec,N,xVec);
end
toc

给出输出

clear all
>> test
Elapsed time is 0.022426 seconds.
>> test
Elapsed time is 0.009852 seconds.

PYTHON 3.4

包含函数 CustomKernels.py 的类:

Class containing the function CustomKernels.py:

from numpy import zeros
from math import sqrt
class CustomKernels:
"""Class for defining the custom kernel functions"""
    @staticmethod
    def exampleKernelA(M, x, N, y):
        """Example kernel function A"""
        kernel = zeros([M, N])
        for i in range(0, M):
            for j in range(0, N):
                # Define the custom kernel function here
                kernel[i, j] = sqrt((x[i, 0] - y[j, 0]) ** 2 + (x[i, 1] - y[j, 1]) ** 2)
        return kernel

和调用test.py的脚本:

and the script to call test.py:

import numpy as np
from CustomKernels import CustomKernels
from time import perf_counter

xVec = np.array([
    [49.7030,  78.9590],
    [42.6730,  11.1390],
    [23.2790,  89.6720],
    [75.6050,  25.5890],
    [81.5820,  53.2920],
    [44.9680,   2.7770],
    [38.7890,  78.9050],
    [39.1570,  33.6790],
    [33.2640,  54.7200],
    [4.8060 ,  44.3660],
    [49.7030,  78.9590],
    [42.6730,  11.1390],
    [23.2790,  89.6720],
    [75.6050,  25.5890],
    [81.5820,  53.2920],
    [44.9680,   2.7770],
    [38.7890,  78.9050],
    [39.1570,  33.6790],
    [33.2640,  54.7200],
    [4.8060 ,  44.3660]
    ])
N = xVec.shape[0]
kex1 = CustomKernels.exampleKernelA
start=perf_counter()
for i in range(0,300):
    K = kex1(N, xVec, N, xVec)
print(' %f secs' %(perf_counter()-start))

给出输出

%run test.py
 0.940515 secs
%run test.py
 0.884418 secs
%run test.py
 0.940239 secs

结果

比较结果似乎 Matlab 在调用clear all"后大约快 42 倍,如果脚本多次运行而不调用则快 100 倍"清除所有".这至少是一个数量级,如果不是快两个数量级的话.这对我来说是一个非常令人惊讶的结果.我期待结果是相反的.

Comparing the results it seems Matlab is about 42 times faster after a "clear all" is called and then 100 times faster if script is run multiple times without calling "clear all". That is at least and order of magnitude if not two orders of magnitudes faster. This is a very surprising result for me. I was expecting the result to be the other way around.

有人可以解释一下吗?

有人可以建议一种更快的方法来执行此操作吗?

Can someone suggest a faster way to perform this?

边注

我也尝试使用 numpy.sqrt 这会使性能变差,因此我在 Python 中使用 math.sqrt.

I have also tried to use numpy.sqrt which makes the performance worse, therefore I am using math.sqrt in Python.

编辑

用于调用函数的 for 循环纯属虚构.它们只是为了模拟"300 次调用函数.正如我之前所描述的,内核函数(Matlab 中的 kernel_2DPython 中的 kex1)是从各种不同的程序中的位置.为了简化问题,我使用 for 循环模拟"300 次调用.由于核矩阵的结构,核函数内的for循环是必不可少的,也是不可避免的.

The for loops for calling the functions are purely fictitious. They are there just to "simulate" 300 calls to the function. As I described earlier, the kernel functions (kernel_2D in Matlab and kex1 in Python) are called from various different places in the program. To make the problem shorter, I "simulate" the 300 calls using the for loop. The for loops inside the kernel functions are essential and unavoidable because of the structure of the kernel matrix.

编辑 2

这里是更大的问题:https://github.com/drfahdsiddiqui/bbfmm2d-python

推荐答案

您想摆脱那些 for 循环.试试这个:

You want to get rid of those for loops. Try this:

def exampleKernelA(M, x, N, y):
    """Example kernel function A"""
    i, j = np.indices((N, M))
    # Define the custom kernel function here
    kernel[i, j] = np.sqrt((x[i, 0] - y[j, 0]) ** 2 + (x[i, 1] - y[j, 1]) ** 2)
    return kernel

您也可以通过广播来实现,这可能会更快,但来自 MATLAB 的直观性稍差.

You can also do it with broadcasting, which may be even faster, but a little less intuitive coming from MATLAB.

这篇关于性能:Matlab 与 Python的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆