Performance: Matlab vs Python

Question

I recently switched from Matlab to Python. While converting one of my lengthy codes, I was surprised to find Python being very slow. I profiled the code and traced the problem to one function hogging the time. This function is called from various places in my code (it is part of other functions which are called recursively). The profiler reports that 300 calls are made to this function in both Matlab and Python.

In short, the following code summarizes the issue at hand:

MATLAB

Class containing the function:

classdef ExampleKernel1 < handle  
methods (Static)
    function [kernel] = kernel_2D(M,x,N,y) 
        kernel  = zeros(M,N);
        for i= 1 : M
            for j= 1 : N
                % Define the custom kernel function here
                kernel(i , j) = sqrt((x(i , 1) - y(j , 1)) .^ 2 + ...
                                (x(i , 2) - y(j , 2)) .^2 );             
            end
        end
    end
end
end

and the script that calls it, test.m:

xVec=[   
49.7030   78.9590
42.6730   11.1390
23.2790   89.6720
75.6050   25.5890
81.5820   53.2920
44.9680    2.7770
38.7890   78.9050
39.1570   33.6790
33.2640   54.7200
4.8060   44.3660
49.7030   78.9590
42.6730   11.1390
23.2790   89.6720
75.6050   25.5890
81.5820   53.2920
44.9680    2.7770
38.7890   78.9050
39.1570   33.6790
33.2640   54.7200
4.8060   44.3660
];
N=size(xVec,1);
kex1=ExampleKernel1;
tic
for i=1:300
    K=kex1.kernel_2D(N,xVec,N,xVec);
end
toc

gives the output:

clear all
>> test
Elapsed time is 0.022426 seconds.
>> test
Elapsed time is 0.009852 seconds.

PYTHON 3.4

Class containing the function, CustomKernels.py:

from numpy import zeros
from math import sqrt
class CustomKernels:
    """Class for defining the custom kernel functions"""
    @staticmethod
    def exampleKernelA(M, x, N, y):
        """Example kernel function A"""
        kernel = zeros([M, N])
        for i in range(0, M):
            for j in range(0, N):
                # Define the custom kernel function here
                kernel[i, j] = sqrt((x[i, 0] - y[j, 0]) ** 2 + (x[i, 1] - y[j, 1]) ** 2)
        return kernel

and the script that calls it, test.py:

import numpy as np
from CustomKernels import CustomKernels
from time import perf_counter

xVec = np.array([
    [49.7030,  78.9590],
    [42.6730,  11.1390],
    [23.2790,  89.6720],
    [75.6050,  25.5890],
    [81.5820,  53.2920],
    [44.9680,   2.7770],
    [38.7890,  78.9050],
    [39.1570,  33.6790],
    [33.2640,  54.7200],
    [4.8060 ,  44.3660],
    [49.7030,  78.9590],
    [42.6730,  11.1390],
    [23.2790,  89.6720],
    [75.6050,  25.5890],
    [81.5820,  53.2920],
    [44.9680,   2.7770],
    [38.7890,  78.9050],
    [39.1570,  33.6790],
    [33.2640,  54.7200],
    [4.8060 ,  44.3660]
    ])
N = xVec.shape[0]
kex1 = CustomKernels.exampleKernelA
start=perf_counter()
for i in range(0,300):
    K = kex1(N, xVec, N, xVec)
print(' %f secs' %(perf_counter()-start))

gives the output:

%run test.py
 0.940515 secs
%run test.py
 0.884418 secs
%run test.py
 0.940239 secs

Results

Comparing the results, Matlab is about 42 times faster right after "clear all" is called, and about 100 times faster when the script is run repeatedly without calling "clear all". That is at least one order of magnitude faster, if not two. This is a very surprising result for me. I was expecting the result to be the other way around.

Can someone please shed some light on this?

Can someone suggest a faster way to perform this?

Side note

I have also tried to use numpy.sqrt which makes the performance worse, therefore I am using math.sqrt in Python.
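
The gap between the two square-root functions on scalar inputs can be checked with a small timing sketch like the one below. It is only an illustration: `value` and the call count are arbitrary choices, not taken from the code above, and the absolute timings depend on the machine.

```python
import math
import timeit

import numpy as np

value = 1234.5  # arbitrary scalar chosen for illustration

# Time many scalar calls of each function; absolute numbers vary by machine.
t_math = timeit.timeit(lambda: math.sqrt(value), number=100000)
t_numpy = timeit.timeit(lambda: np.sqrt(value), number=100000)

print("math.sqrt: %f secs" % t_math)
print("np.sqrt:   %f secs" % t_numpy)

# Both functions compute the same value; only the per-call overhead differs.
same_result = math.isclose(math.sqrt(value), float(np.sqrt(value)))
```

numpy.sqrt is designed to work on whole arrays, so calling it once per element inside a Python loop pays its dispatch overhead on every call.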

Edit

The for loops for calling the functions are purely fictitious. They are there just to "simulate" 300 calls to the function. As I described earlier, the kernel functions (kernel_2D in Matlab and kex1 in Python) are called from various different places in the program. To make the problem shorter, I "simulate" the 300 calls using the for loop. The for loops inside the kernel functions are essential and unavoidable because of the structure of the kernel matrix.

Edit 2

Here is the larger problem: https://github.com/drfahdsiddiqui/bbfmm2d-python

Recommended answer

You want to get rid of those for loops. Try this:

import numpy as np

def exampleKernelA(M, x, N, y):
    """Example kernel function A"""
    # i and j are (M, N) index grids: i[r, c] == r and j[r, c] == c
    i, j = np.indices((M, N))
    # Define the custom kernel function here
    kernel = np.sqrt((x[i, 0] - y[j, 0]) ** 2 + (x[i, 1] - y[j, 1]) ** 2)
    return kernel

You can also do it with broadcasting, which may be even faster, but a little less intuitive coming from MATLAB.
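
A minimal sketch of what the broadcasting version might look like, assuming `x` and `y` are 2-D arrays of shape (M, 2) and (N, 2). The function names `example_kernel_broadcast` and `example_kernel_loop` and the small input arrays are made up for illustration and are not from the question. The loop version is included only to check that both give the same matrix.

```python
import numpy as np


def example_kernel_broadcast(x, y):
    """Pairwise Euclidean distances between rows of x (M, 2) and y (N, 2)."""
    # x[:, None, :] has shape (M, 1, 2) and y[None, :, :] has shape (1, N, 2);
    # subtracting broadcasts them to (M, N, 2), one difference vector per pair.
    diff = x[:, None, :] - y[None, :, :]
    # Sum the squared components over the last axis, then take the root.
    return np.sqrt((diff ** 2).sum(axis=-1))


def example_kernel_loop(x, y):
    """Reference double loop, same formula as exampleKernelA in the question."""
    M, N = x.shape[0], y.shape[0]
    kernel = np.zeros((M, N))
    for i in range(M):
        for j in range(N):
            kernel[i, j] = np.sqrt((x[i, 0] - y[j, 0]) ** 2 + (x[i, 1] - y[j, 1]) ** 2)
    return kernel


# Small made-up input, not the data from the question.
x = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
y = np.array([[0.0, 0.0], [6.0, 8.0]])

K = example_kernel_broadcast(x, y)
print(K.shape)  # (3, 2)
print(np.allclose(K, example_kernel_loop(x, y)))  # True
```

Here the row and column counts come from the array shapes, so the separate `M` and `N` arguments are not needed. The trade-off is a temporary (M, N, 2) array, which is negligible for the 20-point input in the question.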
