Gaussian kernel performance


Problem description


The following method calculates a Gaussian kernel:

import numpy as np
def gaussian_kernel(X, X2, sigma):
    """
    Calculate the Gaussian kernel matrix

        k_ij = exp(-||x_i - x_j||^2 / (2 * sigma^2))

    :param X: array-like, shape=(n_samples_1, n_features), feature-matrix
    :param X2: array-like, shape=(n_samples_2, n_features), feature-matrix
    :param sigma: scalar, bandwidth parameter

    :return: array-like, shape=(n_samples_1, n_samples_2), kernel matrix
    """

    norm = np.square(np.linalg.norm(X[None,:,:] - X2[:,None,:], axis=2).T)    
    return np.exp(-norm/(2*np.square(sigma)))

# Usage example
%timeit gaussian_kernel(np.random.rand(5000, 10), np.random.rand(5000, 10), 1)

1.43 s ± 39.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

My question is: is there any way to improve performance using numpy?

Solution

For quite small arrays, you can write a simple loop implementation and compile it with Numba. For larger arrays, the algebraic reformulation using np.dot() will be faster.

Example

# From Numba version 0.43 until 0.47, this has to be set before importing numba
# Bug: https://github.com/numba/numba/issues/4689
from llvmlite import binding
binding.set_option('SVML', '-vector-library=SVML')
import numba as nb
import numpy as np

@nb.njit(fastmath=True, error_model="numpy", parallel=True)
def gaussian_kernel_2(X, X2, sigma):
    res = np.empty((X.shape[0], X2.shape[0]), dtype=X.dtype)
    # The outer loop over the rows of X runs in parallel
    for i in nb.prange(X.shape[0]):
        for j in range(X2.shape[0]):
            # Accumulate ||x_i - x_j||^2 / (2 * sigma^2) over the features
            acc = 0.
            for k in range(X.shape[1]):
                acc += (X[i, k] - X2[j, k])**2 / (2 * sigma**2)
            res[i, j] = np.exp(-1 * acc)
    return res

Timings

X1=np.random.rand(5000, 10)
X2=np.random.rand(5000, 10)

#Your solution
%timeit gaussian_kernel(X1,X2, 1)
#511 ms ± 10.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit gaussian_kernel_2(X1,X2, 1)
#90.1 ms ± 9.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
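
The answer mentions an algebraic reformulation using np.dot() but does not show it. A minimal sketch of what that could look like follows, based on the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y. The function name gaussian_kernel_dot is hypothetical, and the clipping step is an added assumption to guard against small negative values from rounding. No timings are claimed for it here.

```python
import numpy as np

def gaussian_kernel_dot(X, X2, sigma):
    """Gaussian kernel via ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y (illustrative sketch)."""
    # Squared row norms, shapes (n_samples_1,) and (n_samples_2,)
    X_sq = np.einsum("ij,ij->i", X, X)
    X2_sq = np.einsum("ij,ij->i", X2, X2)
    # Pairwise squared distances, shape (n_samples_1, n_samples_2)
    sq_dist = X_sq[:, None] + X2_sq[None, :] - 2.0 * np.dot(X, X2.T)
    # Assumption: clip tiny negative values that rounding can produce
    np.maximum(sq_dist, 0.0, out=sq_dist)
    return np.exp(-sq_dist / (2.0 * sigma**2))
```

This version avoids building the three-dimensional difference array that the broadcasting approach creates, and leaves most of the work to a single matrix product. Because it subtracts large, nearly equal terms, results for very close points may differ slightly from the direct computation.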
