Why is matrix product slower when matrix has very small values?


Problem description

I create two matrices A and B of the same dimension. A contains larger values than B. The matrix multiplication A %*% A is about 10 times faster than B %*% B.

Why is this?

## disable openMP
library(RhpcBLASctl); blas_set_num_threads(1); omp_set_num_threads(1)

A <- exp(-as.matrix(dist(expand.grid(1:60, 1:60))))
summary(c(A))
#     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
# 0.000000 0.000000 0.000000 0.001738 0.000000 1.000000 

B <- exp(-as.matrix(dist(expand.grid(1:60, 1:60)))*10)
summary(c(B))
#      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
# 0.0000000 0.0000000 0.0000000 0.0002778 0.0000000 1.0000000 

identical(dim(A), dim(B))
## [1] TRUE

system.time(A %*% A)
#    user  system elapsed 
#   2.387   0.001   2.389 
system.time(B %*% B)
#    user  system elapsed 
#  21.285   0.020  21.310

sessionInfo()
# R version 3.6.1 (2019-07-05)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Linux Mint 19.2

# Matrix products: default
# BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
# LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

The question could be related to "base::chol() slows down when matrix contains many small entries".

Edit: Some small numbers seem to slow down the computation, while others do not.

slow <-  6.41135533887904e-164
fast1 <- 6.41135533887904e-150
fast2 <- 6.41135533887904e-170

Mslow <- array(slow, c(1000, 1000)); system.time(Mslow %*% Mslow)
#   user  system elapsed 
# 10.165   0.000  10.168 

Mfast1 <- array(fast1, c(1000, 1000)); system.time(Mfast1 %*% Mfast1)
#   user  system elapsed 
#  0.058   0.000   0.057 

Mfast2 <- array(fast2, c(1000, 1000)); system.time(Mfast2 %*% Mfast2)
#   user  system elapsed 
#  0.056   0.000   0.055 

Solution

You most likely want to use .Machine$double.xmin instead of .Machine$double.eps as the cutoff. This sets far fewer numbers to zero and has the same effect. To avoid subnormal numbers (nonzero doubles smaller in magnitude than .Machine$double.xmin), you might have to recompile BLAS using compiler flags that set those numbers to zero instead of raising an FP trap.
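A rough way to check this on the data from the question is to count the subnormal entries directly. The sketch below is only an illustration: the helper name `is_subnormal` and the variable names are made up for this example, and it looks at the pairwise distances (the off-diagonal entries of A and B) rather than the full matrices to keep memory use down.

```r
# Smallest positive normalized double (about 2.2e-308) and machine epsilon (about 2.2e-16).
xmin <- .Machine$double.xmin
eps  <- .Machine$double.eps

# Hypothetical helper: TRUE where x is nonzero but smaller in magnitude than xmin.
is_subnormal <- function(x) x != 0 & abs(x) < xmin

# Pairwise distances on the 60 x 60 grid, i.e. the off-diagonal entries
# of the distance matrix used in the question (each pair counted once).
d <- as.vector(dist(expand.grid(1:60, 1:60)))
a_vals <- exp(-d)        # entries of A; the smallest is about exp(-83.4)
b_vals <- exp(-d * 10)   # entries of B; distances above roughly 70.8 fall below xmin

n_sub_a <- sum(is_subnormal(a_vals))
n_sub_b <- sum(is_subnormal(b_vals))

# Number of nonzero entries of B that each cutoff would set to zero.
n_zeroed_xmin <- sum(b_vals != 0 & b_vals < xmin)
n_zeroed_eps  <- sum(b_vals != 0 & b_vals < eps)

# The three constants from the edit are normal numbers themselves ...
vals <- c(slow  = 6.41135533887904e-164,
          fast1 = 6.41135533887904e-150,
          fast2 = 6.41135533887904e-170)
vals_subnormal <- is_subnormal(vals)

# ... but for some of them the exact product x * x lies below xmin,
# so the underflow happens inside the multiplication, not in the inputs.
product_below_xmin <- 2 * log10(vals) < log10(xmin)

print(c(n_sub_a = n_sub_a, n_sub_b = n_sub_b))
print(c(zeroed_xmin = n_zeroed_xmin, zeroed_eps = n_zeroed_eps))
print(vals_subnormal)
print(product_below_xmin)
```

Applied to the matrix itself, the cutoff would be something like `B[B < .Machine$double.xmin] <- 0`. Note that this only removes subnormal inputs. The constants in the edit are all above the cutoff, so it leaves them unchanged, which is presumably where the BLAS-level handling of underflow mentioned above comes in.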
