Numpy efficient big matrix multiplication


Problem description

To store big matrix on disk I use numpy.memmap.

Here is a sample code to test big matrix multiplication:

import numpy as np
import time

rows = 10000  # could be much larger, e.g. 1,000,000
cols = 1000

# create some data in memory
data = np.arange(rows * cols, dtype='float32')
data.resize((rows, cols))

# create memory-mapped files on disk
fp0 = np.memmap('C:/data_0', dtype='float32', mode='w+', shape=(rows, cols))
fp1 = np.memmap('C:/data_1', dtype='float32', mode='w+', shape=(rows, cols))

fp0[:] = data[:]
fp1[:] = data[:]

# matrix transpose test
tr = np.transpose(fp1)  # a transposed view of fp1 -- how much memory does this consume?
print(fp1.shape)
print(tr.shape)

res = np.memmap('C:/data_res', dtype='float32', mode='w+', shape=(rows, rows))
t0 = time.time()
# res = np.dot(fp0, tr) would rebind res to a new in-memory array;
# assigning into the memmap keeps the result on disk:
res[:] = np.dot(fp0, tr)  # takes 342 s on my machine; multiplying in RAM takes
                          # 345 s (I think that's a strange result)
print(res.shape)
print(time.time() - t0)

So my questions are:

  1. How can I restrict the memory consumption of this procedure to some value, e.g. 100 MB (or 1 GB, or whatever)? I also don't understand how to estimate the procedure's memory consumption (I think memory is only allocated when the "data" variable is created, but how much memory is used when the memmap files are in play?)
  2. Maybe there is some optimal solution for multiplying large matrices stored on disk? For example, maybe the data is not stored optimally on disk or is not read from disk properly, and the dot product uses only one core. Maybe I should use something like PyTables?
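Question 1 (bounding peak memory) can be addressed directly with a blocked multiply: process the matrices one tile at a time so that only a few small blocks are in RAM at once, regardless of the total size on disk. The following is a minimal sketch, not the asker's code; the `block` size and the small in-memory demo arrays are illustrative, and the same function works unchanged when `a`, `b`, and `out` are `np.memmap` arrays.

```python
import numpy as np

def blocked_matmul(a, b, out, block=128):
    """Compute out = a @ b one (block x block) tile at a time.

    a, b, out may be np.memmap arrays; peak RAM is roughly three
    block x block tiles instead of three full matrices.
    """
    n, k = a.shape
    k2, m = b.shape
    assert k == k2, "inner dimensions must match"
    for i in range(0, n, block):
        for j in range(0, m, block):
            # accumulator tile, sized to handle the ragged edges
            acc = np.zeros((min(block, n - i), min(block, m - j)),
                           dtype=out.dtype)
            for p in range(0, k, block):
                acc += a[i:i+block, p:p+block] @ b[p:p+block, j:j+block]
            out[i:i+block, j:j+block] = acc
    return out

# small in-memory demo; with memmaps the same call runs out-of-core
a = np.random.rand(300, 200).astype('float32')
b = np.random.rand(200, 250).astype('float32')
out = np.empty((300, 250), dtype='float32')
blocked_matmul(a, b, out)
```

With `block = 2048` and float32 data, the three resident tiles occupy roughly 3 × 2048² × 4 bytes ≈ 50 MB, so the block size is the knob that sets the memory ceiling.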

I'm also interested in algorithms for solving linear systems of equations (SVD and others) with restricted memory usage. Maybe such algorithms are called out-of-core or iterative; I think there are analogies like hard drive<->RAM, GPU RAM<->CPU RAM, CPU RAM<->CPU cache.

Also here I found some info about matrix multiplication in PyTables.

I also found this in R, but I need it for Python or Matlab.

Answer

Dask.array provides a numpy interface to large on-disk arrays using blocked algorithms and task scheduling. It can easily do out-of-core matrix multiplies and other simple-ish numpy operations.
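As a sketch of that workflow (assuming dask is installed; the temporary file names, shapes, and chunk size here are illustrative, not from the original post): wrap the on-disk memmaps as chunked dask arrays, express the product lazily, and stream the result blocks into an output memmap with `da.store`.

```python
import os
import tempfile
import numpy as np
import dask.array as da

tmp = tempfile.mkdtemp()
rows, cols = 2000, 500

# fill an on-disk memmap with some data (stands in for the big input file)
mm = np.memmap(os.path.join(tmp, 'data_0.dat'), dtype='float32',
               mode='w+', shape=(rows, cols))
mm[:] = np.random.rand(rows, cols)

x = da.from_array(mm, chunks=(500, 500))  # chunk size bounds per-task memory
res = x.dot(x.T)                          # lazy, blocked matrix multiply

# stream the (rows x rows) result block by block into another memmap
out = np.memmap(os.path.join(tmp, 'data_res.dat'), dtype='float32',
                mode='w+', shape=(rows, rows))
da.store(res, out)
```

Nothing larger than a handful of chunks is ever materialized in RAM; the scheduler walks the block graph and writes each finished tile straight to disk.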

Blocked linear algebra is harder and you might want to check out some of the academic work on this topic. Dask does support QR and SVD factorizations on tall-and-skinny matrices.
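For instance, a tall-and-skinny QR with dask looks like the following (a minimal sketch with illustrative shapes; the key constraint is chunking along rows only, with the full column width in each chunk):

```python
import numpy as np
import dask.array as da

# tall-and-skinny: many rows, few columns, chunked only along rows
xs = np.random.rand(10000, 20)
x = da.from_array(xs, chunks=(1000, 20))

q, r = da.linalg.qr(x)       # direct tall-and-skinny QR, block by block
qn, rn = da.compute(q, r)    # materialize the two factors
```

Each row-chunk is factored independently and the small R pieces are combined, so the full matrix is never held in memory at once.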

Regardless, for large arrays you really want blocked algorithms, not naive traversals, which will hit the disk in unpleasant ways.

