Memory error in Python while converting to array


Problem description

My code looks like this:

from sklearn.datasets import load_svmlight_files
import numpy as np

perm1 = np.random.permutation(25000)
perm2 = np.random.permutation(25000)

X_tr, y_tr, X_te, y_te = load_svmlight_files(("dir/file.feat", "dir/file.feat"))

#randomly shuffle data
X_train = X_tr[perm1,:].toarray()[:,0:2000]
y_train = y_tr[perm1]>5 #turn into binary problem

The code works fine up to this point, but when I try to convert one more object to an array, my program raises a memory error.

Code:

X_test = X_te[perm2,:].toarray()[:,0:2000]

Error:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-7-31f5e4f6b00c> in <module>()
----> 1 X_test = X_test.toarray()

C:\Users\Asq\AppData\Local\Enthought\Canopy\User\lib\site-packages\scipy\sparse\compressed.pyc in toarray(self, order, out)
    788     def toarray(self, order=None, out=None):
    789         """See the docstring for `spmatrix.toarray`."""
--> 790         return self.tocoo(copy=False).toarray(order=order, out=out)
    791 
    792     ##############################################################

C:\Users\Asq\AppData\Local\Enthought\Canopy\User\lib\site-packages\scipy\sparse\coo.pyc in toarray(self, order, out)
    237     def toarray(self, order=None, out=None):
    238         """See the docstring for `spmatrix.toarray`."""
--> 239         B = self._process_toarray_args(order, out)
    240         fortran = int(B.flags.f_contiguous)
    241         if not fortran and not B.flags.c_contiguous:

C:\Users\Asq\AppData\Local\Enthought\Canopy\User\lib\site-packages\scipy\sparse\base.pyc in _process_toarray_args(self, order, out)
    697             return out
    698         else:
--> 699             return np.zeros(self.shape, dtype=self.dtype, order=order)
    700 
    701 

MemoryError: 

I'm new to Python, and I don't know whether I need to fix the memory error manually.

Other parts of my code raise the same error (for example, training with kNN or an ANN).

How can I solve this?

Recommended answer

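In cases like this, it's often possible to avoid converting your sparse matrices to dense format at all. The traceback shows why the conversion fails: toarray() calls np.zeros(self.shape, ...), which allocates one entry for every row and column of the full matrix, zeros included.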
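To get a feel for the sizes involved, here is a rough sketch that compares the memory a dense array would need with what a CSR matrix actually stores. The helper names `dense_nbytes` and `csr_nbytes`, the 50000-column figure, and the small random matrix are all assumptions for illustration; the question does not state the real number of features.

```python
import numpy as np
from scipy import sparse


def dense_nbytes(shape, dtype=np.float64):
    """Bytes that np.zeros(shape, dtype=dtype) would have to allocate."""
    rows, cols = shape
    return int(rows) * int(cols) * np.dtype(dtype).itemsize


def csr_nbytes(m):
    """Bytes held by the three arrays that back a CSR matrix."""
    return m.data.nbytes + m.indices.nbytes + m.indptr.nbytes


# Hypothetical full-size case: 25000 rows and an assumed 50000 feature columns.
# Only the shape is used here, nothing is allocated.
full_dense = dense_nbytes((25000, 50000))
print("dense 25000 x 50000 float64: %.1f GB" % (full_dense / 1e9))

# Small random stand-in (assumed 0.1% non-zero) to compare both layouts.
X = sparse.random(2000, 5000, density=0.001, format="csr", random_state=0)
print("dense:  %d bytes" % dense_nbytes(X.shape, X.dtype))
print("sparse: %d bytes" % csr_nbytes(X))
```

If the real feature count is in this range, a single dense copy may already exceed the available RAM, which would explain why the second toarray() call fails.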

For example, both the permutation and the slice can be done directly on a CSR or CSC sparse matrix, so there is no need to call toarray() on the full matrix first.
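A minimal sketch of that idea, assuming `X_te` is a CSR matrix like the one `load_svmlight_files` returns. The random matrix below is only a small hypothetical stand-in for the real data; the variable names follow the question.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for X_te; the real one comes from load_svmlight_files.
rng = np.random.RandomState(0)
X_te = sparse.random(500, 5000, density=0.01, format="csr", random_state=rng)
perm2 = rng.permutation(X_te.shape[0])

# Shuffle the rows and keep the first 2000 columns while the data is still sparse.
X_test_sparse = X_te[perm2, :][:, :2000]

# Only if a dense array is really required: convert the reduced matrix,
# not the full one.
X_test = X_test_sparse.toarray()
```

The difference from the code in the question is only the order of operations: slicing before toarray() means the dense allocation covers 2000 columns instead of every column in the file. With 25000 rows of float64 that is about 400 MB per matrix.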

You haven't posted the code that follows, but I suspect it can be made to handle sparse inputs as well. If so, your memory problem should go away.
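As an illustration of passing sparse input straight to a model, here is a sketch using scikit-learn's `KNeighborsClassifier`, which accepts SciPy sparse matrices in `fit` and `predict`. The data is random and hypothetical, and whether the asker's actual kNN or ANN code accepts sparse input is an assumption that would need checking.

```python
import numpy as np
from scipy import sparse
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical sparse training and test sets with binary labels.
rng = np.random.RandomState(0)
X_train = sparse.random(200, 2000, density=0.01, format="csr", random_state=rng)
y_train = rng.rand(200) > 0.5
X_test = sparse.random(50, 2000, density=0.01, format="csr", random_state=rng)

# The matrices are passed as-is; no toarray() call is needed.
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
print(pred[:10])
```

Not every estimator supports sparse input, so it is worth checking the documentation of the specific model before dropping the dense conversion.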
