Python (numpy) crashes system with large number of array elements

Views: 67 Published: 2021/4/16 20:42:02 Tags: python arrays numpy classification ocr

Problem description

I'm trying to build a basic character recognition model using the many classifiers that scikit provides. The dataset being used is a standard handwritten set of alphanumeric samples (Chars74K image dataset taken from this source: EnglishHnd.tgz).

There are 55 samples of each character (62 alphanumeric characters in all), each 900x1200 pixels. I convert each image to grayscale and then flatten the matrix into a 1x1080000 array (each element representing one feature).

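import numpy as np
from skimage.io import imread       # assumption: imread and rgb2gray come from scikit-image
from skimage.color import rgb2gray

n, m = 0, 0             # n and m are global variables (image height and width)
samples = np.array([])  # samples is the final numpy ndarray

for sample in sample_images:  # sample_images is the list of the .png files
    img = imread(sample)
    img_gray = rgb2gray(img)
    if n == 0 and m == 0:
        n, m = np.shape(img_gray)
    img_gray = np.reshape(img_gray, n * m)     # flatten to one row of n*m pixels
    img_gray = np.append(img_gray, sample_id)  # sample_id stores the label of the training sample
    if len(samples) == 0:
        samples = np.append(samples, img_gray)
        samples = np.reshape(samples, [1, n * m + 1])
    else:
        samples = np.append(samples, [img_gray], axis=0)  # np.append returns a new, larger copy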
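
A note on the loop above: np.append never grows an array in place. It allocates a new array and copies everything into it, so each iteration briefly needs room for both the old and the new copy. One common alternative is to allocate the full array once and fill it row by row. The following is only a minimal sketch of that idea, not a tested fix for this program. The helper name stack_samples, the synthetic 9x12 images, and the choice of uint8 storage are all assumptions made for illustration. If the grayscale values are floats in [0, 1], they would need to be scaled to 0-255 first (or a float dtype passed instead), because assigning them to a uint8 array would truncate them to 0.

```python
import numpy as np

def stack_samples(images, labels, dtype=np.uint8):
    """Flatten equally sized 2-D grayscale images into one preallocated array.

    Hypothetical helper: labels are kept in a separate array instead of
    being appended to each feature row.
    """
    n, m = images[0].shape
    # Single allocation up front instead of one copy per np.append call.
    features = np.empty((len(images), n * m), dtype=dtype)
    for i, img in enumerate(images):
        features[i] = img.reshape(-1)  # write the flattened image into row i
    return features, np.asarray(labels)

# Tiny synthetic stand-ins for the real 900x1200 images (made-up data).
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(9, 12), dtype=np.uint8) for _ in range(5)]
fake_labels = [0, 1, 2, 3, 4]

features, labels = stack_samples(fake_images, fake_labels)
print(features.shape, features.dtype)
```

Keeping the labels in their own array also avoids forcing the label and the pixel values into the same dtype.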

So the final data structure should hold 55x62 (3410) arrays, each with a capacity of 1080000 elements. Only the final structure is being stored (the scope of the intermediate matrices is local).
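
As a rough check on the size of that structure, the arithmetic can be worked out directly. This sketch assumes the grayscale values are stored as 64-bit floats (NumPy's default float type) and that each row carries one extra element for the label. The function name estimate_gb is hypothetical, and the float32 and uint8 rows are included only for comparison.

```python
import numpy as np

N_SAMPLES = 55 * 62          # 3410 images in total
N_VALUES = 900 * 1200 + 1    # pixels per image plus the appended label

def estimate_gb(dtype):
    """Size in gigabytes (10**9 bytes) of the full array for a given dtype."""
    return N_SAMPLES * N_VALUES * np.dtype(dtype).itemsize / 1e9

for dtype in (np.float64, np.float32, np.uint8):
    print(f"{np.dtype(dtype).name}: {estimate_gb(dtype):.1f} GB")
# float64: 29.5 GB
# float32: 14.7 GB
# uint8: 3.7 GB
```

Under these assumptions the float64 version is several times larger than 8 GB of RAM, before counting the temporary copies that np.append creates.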

The amount of data being stored to learn the model is pretty large (I guess), because the program stops making progress beyond a certain point, and it crashed my system to the extent that the BIOS had to be repaired!

Up to this point, the program is only gathering the data to send to the classifier ... the classification hasn't even been introduced into the code yet.

Any suggestions as to what can be done to handle the data more efficiently?

Note: I'm using numpy to store the final structure of flattened matrices. Also, the system has 8 GB of RAM.
