Memory Error while calling genfromtxt method
Problem description
code:
import scipy as sp
import matplotlib.pyplot as plt
data=sp.genfromtxt("data/train.tsv", delimiter ="\t", dtype="string", comments=None, skip_header=1)
x = data[:,0]
y = data[:,1]
x = x[~sp.isnan(y)]
y = x[~sp.isnan(y)]
DataOfInterest=x["avglinksize"]
EphemeralOrEvergreen=x["label"]
plt.scatter(DataOfInterest,EphemeralOrEvergreen)
plt.title("Training data")
plt.xlabel("Single feature from training set")
plt.ylabel("Ephemeral or Evergreen")
plt.grid()
plt.show()
Output:
python GenGraphs.py
Traceback (most recent call last):
  File "GenGraphs.py", line 4, in <module>
    data=sp.genfromtxt("data/train.tsv", delimiter ="\t", dtype="string", comments=None, skip_header=1)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 1746, in genfromtxt
    output = np.array(data, dtype)
MemoryError
I am trying to graph one column in the TSV file against another.
What have I misunderstood here? How else can I do this?
Recommended answer
You can load it using an np.memmap, which will require about 70MB:
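import numpy as np
# The shape and dtype below are the ones chosen for this particular train.tsv.
with open('train.tsv') as f:
    # mode='w+' creates (or overwrites) the backing file test.memmap
    mm = np.memmap('test.memmap', shape=(7395, 27), dtype='|S4000', mode='w+')
    next(f)  # skip the header row
    for i, l in enumerate(f):
        # drop quotes, split on tabs, and write the row straight into the memmap
        mm[i,:] = l.strip().replace('"','').split('\t')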
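As a rough sanity check, the size of the backing file that the call above creates can be worked out from the shape and the item size. This is a sketch of the arithmetic only. It describes the file on disk, which is separate from how much RAM is resident at any moment while the array is mapped.

```python
import numpy as np

# Shape and dtype copied from the np.memmap call above.
shape = (7395, 27)
itemsize = np.dtype('|S4000').itemsize  # bytes per fixed-width field

# Backing file size = rows * columns * bytes per item.
n_bytes = shape[0] * shape[1] * itemsize
print(n_bytes)  # 798660000, i.e. roughly 799 MB on disk
```

If most fields are much shorter than 4000 bytes, a narrower dtype would shrink the file proportionally, at the cost of truncating longer values.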
The file is saved when you delete mm with del mm or when you close the Python console. You may have to change the mode to r+ after the file is created.
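A minimal sketch of that save-and-reopen cycle, using a small made-up array rather than the real data. The path, shape, and dtype here are hypothetical and chosen only for illustration. Calling flush() is an explicit alternative to relying on del.

```python
import os
import tempfile

import numpy as np

# Hypothetical path and shape, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "demo.memmap")
shape = (3, 2)

# mode='w+' creates (or overwrites) the backing file.
mm = np.memmap(path, shape=shape, dtype="S8", mode="w+")
mm[0, :] = [b"1.5", b"0"]
mm.flush()  # write pending changes to disk explicitly
del mm      # drop the reference to the map

# mode='r+' opens the existing file for reading and writing
# without truncating it, so the earlier contents are still there.
mm2 = np.memmap(path, shape=shape, dtype="S8", mode="r+")
first_row = mm2[0].tolist()
print(first_row)  # [b'1.5', b'0']
```

The shape and dtype have to be passed again on reopening, because the raw memmap file stores no metadata about them.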
You can work with the memmap array as if it were a normal array, which lets you take only the parts of interest.
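For example, two columns can be sliced out, converted to floats, and filtered before plotting. The sketch below uses a tiny made-up array in place of the real file, and to_float is a hypothetical helper, not part of NumPy. Which column indices hold the fields of interest in train.tsv is not shown here and would need to be looked up in the header row.

```python
import os
import tempfile

import numpy as np

# Tiny stand-in for the real memmap; values are made up.
path = os.path.join(tempfile.mkdtemp(), "demo.memmap")
rows = [[b"10.5", b"1"], [b"?", b"0"], [b"3.25", b"1"]]
mm = np.memmap(path, shape=(3, 2), dtype="S8", mode="w+")
for i, row in enumerate(rows):
    mm[i, :] = row


def to_float(column):
    """Convert byte strings to floats; unparseable entries become NaN."""
    out = np.empty(len(column), dtype=float)
    for i, value in enumerate(column):
        try:
            out[i] = float(value.decode("ascii"))
        except ValueError:
            out[i] = np.nan
    return out


# Slicing a memmap touches only the requested columns.
x = to_float(mm[:, 0])
y = to_float(mm[:, 1])

# Build the mask once, then apply it to both arrays,
# so x and y stay aligned row by row.
keep = ~np.isnan(x) & ~np.isnan(y)
x, y = x[keep], y[keep]
print(x.tolist(), y.tolist())  # [10.5, 3.25] [1.0, 1.0]
```

The resulting x and y are ordinary float arrays and can be passed to plt.scatter as in the question.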