使用x轴偏移量在python(5GB)中绘制真正的大文件 [英] Plot really big file in python (5GB) with x axis offset
问题描述
我正在尝试使用python和matplotlib绘制一个非常大的文件(约5 GB).我能够将整个文件加载到内存中(计算机中可用的总大小为16 GB),但是当我使用简单的imshow对其进行绘制时,出现了分段错误.这是我设置为15000的ulimit的最大可能,但是我不能设置更高的值.我得出的结论是,我需要分批绘制数组,因此编写了一个简单的代码即可.我的主要观点是,当我绘制一批大数组时,x坐标始终从0开始,并且无法覆盖图像以创建最终的大数组.如果您有任何建议,请告诉我.另外,由于具有管理权限,我无法在此计算机上安装新的软件包,例如"Image".这是读取我的数组的前12行并绘制3个图的代码示例.
I am trying to plot a very big file (~5 GB) using python and matplotlib. I am able to load the whole file in memory (the total available in the machine is 16 GB) but when I plot it using simple imshow I get a segmentation fault. This is most probable to the ulimit which I have set to 15000 but I cannot set higher. I have come to the conclusion that I need to plot my array in batches and therefore made a simple code to do that. My main isue is that when I plot a batch of the big array the x coordinates start always from 0 and there is no way I can overlay the images to create a final big one. If you have any suggestion please let me know. Also I am not able to install new packages like "Image" on this machine due to administrative rights. Here is a sample of the code that reads the first 12 lines of my array and make 3 plots.
import os
import sys
import scipy
import numpy as np
import pylab as pl
import matplotlib as mpl
import matplotlib.cm as cm
from optparse import OptionParser
from scipy import fftpack
from scipy.fftpack import *
from cmath import *
from pylab import *
import pp
import fileinput
import matplotlib.pylab as plt
import pickle
def readalllines(file1,rows,freqs):
file = open(file1,'r')
sizer = int(rows*freqs)
i = 0
q = np.zeros(sizer,'float')
for i in range(rows*freqs):
s =file.readline()
s = s.split()
#print s[4],q[i]
q[i] = float(s[4])
if i%262144 == 0:
print '\r ',int(i*100.0/(337*262144)),' percent complete',
i += 1
file.close()
return q
parser = OptionParser()
parser.add_option('-f',dest="filename",help="Read dynamic spectrum from FILE",metavar="FILE")
parser.add_option('-t',dest="dtime",help="The time integration used in seconds, default 10",default=10)
parser.add_option('-n',dest="dfreq",help="The bandwidth of each frequency channel in Hz",default=11.92092896)
parser.add_option('-w',dest="reduce",help="The chuncker divider in frequency channels, integer default 16",default=16)
(opts,args) = parser.parse_args()
rows=12
freqs = 262144
file1 = opts.filename
s = readalllines(file1,rows,freqs)
s = np.reshape(s,(rows,freqs))
s = s.T
print s.shape
#raw_input()
#s_shift = scipy.fftpack.fftshift(s)
#fig = plt.figure()
#fig.patch.set_alpha(0.0)
#axes = plt.axes()
#axes.patch.set_alpha(0.0)
###plt.ylim(0,8)
plt.ion()
i = 0
for o in range(0,rows,4):
fig = plt.figure()
#plt.clf()
plt.imshow(s[:,o:o+4],interpolation='nearest',aspect='auto', cmap=cm.gray_r, origin='lower')
if o == 0:
axis([0,rows,0,freqs])
fdf, fdff = xticks()
print fdf
xticks(fdf+o)
print xticks()
#axis([o,o+4,0,freqs])
plt.draw()
#w, h = fig.canvas.get_width_height()
#buf = np.fromstring(fig.canvas.tostring_argb(), dtype=np.uint8)
#buf.shape = (w,h,4)
#buf = np.rol(buf, 3, axis=2)
#w,h,_ = buf.shape
#img = Image.fromstring("RGBA", (w,h),buf.tostring())
#if prev:
# prev.paste(img)
# del prev
#prev = img
i += 1
pl.colorbar()
pl.show()
推荐答案
我认为您只是缺少了plt.imshow
中的extent=(left, right, bottom, top)
关键字参数.
I think you're just missing the extent=(left, right, bottom, top)
keyword argument in plt.imshow
.
x = np.random.randn(2, 10)
y = np.ones((4, 10))
x[0] = 0 # To make it clear which side is up, etc
y[0] = -1
plt.imshow(x, extent=(0, 10, 0, 2))
plt.imshow(y, extent=(0, 10, 2, 6))
# This is necessary, else the plot gets scaled and only shows the last array
plt.ylim(0, 6)
plt.colorbar()
plt.show()
这篇关于使用x轴偏移量在python(5GB)中绘制真正的大文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!