Sed: Decreasing speed of data processing

Question

I have large files (10-20 GB) which I preprocess with Sed before I plot the data using Gnuplot. The plots are saved as .png images. The data file consists of images matrices, each of size matrix_size x matrix_size, stacked one after another. The data file for three (images=3) matrices of size matrix_size=2 looks like:

 1 2
 3 2
 1 5
 3 4
 5 2
 2 3

I use Sed to extract each matrix from the data file. At the beginning this is really fast and my script produces one image per second. But after a while the time increases to as much as 25 seconds per image. Why is this the case? Here is my code:

unset border
unset key
unset xtics
unset ytics
unset ztics
unset colorbox

set autoscale fix
set size ratio -1

file = 'data'
matrix_size = 1000
images = 1000

sizeX = matrix_size
sizeY = matrix_size
set xrange [1:matrix_size]
set yrange [1:matrix_size]
set terminal png size sizeX, sizeY

# Pipe rows (i-1)*n+1 .. i*n of fileName through sed; sed quits at row i*n+1
getMatrix(fileName, n, i) = sprintf("<sed -n '%d,%dp;%dq' '%s'", (i-1)*n + 1, i*n, i*n+1, fileName)

do for [i=1:images]{
    t0 = strftime('%s', time(0))    
    set output sprintf('%05d_%s.png', i, file)
    plot getMatrix(file, matrix_size, i) matrix with image
    t1 = strftime('%s', time(0))
    print(sprintf('%d %d', t1-t0, i))
}

The time each image takes to plot, measured in seconds, is very short at the beginning and then gets longer and longer.

Recommended answer

I would suggest you use split to extract all your matrices to individual files, up front, in a single pass:

split -a 4 -d -l matrix_size data matrix-

That will put each matrix in a separate file called matrix-0000, matrix-0001, and so on, if I understood your file format. Here matrix_size stands for the number of rows per matrix (1000 in your script), so substitute the actual number on the command line.
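A likely reason for the slowdown is that every sed call starts reading at line 1 and only quits at line i*n+1, so the work per image grows with i. With matrix_size = 1000 and images = 1000, the sed calls read about 500 million lines in total, while the file itself has only 1 million. The sketch below tries the split approach on the small three-matrix example from the question. The scratch directory and the literal sample data are assumptions made for illustration, and -d (numeric suffixes) is a GNU coreutils option.

```shell
#!/bin/sh
# Sketch: split the 3-matrix example from the question into one file per matrix.
set -eu

workdir=$(mktemp -d)   # scratch directory (assumption: any writable dir works)
cd "$workdir"

matrix_size=2          # rows per matrix in the small example

# Recreate the example data file: 3 matrices of 2 rows each.
printf '%s\n' '1 2' '3 2' '1 5' '3 4' '5 2' '2 3' > data

# A single pass over the file; each chunk of matrix_size lines becomes a file.
split -a 4 -d -l "$matrix_size" data matrix-

ls matrix-*            # one file per matrix, suffixes start at 0000
cat matrix-0001        # the second matrix
```

In the Gnuplot loop, the getMatrix call could then presumably be replaced by a plain file name such as sprintf('matrix-%04d', i-1), because the suffixes start at 0000. That replacement is an assumption and not part of the original answer.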
