MemoryError while plotting large oscilloscope files on Ubuntu


Problem description


I am trying to read large oscilloscope .trc files and plot them. Plotting a single file works, but as soon as I put the script into a loop to plot all files (one file per iteration), I get a MemoryError.

Code:

import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import readTrc #external file, same location as script

foldername = 'trc_folder'
folder = os.listdir(foldername)
path = os.path.dirname(os.path.realpath(__file__))

for filenumber, i in enumerate(folder):
    trc = path + '/' + foldername + '/' + i

    print('reading trc file ' + str(filenumber))

    datX, datY, m = readTrc.readTrc(trc)
    srx, sry = pd.Series(datX * 1000), pd.Series(datY * 1000)
    df_oszi = pd.concat([srx, sry], axis = 1)
    df_oszi.set_index(0, inplace = True)    

    #ERROR APPEARS with xticks argument
    #removing xticks does not help, because then errorpath changes to
    #/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py
    df_oszi.plot(grid = 1,
                 color = 'blue',
                 linewidth = 0.5,
                 figsize = (9,5),
                 legend = False,
                 xticks = np.arange(df_oszi.index[0], df_oszi.index[-1], 1))

    print('plotting file ' + str(filenumber))
    plt.savefig('Plot_' + str(filenumber) + '.png', dpi = 300)

The problem seems to be with the external module readTrc. It took me quite a while to figure this out because Python was raising errors inside Matplotlib and pandas rather than in readTrc, which seems to be an unofficial script for reading .trc files. I found it online while looking for a way to read .trc files in Python. If you know a better way to read oscilloscope files, please let me know.

I zipped everything you need to run the script into this folder: folder

(It is quite large at 582 MB, because every .trc file is about 200 MB.) Inside you will find the script, a folder with .trc files, and the external Python file (module) readTrc, which is required for reading .trc files. Executing the script should plot the first file but raise a MemoryError when plotting/constructing the second, at least on my Ubuntu machine. What confuses me is that I only get this MemoryError on Ubuntu (18.04), not on Windows 10.

I would appreciate any help so that I can continue with my project. Please let me know if you need additional information.

Edit:

Single Download for readTrc.py

Single Download for Script.py

print(type(datX)) returns:

<class 'numpy.ndarray'>

Printing datX returns an array with 50 million values:

[-0.005 -0.005 -0.005 ...  0.005  0.005  0.005]

These are rounded by the print() function; the actual values are:

-0.004999999906663635
-0.004999999806663634
-0.004999999706663633
-0.004999999606663631
-0.00499999950666363

Edit 2:

To run the code with the new version of readTrc, make these changes:

import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import readTrc

foldername = 'trc_folder'
folder = os.listdir(foldername)
path = os.path.dirname(os.path.realpath(__file__))

for filenumber, i in enumerate(folder):
    trc = path + '/' + foldername + '/' + i

    print('reading trc file ' + str(filenumber))

    datX, datY, d = readTrc.Trc().open(trc)
    srx, sry = pd.Series(datX * 1000), pd.Series(datY * 1000)
    df_oszi = pd.concat([srx, sry], axis = 1)
    df_oszi.set_index(0, inplace = True)    

    df_oszi.plot(grid = 1,
                 color = 'blue',
                 linewidth = 0.5,
                 figsize = (9,5),
                 legend = False,
                 xticks = np.arange(df_oszi.index[0], df_oszi.index[-1], 1))

    print('plotting file ' + str(filenumber))
    plt.savefig('Plot_' + str(filenumber) + '.png', dpi = 300)

MemoryError:

Traceback (most recent call last):
  File "/home/artur/Desktop/zip_original/Script.py", line 27, in <module>
    xticks = np.arange(df_oszi.index[0], df_oszi.index[-1], 1))
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 2941, in __call__
    sort_columns=sort_columns, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 1977, in plot_frame
    **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 1804, in _plot
    plot_obj.generate()
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 260, in generate
    self._make_plot()
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 985, in _make_plot
    **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 1001, in _plot
    lines = MPLPlot._plot(ax, x, y_values, style=style, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py", line 615, in _plot
    return ax.plot(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/__init__.py", line 1805, in inner
    return func(ax, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_axes.py", line 1604, in plot
    self.add_line(line)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_base.py", line 1891, in add_line
    self._update_line_limits(line)
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_base.py", line 1913, in _update_line_limits
    path = line.get_path()
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/lines.py", line 945, in get_path
    self.recache()
  File "/usr/local/lib/python3.6/dist-packages/matplotlib/lines.py", line 649, in recache
    self._xy = np.column_stack(np.broadcast_arrays(x, y)).astype(float)
MemoryError
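
The last frame of the traceback shows Matplotlib building one combined (N, 2) float array from x and y. As a rough back-of-the-envelope estimate, assuming float64 data and the 50 million samples per trace reported above, that array alone is about 800 MB. Both np.column_stack and .astype(float) return a new array, so two of them can briefly exist at the same time, on top of whatever the script itself is still holding:

```python
import numpy as np

n_samples = 50_000_000                      # values per trace, as reported in the question
bytes_per_value = np.dtype(float).itemsize  # 8 bytes for float64

# One (N, 2) array holding x and y side by side
xy_bytes = n_samples * 2 * bytes_per_value
print(f"one (N, 2) float64 array: {xy_bytes / 1e6:.0f} MB")

# column_stack allocates one such array and astype(float) copies it again,
# so the transient peak for this single line can be about twice that
peak_bytes = 2 * xy_bytes
print(f"transient peak: {peak_bytes / 1e6:.0f} MB")
```

This only estimates the line that failed; it says nothing about how much memory earlier iterations were still occupying.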

Edit 3:

Sampling the dataset seems to reduce the value of the data. These are examples of the same dataset with sampling = 1, sampling = 10, and sampling = 100:

srx, sry = pd.Series(datX[::sampling] * 1000), pd.Series(datY[::sampling] * 1000)

The reason for this is the extremely short pulse duration of ultra-high-frequency (UHF) signals. Each pulse may consist of only a few data values. Reducing the number of values taken into account therefore loses a large part of the information. Although this workaround makes the code run, it also significantly reduces the value of the data.
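
For readers who hit the same trade-off: plain striding (datY[::sampling]) drops any pulse that falls between the kept samples. One common alternative, not something used in the question, is to keep the minimum and maximum of every bucket of samples so that short pulses still show up in the plot. The sketch below uses a hypothetical helper name, minmax_decimate, and synthetic data in place of a real trace:

```python
import numpy as np


def minmax_decimate(x, y, bucket):
    """Keep the min and max of y (with matching x) in each bucket of `bucket` samples."""
    n = (len(y) // bucket) * bucket        # ignore the incomplete tail bucket
    xb = x[:n].reshape(-1, bucket)
    yb = y[:n].reshape(-1, bucket)
    i_min = yb.argmin(axis=1)
    i_max = yb.argmax(axis=1)
    # order the two picks inside each bucket so x stays non-decreasing
    first = np.minimum(i_min, i_max)
    second = np.maximum(i_min, i_max)
    cols = np.stack([first, second], axis=1).ravel()
    rows = np.repeat(np.arange(yb.shape[0]), 2)
    return xb[rows, cols], yb[rows, cols]


# Synthetic trace: flat signal with a single one-sample pulse
x = np.linspace(0.0, 1.0, 1_000_000)
y = np.zeros_like(x)
y[123_457] = 1.0

strided_max = y[::100].max()               # plain striding misses the pulse here
xd, yd = minmax_decimate(x, y, bucket=100)
print(strided_max, yd.max(), len(yd))
```

The result has two points per bucket, so bucket=100 reduces the data by a factor of 50 rather than 100. It preserves the envelope of the signal for plotting, not the exact waveform.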

Solution

It took quite some time, but I managed to get the MemoryError under control. I had to put not only gc.collect() at the end of each loop iteration but also plt.close(). Only then did the errors stop. Sorry for the confusion. I learned a lot from this.
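
A minimal sketch of that loop structure, under a few assumptions: synthetic arrays stand in for the data returned by readTrc, the Agg backend is selected so it runs without a display, and plot_and_release is a hypothetical helper name. pyplot keeps every figure it creates alive until it is closed, so without plt.close() each iteration's figure (and the data attached to it) stays referenced:

```python
import gc
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def plot_and_release(x, y, out_path):
    """Plot one trace, save it, then free the figure before the next iteration."""
    fig, ax = plt.subplots(figsize=(9, 5))
    ax.plot(x, y, color="blue", linewidth=0.5)
    ax.grid(True)
    fig.savefig(out_path, dpi=100)
    plt.close(fig)   # drop pyplot's reference to this figure
    gc.collect()     # collect anything left in reference cycles right away


out_dir = tempfile.mkdtemp()
for filenumber in range(3):
    # synthetic stand-in for the arrays read from one .trc file
    x = np.linspace(-5.0, 5.0, 10_000)
    y = np.sin(x * (filenumber + 1))
    plot_and_release(x, y, os.path.join(out_dir, f"Plot_{filenumber}.png"))

open_figures = plt.get_fignums()  # figures pyplot is still tracking
print(open_figures)
```

Closing the specific figure with plt.close(fig) rather than relying on the "current figure" keeps the cleanup explicit if the loop ever creates more than one figure per file.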
