Matplotlib plot excessively slow


Question

I'm trying to plot 20 million data points, but it is taking an extremely long time (over an hour) using matplotlib.

Is there something in my code that is making this unusually slow?

import csv
import matplotlib.pyplot as plt
import numpy as np
import Tkinter
from Tkinter import *
import tkSimpleDialog
from tkFileDialog import askopenfilename

plt.clf()

root = Tk()
root.withdraw() 
listofparts = askopenfilename()                  # asks user to select file
root.destroy()

my_list1 = []
my_list2 = []
k = 0

csv_file = open(listofparts, 'rb')

for line in open(listofparts, 'rb'):
    current_part1 = line.split(',')[0]
    current_part2 = line.split(',')[1]
    k = k + 1
    if k >= 2:                                   # skips the first line
        my_list1.append(current_part1)
        my_list2.append(current_part2)

csv_file.close()

plt.plot(my_list1 * 10, 'r')
plt.plot(my_list2 * 10, 'g')

plt.show()
plt.close()

Recommended answer

There is no reason whatsoever to have a line plot of 20000000 points in matplotlib.

Let's consider printing first. Take a figure 50 inches across, which is already very large. Even a high-end plotter with 3600 dpi could resolve at most 50*3600 = 180000 points across that width.

For screen applications it's even less: a 4k screen is only about 4000 pixels wide. Even allowing for antialiasing, at most about 3 points per pixel would still be distinguishable to the human eye. Result: at most about 12000 points make sense.

So the question you should be asking is rather: how do I subsample my 20000000 data points down to a set of points that still produces the same image on paper or on screen?

The solution to this depends strongly on the nature of the data. If it is sufficiently smooth, you can just take every nth list entry.

sample = data[::n]
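
As a minimal sketch of that slicing approach, the step n can be derived from the data length and a target point count. This assumes Python 3, that the data has already been converted to numbers (a NumPy array here), and that the roughly 12000 points estimated above is the target. The names `subsample_stride` and `max_points` are hypothetical, chosen only for illustration.

```python
import numpy as np

def subsample_stride(data, max_points=12000):
    """Keep every nth entry so that at most max_points entries remain."""
    data = np.asarray(data)
    # Ceiling division: the smallest stride n that yields <= max_points samples.
    n = max(1, -(-len(data) // max_points))
    return data[::n]

# Synthetic smooth signal standing in for the real data (an assumption).
data = np.sin(np.linspace(0, 20, 2_000_000))
sample = subsample_stride(data)
```

The reduced `sample` can then be passed to `plt.plot` in place of the full list. This only preserves the picture when the data varies slowly compared with the stride.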

If there are high-frequency components that need to be resolved, more sophisticated techniques are required, and the right choice again depends on what the data looks like.

One such technique might be the one shown in "How can I subsample an array according to its density? (Remove frequent values, keep rare ones)".
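
A different and simpler option, which is not the approach from the linked question, is min/max decimation: split the data into consecutive chunks and keep only the smallest and largest value of each chunk, so that narrow spikes survive the reduction. The sketch below assumes Python 3 and numeric data in a NumPy array. The function name `minmax_decimate` and the default of 4000 chunks (about one per pixel column on a 4k screen) are assumptions made for illustration.

```python
import numpy as np

def minmax_decimate(y, n_bins=4000):
    """Reduce y to the min and max of each of n_bins consecutive chunks.

    Returns (indices, values) so the original x positions are preserved.
    """
    y = np.asarray(y, dtype=float)
    chunk = len(y) // n_bins
    if chunk < 2:
        # Already short enough; nothing to reduce.
        return np.arange(len(y)), y
    usable = chunk * n_bins  # a short tail that does not fill a chunk is dropped
    blocks = y[:usable].reshape(n_bins, chunk)
    # Position of each extreme inside its chunk.
    lo = blocks.argmin(axis=1)
    hi = blocks.argmax(axis=1)
    offsets = np.arange(n_bins) * chunk
    # Merge both sets of indices and restore left-to-right order.
    idx = np.sort(np.concatenate([offsets + lo, offsets + hi]))
    return idx, y[idx]

# Synthetic noisy data with one isolated spike (an assumption for the demo).
rng = np.random.default_rng(0)
y = rng.normal(size=2_000_000)
y[1_234_567] = 50.0
x_small, y_small = minmax_decimate(y)
```

Plotting `x_small` against `y_small` draws two points per chunk, and the isolated spike is kept because it is the maximum of its chunk. Plain every-nth slicing would most likely skip it.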
