Speed up plot() function for large dataset
Question
I am using plot() for over 1 million data points and it turns out to be very slow.
Is there any way to improve the speed, including programming and hardware solutions (more RAM, a graphics card...)?
Where is the data for a plot stored?
(This question is closely related to Scatterplot with too many points, although that question focuses on the difficulty of seeing anything in the big scatterplot rather than on performance issues ...)
Answer
A hexbin plot actually shows you something (unlike the scatterplot @Roland proposes in the comments, which is likely to be just a giant, slow blob). For your example it took about 3.5 seconds on my machine (about 0.5 seconds on a modern laptop, per the timing in the code):
set.seed(101)
a <- rnorm(1e7, 1, 1)
b <- rnorm(1e7, 1, 1)
library(hexbin)
system.time(plot(hexbin(a, b))) ## 0.5 seconds, modern laptop
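To see why binning helps, here is a minimal base-R sketch of the same idea using rectangular rather than hexagonal bins. It is not how hexbin is implemented: the helper name `bin2d`, the 100 x 100 grid, and the smaller sample size are assumptions made for illustration. The point is that the plotting call only has to draw a grid of counts, not every observation.

```r
# Hypothetical helper (not part of any package): count points in an
# nx-by-ny rectangular grid spanning the range of the data.
bin2d <- function(x, y, nx = 100, ny = 100) {
  xb <- seq(min(x), max(x), length.out = nx + 1)  # cell boundaries in x
  yb <- seq(min(y), max(y), length.out = ny + 1)  # cell boundaries in y
  # Bin index of each point; all.inside = TRUE forces indices into 1..nx
  # (or 1..ny), so the minimum and maximum land in the edge cells.
  ix <- findInterval(x, xb, all.inside = TRUE)
  iy <- findInterval(y, yb, all.inside = TRUE)
  # Flatten (ix, iy) into one cell index (column-major) and count.
  counts <- matrix(tabulate((iy - 1L) * nx + ix, nbins = nx * ny),
                   nrow = nx, ncol = ny)
  list(x = xb, y = yb, counts = counts)
}

set.seed(101)
a <- rnorm(1e6, 1, 1)  # smaller sample than above, assumed for a quick run
b <- rnorm(1e6, 1, 1)
binned <- bin2d(a, b)

# image() accepts cell boundaries (one more than the matrix dimension).
# log1p() compresses the count range so sparse cells remain visible.
image(binned$x, binned$y, log1p(binned$counts), xlab = "a", ylab = "b")
```

However many points go in, the drawing step here handles 10,000 cells, which is why binned displays stay fast as the data grows.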
Another, slightly slower alternative is the base-R smoothScatter function: it plots a smooth density plus as many extreme points as requested (1000 in this case).
system.time(smoothScatter(a, b, cex = 4, nrpoints = 1000)) ## 3.3 seconds
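As a hedged sketch, the same kind of call can be written with the argument names spelled out in full (`nrpoints` is the argument that controls how many extreme points are drawn individually). The smaller sample size and the temporary PDF output are assumptions for illustration, chosen so the example runs quickly and without a display; they are not part of the timing above. Note that `smoothScatter` relies on the recommended KernSmooth package, which ships with standard R installations.

```r
set.seed(101)
x <- rnorm(1e5, 1, 1)  # smaller sample, assumed for a quick run
y <- rnorm(1e5, 1, 1)

out_file <- tempfile(fileext = ".pdf")  # illustrative output location
pdf(out_file)                           # render off-screen
smoothScatter(x, y,
              nbin = 128,       # grid resolution for the density estimate (the default)
              nrpoints = 1000,  # number of low-density points drawn individually
              cex = 4)          # enlarge those points, as in the call above
dev.off()
```

Rendering to a file device also makes it easier to compare timings, because the cost of refreshing an on-screen window is taken out of the measurement.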