Data exploration in R: display heatmap of large matrix, quickly?
Question
How can I quickly visualize large matrices in R?
I sometimes work with large-ish numeric matrices (e.g. 3000 x 3000), and quickly visualizing them is a very helpful quality control step. This was very easy and fast in Matlab, my previous language of choice. For example, it takes 0.5 seconds to display a 1000x1000 matrix:
rand_matrix = rand(1000,1000);
tic
imagesc(rand_matrix)
toc
>> Elapsed time is 0.463903 seconds.
I'd like the same capability in R, but unfortunately visualizing matrices seems very slow there. For example, using image.plot(), the same random matrix takes more than 10 seconds to display:
require(tictoc)
require(fields)  # image.plot() is provided by the fields package
mm = 1000
nn = 1000
rand.matrix = matrix(runif(mm*nn), ncol=mm, nrow=nn)
tic("Visualizing matrix")
image.plot(rand.matrix)
toc()
> Visualizing matrix: 11.744 sec elapsed
The problem gets worse as the matrices get bigger. For example, a 3000x3000 matrix takes minutes to visualize in R, compared to seconds in Matlab. This obviously doesn't really work for data exploration. I've tried ggplot, and melting + geom_raster() can still take up to a minute.
What am I doing wrong? Is there a fast way to visualize matrices in R? An ideal solution would take one or two lines.
Recommended answer
I get a plot pretty quickly when using image(m, useRaster = TRUE):
start = Sys.time()
image(rand.matrix, useRaster = TRUE)
print(Sys.time() - start)
# Time difference of 0.326 secs
Without useRaster = TRUE this takes 1.5 seconds. useRaster speeds this up, but I think it only works when the points lie on a simple, evenly spaced grid.
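The evenly spaced grid restriction can be checked before plotting. The sketch below is only an illustration: is_evenly_spaced() and quick_image() are hypothetical helper names, not part of base R, and the tolerance is an arbitrary assumption. It relies on the documented behaviour that image() raises an error when useRaster = TRUE is combined with an irregular grid, so the wrapper falls back to the slower polygon drawing in that case.

```r
# Hypothetical helper (not part of base R): TRUE when the coordinates in v
# are evenly spaced, up to a relative tolerance chosen for this sketch.
is_evenly_spaced <- function(v, tol = 1e-8) {
  steps <- diff(v)
  if (length(steps) < 2) return(TRUE)
  all(abs(steps - steps[1]) <= tol * abs(steps[1]))
}

# Hypothetical wrapper: request raster drawing only when both axes are regular,
# otherwise let image() draw polygons as usual.
quick_image <- function(x, y, z, ...) {
  regular <- is_evenly_spaced(x) && is_evenly_spaced(y)
  image(x, y, z, useRaster = regular, ...)
  invisible(regular)  # report which drawing path was used
}

# Small example data: one regular and one irregular x axis.
z <- matrix(runif(20 * 30), nrow = 20, ncol = 30)
y <- seq_len(30)
x_even <- seq(0, 1, length.out = 20)
x_uneven <- cumsum(1:20)  # increasing, but with growing gaps
```

Calling quick_image(x_even, y, z) should take the raster path, while quick_image(x_uneven, y, z) should fall back to polygons. For a plain matrix passed as image(m, useRaster = TRUE), the default coordinates are already evenly spaced, so no check is needed.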
If your ultimate goal is to produce an image file with this plot, then I think it might be most efficient to output directly to a raster format like png, although it's a little tricky to measure exactly how long R is taking to save the image file, e.g.:
png("image_plot.png", width = 1000, height = 1000)
image(rand.matrix, useRaster = TRUE)
dev.off()
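One rough way to time the file output is to wrap the whole png() ... dev.off() sequence in system.time(), so that closing the device (when the file is completely written) is included in the measurement. This is a minimal sketch, not a rigorous benchmark: time_png_heatmap() is a hypothetical helper name, and the temporary file path and image size are assumptions made for the example.

```r
# Hypothetical helper: time one full png() -> image() -> dev.off() cycle
# and return the elapsed wall-clock time in seconds.
time_png_heatmap <- function(m, use_raster = TRUE, size = 1000) {
  out_file <- tempfile(fileext = ".png")  # temporary path, assumed for this sketch
  on.exit(unlink(out_file))               # remove the file afterwards
  timing <- system.time({
    png(out_file, width = size, height = size)
    image(m, useRaster = use_raster)
    dev.off()  # the file is only completely written once the device is closed
  })
  timing[["elapsed"]]
}

rand.matrix <- matrix(runif(1000 * 1000), nrow = 1000, ncol = 1000)
raster_secs  <- time_png_heatmap(rand.matrix, use_raster = TRUE)
polygon_secs <- time_png_heatmap(rand.matrix, use_raster = FALSE)
cat("useRaster = TRUE: ", raster_secs,  "sec\n")
cat("useRaster = FALSE:", polygon_secs, "sec\n")
```

The absolute numbers depend on the machine and on the png backend R was built with, so they are best read as a relative comparison between the two settings.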