Plotting huge data files in R?

Question

I have an input file with about 20 million lines; it is about 1.2 GB in size. Is there any way I can plot the data in R? Some of the columns hold categories, but most of them are numeric.

I have tried my plotting script on a small subset of the input file, about 800K lines, but even though I have about 8 GB of RAM, I don't seem to be able to plot all the data. Is there a simple way to do this?

Answer

Without a clearer description of the kind of plot you want, it is hard to give concrete suggestions. In general, though, there is no need to put 20 million points in a single plot. A time series, for example, could be represented by a spline fit or by some kind of average, e.g. hourly data aggregated to daily averages. Alternatively, you could draw a subset of the data, e.g. only one point per day in the time series example. So I think your challenge is not so much getting 20M points, or even 800K, onto a plot, but how to aggregate your data effectively so that the plot conveys the message you want to tell.
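As a rough illustration of the aggregation and subsetting ideas above, here is a minimal base R sketch. The data frame `hourly` and its columns `time`, `value`, and `day` are made up for the example (simulated data standing in for the real file), so treat this as one possible approach under those assumptions rather than a recipe for the actual data.

```r
# Hypothetical data: 365 days of simulated hourly measurements.
# This stands in for the real 20-million-line file, whose layout is unknown.
set.seed(42)
n_hours <- 365 * 24
hourly <- data.frame(
  time  = as.POSIXct("2020-01-01 00:00:00", tz = "UTC") +
          3600 * (seq_len(n_hours) - 1),
  value = cumsum(rnorm(n_hours))
)

# Derive the calendar day of each observation.
hourly$day <- as.Date(hourly$time, tz = "UTC")

# Option 1: aggregate hourly data to daily averages.
daily <- aggregate(value ~ day, data = hourly, FUN = mean)

# Option 2: keep only one point per day (the first observation of each day).
first_per_day <- hourly[!duplicated(hourly$day), c("day", "value")]

# Plot 365 daily means instead of 8760 hourly points.
plot(daily$day, daily$value, type = "l",
     xlab = "Day", ylab = "Value (daily mean)")
points(first_per_day$day, first_per_day$value,
       pch = 16, cex = 0.4, col = "grey40")
```

The same pattern scales to much larger inputs: reduce the data to a few hundred or a few thousand summary values first, then plot the summary. Which summary is appropriate (mean, median, a fitted spline, a random sample) depends on what the plot is meant to show.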
