Is the wide or long data format more efficient?

Question

I am curious whether it is more efficient to store data in long or wide format, regardless of how the data is interpreted. I used object.size() to measure the size in memory, but the two do not differ significantly (the long format is slightly smaller), and the value is only an estimate.

Beyond raw size, I am also wondering which format is more efficient to manipulate when used in modelling.

Recommended answer

The memory usage of the two different matrices should be identical:

> object.size(long <- matrix(seq(10000), nrow = 1000))
40200 bytes
> object.size(square <- matrix(seq(10000), nrow = 100))
40200 bytes

Any difference in efficiency will be dwarfed by the overhead of using R itself, so it hardly needs to be considered, if it is even measurable.

The situation is very different for a data.frame, since it is implemented as a list of vectors, one per column:

> object.size(as.data.frame(long))
41704 bytes
> object.size(as.data.frame(square))
50968 bytes
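
The gap between the two data frames can be explored with a short sketch like the one below. It rebuilds the same `long` and `square` objects and estimates how much memory each additional column costs. The names `df_long`, `df_square`, `size_diff` and `extra_cols` are made up for this illustration, and the exact byte counts depend on the R version and platform, so none are quoted.

```r
# Same 10,000 integers, arranged as 10 columns versus 100 columns.
long   <- matrix(seq(10000), nrow = 1000)  # 1000 rows x 10 columns
square <- matrix(seq(10000), nrow = 100)   # 100 rows x 100 columns

# A matrix is a single vector plus a dim attribute, so its shape
# does not change how much memory it needs.
stopifnot(object.size(long) == object.size(square))

# A data frame holds one separate vector per column, and every vector
# carries its own header, plus there is one column name per column.
df_long   <- as.data.frame(long)
df_square <- as.data.frame(square)

size_diff  <- as.numeric(object.size(df_square)) - as.numeric(object.size(df_long))
extra_cols <- ncol(df_square) - ncol(df_long)

# Rough per-column overhead (an estimate, like object.size() itself).
cat("approx. extra bytes per additional column:", size_diff / extra_cols, "\n")
```

With the same number of cells, the data frame with more columns pays this per-column cost more often, which is why the wider one comes out larger.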

The time efficiency will depend on exactly what you want to do with the data.
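
As one way to check this for a particular task, the sketch below computes a per-subject mean from the same values stored in both layouts and times each approach. Everything in it is an assumption made for illustration: the simulated data, the column names `subject`, `time` and `value`, and the choice of `rowMeans()` versus `tapply()` as the operation. Timings vary by machine and R version, so no figures are given.

```r
set.seed(1)
n_subjects <- 1000
n_times    <- 10

# Wide: one row per subject, one column per time point (simulated data).
wide <- matrix(rnorm(n_subjects * n_times), nrow = n_subjects)

# Long: one row per (subject, time) observation.
# as.vector() reads the matrix column by column, so subject cycles fastest.
long_df <- data.frame(
  subject = rep(seq_len(n_subjects), times = n_times),
  time    = rep(seq_len(n_times), each = n_subjects),
  value   = as.vector(wide)
)

# Per-subject mean in each layout.
mean_wide <- rowMeans(wide)
mean_long <- tapply(long_df$value, long_df$subject, mean)

# Both layouts give the same answer; only the cost of getting there differs.
stopifnot(isTRUE(all.equal(mean_wide, as.vector(mean_long))))

# Repeat each computation to get a measurable elapsed time.
print(system.time(for (i in 1:100) rowMeans(wide)))
print(system.time(for (i in 1:100) tapply(long_df$value, long_df$subject, mean)))
```

Other operations may favour the other layout (many modelling functions expect one row per observation), so it is worth timing the specific operation you care about rather than relying on a general rule.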
