Should I use a data.frame or a matrix?
When should one use a data.frame, and when is it better to use a matrix?
Both keep data in a rectangular format, so sometimes it's unclear.
Are there any general rules of thumb for when to use which data type?
Part of the answer is contained already in your question: You use data frames if columns (variables) can be expected to be of different types (numeric/character/logical etc.). Matrices are for data of the same type.
Consequently, the choice between a matrix and a data frame is only a real question when all of your data is of the same type.
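To make the type point concrete, here is a minimal sketch in base R. The column names and values are made up for illustration only. It shows that a data frame keeps one type per column, while the same values pushed into a matrix are coerced to a single common type.

```r
# Hypothetical mixed-type data: a data frame keeps each column's own type.
df <- data.frame(id   = 1:3,
                 name = c("a", "b", "c"),
                 flag = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)
col_classes <- sapply(df, class)
# id is "integer", name is "character", flag is "logical"

# The same values bound into a matrix are coerced to one common type.
m <- cbind(id = 1:3, name = c("a", "b", "c"), flag = c(TRUE, FALSE, TRUE))
m_type <- typeof(m)
# m_type is "character": the numbers and logicals have become strings
```

So with mixed types the matrix is usually not an option at all; the question only opens up when every column would have the same type anyway.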
The answer depends on what you are going to do with the data in the data.frame/matrix. If it is going to be passed to other functions, then the argument types those functions expect determine the choice.
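If different functions expect different types, you can convert at the call site. The following is only a sketch with made-up numeric data; crossprod() and lm() are picked as examples of a matrix-oriented and a data-frame-oriented function, not because the question mentions them.

```r
# Hypothetical all-numeric data, chosen only for illustration.
df <- data.frame(x = c(1, 2, 3), y = c(2, 4, 7))

# A function that works on matrices: convert the data frame first.
m <- as.matrix(df)
xtx <- crossprod(m)   # same as t(m) %*% m, a 2 x 2 matrix

# A function that takes a data frame (lm() via its data argument):
# convert back with as.data.frame(); the column names x and y survive.
fit <- lm(y ~ x, data = as.data.frame(m))
```

Each conversion makes a copy, so if most of the work happens on one side it is probably cheaper to store the data in that form from the start.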
Also:
Matrices are more memory efficient (the exact byte counts below depend on the R version and platform):
m = matrix(1:4, 2, 2)
d = as.data.frame(m)
object.size(m)
# 216 bytes
object.size(d)
# 792 bytes
Matrices are a necessity if you plan to do any linear-algebra-type operations.
Data frames are more convenient if you frequently refer to their columns by name (via the compact $ operator).
Data frames are also IMHO better for reporting (printing) tabular information as you can apply formatting to each column separately.
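The last three points can be sketched in a few lines of base R. All values and column names here (price, share) are hypothetical and only serve the illustration.

```r
# Linear algebra works directly on a matrix.
m <- matrix(c(2, 1, 1, 3), nrow = 2)   # filled column by column
b <- c(1, 2)
m2 <- m %*% m                          # matrix product
x  <- solve(m, b)                      # solves m %*% x = b

# Column access by name: compact with a data frame ...
d <- data.frame(price = c(2.5, 10), share = c(0.25, 0.5))
p_df <- d$price

# ... while a matrix needs the bracket form ($ is invalid on a matrix).
dm <- as.matrix(d)
p_m <- dm[, "price"]

# Per-column formatting for a report: each column gets its own format.
report <- data.frame(price = sprintf("%.2f", d$price),
                     share = paste0(d$share * 100, "%"),
                     stringsAsFactors = FALSE)
print(report)
```

Note that the formatted report holds character columns, which is exactly what a matrix could not mix with the numeric originals.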