Creating (and Accessing) a Sparse Matrix with NA default entries


Problem description

After learning about the options for working with sparse matrices in R, I want to use the Matrix package to create a sparse matrix from the following data frame and have all other elements be NA.

     s    r d
1 1089 3772 1
2 1109  190 1
3 1109 2460 1
4 1109 3071 2
5 1109 3618 1
6 1109   38 7

I know I can create a sparse matrix with the following, accessing elements as usual:

> library(Matrix)
> Y <- with(X, sparseMatrix(s, r, x = d))
> Y[1089,3772]
[1] 1
> Y[1,1]
[1] 0

However, to make the default value NA, I tried the following:

  M <- Matrix(NA, max(X$s), max(X$r), sparse = TRUE)
  for (i in seq_len(nrow(X)))
    M[X$s[i], X$r[i]] <- X$d[i]

and received this error:

Error in checkSlotAssignment(object, name, value) : 
  assignment of an object of class "numeric" is not valid for slot "x" in an object of class "lgCMatrix"; is(value, "logical") is not TRUE

Not only that, but accessing elements of M takes much longer than accessing elements of Y:

> system.time(Y[3,3])
   user  system elapsed 
  0.000   0.000   0.003 
> system.time(M[3,3])
   user  system elapsed 
  0.660   0.032   0.995 

How should I be creating this matrix? Why is one matrix so much slower to work with?

Here's the code snippet for the above data:

X <- structure(list(s = c(1089, 1109, 1109, 1109, 1109, 1109), r = c(3772, 
190, 2460, 3071, 3618, 38), d = c(1, 1, 1, 2, 1, 7)), .Names = c("s", 
"r", "d"), row.names = c(NA, 6L), class = "data.frame")

Recommended answer

Yes, Thierry's answer is definitely true, as I can say as co-author of the 'Matrix' package...

To your other question: why is accessing "M" slower than "Y"? The main answer is that "Y" is much, much sparser than "M" and hence much smaller. A sparse matrix stores only its non-zero entries, and NA is not zero, so every NA in "M" has to be stored explicitly. Depending on the sizes involved and the RAM of your platform, access time is faster for much smaller objects, notably for indexing into them.
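The size difference can be checked directly. The following is a minimal sketch, not code from the answer: it rebuilds both matrices from the question's data and compares how many entries each one actually stores. The last part shows one possible workaround, under the assumption that you only need NA at lookup time: keep the zero-default matrix plus a sparse pattern matrix marking which cells were supplied. The names `observed` and `get_or_na` are hypothetical helpers introduced here for illustration.

```r
library(Matrix)

X <- data.frame(
  s = c(1089, 1109, 1109, 1109, 1109, 1109),
  r = c(3772, 190, 2460, 3071, 3618, 38),
  d = c(1, 1, 1, 2, 1, 7)
)

# Y stores only the six supplied entries; every other cell is an implicit 0.
Y <- sparseMatrix(i = X$s, j = X$r, x = X$d)

# M is "sparse" in class only: NA is not zero, so each NA is stored explicitly.
M <- Matrix(NA, max(X$s), max(X$r), sparse = TRUE)

length(Y@x)      # stored entries in Y (the six values from X)
length(M@x)      # stored entries in M (one per cell)
object.size(Y)
object.size(M)

# Hypothetical workaround: a pattern matrix records which cells were supplied.
observed <- sparseMatrix(i = X$s, j = X$r, dims = dim(Y))

# Hypothetical helper: return the stored value, or NA for a cell never supplied.
get_or_na <- function(i, j) {
  if (observed[i, j]) Y[i, j] else NA_real_
}

get_or_na(1089, 3772)  # a supplied cell
get_or_na(1, 1)        # a cell that was never supplied
```

Both `Y` and `observed` stay small because they store only the six supplied positions. The trade-off is that the NA default lives in the lookup helper rather than in the matrix, so arithmetic on `Y` still treats the unsupplied cells as 0.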
