How do I manipulate/access elements of an instance of "dist" class using core R?


Problem description


A basic/common class in R is called "dist", and it is a relatively efficient representation of a symmetric distance matrix. Unlike a "matrix" object, however, there does not seem to be support for manipulating a "dist" instance by index pairs using the "[" operator.

For example, the following code returns nothing, NULL, or an error:

# First, create an example dist object from a matrix
mat1  <- matrix(1:100, 10, 10)
rownames(mat1) <- 1:10
colnames(mat1) <- 1:10
dist1 <- as.dist(mat1)
# Now try to access index features, or index values
names(dist1)
rownames(dist1)
row.names(dist1)
colnames(dist1)
col.names(dist1)
dist1[1, 2]

Meanwhile, the following commands do work, in some sense, but do not make it any easier to access/manipulate particular index-pair values:

dist1[1] # R thinks of it as a vector, not a matrix?
attributes(dist1)
attributes(dist1)$Diag <- FALSE
mat2 <- as(dist1, "matrix")
mat2[1, 2] <- 0

A workaround, which I want to avoid, is to first convert the "dist" object to a "matrix", manipulate that matrix, and then convert it back to "dist". In other words, this is not a question about how to convert a "dist" instance into a "matrix", or into some other class where common matrix-indexing tools are already defined; that has been answered in several ways in a different SO question.

Are there tools in the stats package (or perhaps some other core R package) dedicated to indexing/accessing elements of an instance of "dist"?

Solution

I don't have a direct answer to your question, but if you are using the Euclidean distance, have a look at the rdist function from the fields package. Its implementation (in Fortran) is faster than dist, and the output is of class matrix. At the very least, it shows that some developers have chosen to move away from the dist class, maybe for the exact reason you mention. If you are concerned that storing a symmetric matrix as a full matrix is an inefficient use of memory, you could convert it to a triangular matrix.
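On the memory point, one possible reading of "convert it to a triangular matrix" is to keep only the strict lower triangle as a plain vector and rebuild the full matrix when needed. The sketch below is an illustration in base R, not code from the answer: the small random data set and the names pts, full, lower and rebuilt are assumptions, and as.matrix(dist(pts)) stands in for the output of rdist so that the example runs without the fields package.

```r
# Base R only; a small random data set stands in for the real points.
set.seed(1)
pts <- matrix(runif(5 * 3), nrow = 5, ncol = 3)

# Full symmetric distance matrix (stand-in for what rdist() would return).
full <- as.matrix(dist(pts))
n <- nrow(full)

# Keep only the strict lower triangle, read column by column.
# This stores n * (n - 1) / 2 values instead of n * n.
lower <- full[lower.tri(full)]

# Rebuild the full symmetric matrix when matrix indexing is needed again.
rebuilt <- matrix(0, n, n)
rebuilt[lower.tri(rebuilt)] <- lower
rebuilt <- rebuilt + t(rebuilt)
```

The vector lower holds the same values, in the same order, as the data part of a "dist" object, so this mainly trades the class for a bare numeric vector.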

library("fields")
points <- matrix(runif(1000*100), nrow=1000, ncol=100)

system.time(dist1 <- dist(points))
#    user  system elapsed 
#   7.277   0.000   7.338 

system.time(dist2 <- rdist(points))
#   user  system elapsed 
#  2.756   0.060   2.851 

class(dist2)
# [1] "matrix"
dim(dist2)
# [1] 1000 1000
dist2[1:3, 1:3]
#              [,1]         [,2]         [,3]
# [1,] 0.0000000001 3.9529674733 3.8051198575
# [2,] 3.9529674733 0.0000000001 3.6552146293
# [3,] 3.8051198575 3.6552146293 0.0000000001
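As a complement, elements of a "dist" object can be reached directly through its underlying vector. The help page ?dist documents the layout: the lower triangle is stored by columns, and for i < j <= n the dissimilarity between i and j sits at position n*(i-1) - i*(i-1)/2 + j-i, where n is attr(x, "Size"). The helper below, dist_index, is a hypothetical name introduced for illustration; it is not part of the stats package, and the sketch reuses the mat1 example from the question.

```r
# Hypothetical helper (not part of base R): position of the pair (i, j)
# inside the vector that backs a "dist" object of size n.
dist_index <- function(i, j, n) {
  stopifnot(all(i != j), all(i >= 1), all(j >= 1), all(i <= n), all(j <= n))
  lo <- pmin(i, j)  # the formula from ?dist assumes the smaller index first
  hi <- pmax(i, j)
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

# Same example object as in the question.
mat1 <- matrix(1:100, 10, 10)
dist1 <- as.dist(mat1)
n <- attr(dist1, "Size")

k <- dist_index(3, 7, n)  # position of the (3, 7) / (7, 3) entry
val <- dist1[k]           # read the value without building a matrix
dist1[k] <- 0             # modify it in place; the object remains a "dist"
```

Because the helper is vectorised over i and j, several pairs can be read or updated in one call, for example dist1[dist_index(c(1, 2), c(5, 6), n)].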
