How do I manipulate/access elements of an instance of "dist" class using core R?
Problem description
A basic/common class in R is called "dist", and is a relatively efficient representation of a symmetric distance matrix. Unlike a "matrix" object, however, there does not seem to be support for manipulating a "dist" instance by index pairs using the "[" operator.
For example, the following code returns nothing, NULL, or an error:
# First, create an example dist object from a matrix
mat1 <- matrix(1:100, 10, 10)
rownames(mat1) <- 1:10
colnames(mat1) <- 1:10
dist1 <- as.dist(mat1)
# Now try to access index features, or index values
names(dist1)
rownames(dist1)
row.names(dist1)
colnames(dist1)
col.names(dist1)
dist1[1, 2]
Meanwhile, the following commands do work, in some sense, but do not make it any easier to access/manipulate particular index-pair values:
dist1[1] # R thinks of it as a vector, not a matrix?
attributes(dist1)
attributes(dist1)$Diag <- FALSE
mat2 <- as(dist1, "matrix")
mat2[1, 2] <- 0
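The vector-like behaviour above is consistent with how a "dist" object is stored: a plain numeric vector holding the lower triangle column by column, plus attributes such as "Size" and "Labels". The following is a minimal sketch that inspects that layout for the example object; it only uses base R and the names already defined above.

```r
# Rebuild the example object from the question
mat1 <- matrix(1:100, 10, 10)
rownames(mat1) <- 1:10
colnames(mat1) <- 1:10
dist1 <- as.dist(mat1)

n <- attr(dist1, "Size")       # number of observations (10 here)
length(dist1)                  # n * (n - 1) / 2 stored entries (45 here)
attr(dist1, "Labels")          # labels carried over from rownames(mat1)
dist1[1] == mat1[2, 1]         # the first stored entry is the (2, 1) pair
```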
A workaround (that I want to avoid) is to first convert the "dist" object to a "matrix", manipulate that matrix, and then convert it back to "dist". That is also to say, this is not a question about how to convert a "dist" instance into a "matrix", or some other class where common matrix-indexing tools are already defined, since this has been answered in several ways in a different SO question.
Are there tools in the stats package (or perhaps some other core R package) dedicated to indexing/accessing elements of an instance of "dist"?
I don't have a straight answer to your question, but if you are using the Euclidean distance, have a look at the rdist function from the fields package. Its implementation (in Fortran) is faster than dist, and the output is of class matrix. At the very least, it shows that some developers have chosen to move away from this dist class, maybe for the exact reason you are mentioning. If you are concerned that using a full matrix for storing a symmetric matrix is an inefficient use of memory, you could convert it to a triangular matrix.
library("fields")
points <- matrix(runif(1000*100), nrow=1000, ncol=100)
system.time(dist1 <- dist(points))
# user system elapsed
# 7.277 0.000 7.338
system.time(dist2 <- rdist(points))
# user system elapsed
# 2.756 0.060 2.851
class(dist2)
# [1] "matrix"
dim(dist2)
# [1] 1000 1000
dist2[1:3, 1:3]
# [,1] [,2] [,3]
# [1,] 0.0000000001 3.9529674733 3.8051198575
# [2,] 3.9529674733 0.0000000001 3.6552146293
# [3,] 3.8051198575 3.6552146293 0.0000000001
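If staying with the "dist" class is preferred, index pairs can be mapped onto the underlying vector by hand. The help page for dist documents that, for i < j <= n, the value for the pair (i, j) sits at position n*(i-1) - i*(i-1)/2 + j-i. The sketch below wraps that formula; dist_index and dist_get are hypothetical helper names introduced here for illustration, not functions from stats or any other package.

```r
# Hypothetical helper: position of the pair (i, j) inside a dist vector
# of size n, using the formula documented in ?dist (valid for i != j).
dist_index <- function(i, j, n) {
  stopifnot(all(i != j), all(i >= 1), all(j >= 1), all(i <= n), all(j <= n))
  lo <- pmin(i, j)  # the formula assumes the smaller index comes first
  hi <- pmax(i, j)
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

# Hypothetical helper: read the value for the pair (i, j) from a dist object.
dist_get <- function(d, i, j) {
  unclass(d)[dist_index(i, j, attr(d, "Size"))]
}

# Example with the object from the question
d <- as.dist(matrix(1:100, 10, 10))
dist_get(d, 3, 7)                       # same value as as.matrix(d)[3, 7]

# Assignment through the computed position keeps the "dist" class
d[dist_index(1, 2, attr(d, "Size"))] <- 0
class(d)
```

Because the helpers only compute positions in the existing vector, no full n-by-n matrix is allocated; the diagonal (i equal to j) is not stored in a "dist" object, so it is rejected rather than looked up.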