Perform pairwise comparison of matrix


Problem description

I have a matrix of n variables and I want to make a new matrix that holds the pairwise difference of each pair of columns, excluding the difference of a column with itself. Here is an example of the data.

    Transportation.services Recreational.goods.and.vehicles Recreation.services Other.services
         2.958003                     -0.25983789            5.526694           2.8912009
         2.857370                     -0.03425164            5.312857           2.9698044
         2.352275                      0.30536569            4.596742           2.9190123
         2.093233                      0.65920773            4.192716           3.2567390
         1.991406                      0.92246531            3.963058           3.6298314
         2.065791                      1.06120930            3.692287           3.4422340

I tried running a for loop below, but I'm aware that R is very slow with loops.

Difference.Matrix <- function(data) {
  n <- 2
  new.cols <- "New Columns"
  list <- list()
  for (i in 1:ncol(data)) {

    for (j in n:ncol(data)) {

      name <- paste("diff", i, j, data[, i], data[, j], sep = ".")
      new <- data[, i] - data[, j]
      list[[new.cols]] <- c(name)
      data <- merge(data, new)
    }
    n <- n + 1
  }
  results <- list(data = data)
  return(results)
}

As I said before, the code runs very slowly and has not even finished a single run-through yet. I apologize for the beginner-level coding. I am also aware that this code leaves the original data in the matrix, but I can delete it later.

Is it possible for me to use an apply function or foreach on this data?

Solution

You can find the pairs with combn and use apply to create the result:

apply(combn(ncol(d), 2), 2, function(x) d[,x[1]] - d[,x[2]])
##          [,1]      [,2]       [,3]      [,4]      [,5]      [,6]
## [1,] 3.217841 -2.568691  0.0668021 -5.786532 -3.151039 2.6354931
## [2,] 2.891622 -2.455487 -0.1124344 -5.347109 -3.004056 2.3430526
## [3,] 2.046909 -2.244467 -0.5667373 -4.291376 -2.613647 1.6777297
## [4,] 1.434025 -2.099483 -1.1635060 -3.533508 -2.597531 0.9359770
## [5,] 1.068941 -1.971652 -1.6384254 -3.040593 -2.707366 0.3332266
## [6,] 1.004582 -1.626496 -1.3764430 -2.631078 -2.381025 0.2500530
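To see what combn contributes here, the sketch below rebuilds the example data and prints the index matrix that apply walks over. It assumes d is a data frame holding the six rows from the question; the question never shows how d was created, so the construction is an illustration only.

```r
# Rebuild the example data from the question as a data frame named d
# (assumption: the original d was a data frame with these column names).
d <- data.frame(
  Transportation.services         = c(2.958003, 2.857370, 2.352275, 2.093233, 1.991406, 2.065791),
  Recreational.goods.and.vehicles = c(-0.25983789, -0.03425164, 0.30536569, 0.65920773, 0.92246531, 1.06120930),
  Recreation.services             = c(5.526694, 5.312857, 4.596742, 4.192716, 3.963058, 3.692287),
  Other.services                  = c(2.8912009, 2.9698044, 2.9190123, 3.2567390, 3.6298314, 3.4422340)
)

# Each column of pairs is one unordered pair of column indices (i < j),
# so no column is ever paired with itself and no pair appears twice.
pairs <- combn(ncol(d), 2)
print(pairs)
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]    1    1    1    2    2    3
# [2,]    2    3    4    3    4    4

# apply(..., 2, ...) calls the function once per column of pairs,
# subtracting the second column of d from the first.
diffs <- apply(pairs, 2, function(p) d[, p[1]] - d[, p[2]])
```

With 4 columns there are choose(4, 2) = 6 pairs, which is why the result has 6 columns.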

You can add appropriate names with another apply. Here the column names are very long, which impairs the formatting, but the labels tell what differences are in each column:

x <- apply(combn(ncol(d), 2), 2, function(x) d[,x[1]] - d[,x[2]])
colnames(x) <- apply(combn(ncol(d), 2), 2, function(x) paste(names(d)[x], collapse=' - '))
> x
     Transportation.services - Recreational.goods.and.vehicles Transportation.services - Recreation.services
[1,]                                                  3.217841                                     -2.568691
[2,]                                                  2.891622                                     -2.455487
[3,]                                                  2.046909                                     -2.244467
[4,]                                                  1.434025                                     -2.099483
[5,]                                                  1.068941                                     -1.971652
[6,]                                                  1.004582                                     -1.626496
     Transportation.services - Other.services Recreational.goods.and.vehicles - Recreation.services
[1,]                                0.0668021                                             -5.786532
[2,]                               -0.1124344                                             -5.347109
[3,]                               -0.5667373                                             -4.291376
[4,]                               -1.1635060                                             -3.533508
[5,]                               -1.6384254                                             -3.040593
[6,]                               -1.3764430                                             -2.631078
     Recreational.goods.and.vehicles - Other.services Recreation.services - Other.services
[1,]                                        -3.151039                            2.6354931
[2,]                                        -3.004056                            2.3430526
[3,]                                        -2.613647                            1.6777297
[4,]                                        -2.597531                            0.9359770
[5,]                                        -2.707366                            0.3332266
[6,]                                        -2.381025                            0.2500530
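The two steps can also be wrapped into one helper. The sketch below is a variation, not the answer's code: the name pairwise_diff is hypothetical, and it swaps the two apply calls for direct matrix indexing. It converts the input with as.matrix and reads labels with colnames, so it should work for a numeric matrix as well as a data frame (names(d) in the answer only returns column names for a data frame).

```r
# Hypothetical helper: pairwise column differences with labelled columns.
pairwise_diff <- function(data) {
  m <- as.matrix(data)          # accept a numeric matrix or data frame
  stopifnot(ncol(m) >= 2)       # combn() needs at least two columns
  idx <- combn(ncol(m), 2)      # 2 x choose(ncol, 2) matrix of index pairs
  # Subtract all "second" columns from all "first" columns in one step.
  out <- m[, idx[1, ], drop = FALSE] - m[, idx[2, ], drop = FALSE]
  # Label each column, assuming the input has column names.
  colnames(out) <- paste(colnames(m)[idx[1, ]], colnames(m)[idx[2, ]], sep = " - ")
  out
}

# Small made-up example with three columns.
m <- cbind(a = c(1, 2), b = c(3, 5), c = c(10, 20))
res <- pairwise_diff(m)
print(res)
#      a - b a - c b - c
# [1,]    -2    -9    -7
# [2,]    -3   -18   -15
```

Calling pairwise_diff(d) on the example data should give the same values as the labelled output above.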
