How to create NxM Corr matrix from Column 2 of 2xN signals in R?
Problem description
I have 2 x N 1D signals in files, where Column 1 is Signal 1 and Column 2 is Signal 2.
Code 1 is a simplified example with 1 x N 1D signals, while Code 2 is the actual target. Code 2 contains two pieces of pseudocode:
- creating a two-column element (files[[i]] = i,i+1), that is, just two integer data units in each row, separated by a comma, and
- accessing the data later (tcrossprod( files[[]][, 2], files[[]][, 2] )), where I cannot refer to Column 2 of all signals at once.
Simplified Code 1 works as expected
## Example with a 1D vector in a single column
N <- 7
files <- vector("list", N)
# Make a list with one value per element
for (i in 1:N) {
  files[[i]] <- i
}
str(files)
# http://stackoverflow.com/a/40323768/54964
# tcrossprod() needs a numeric vector or matrix, so flatten the list first
tcrossprod(unlist(files), unlist(files))
Code 2 is pseudocode, but it shows the target
## Example with 2x1D vectors in two columns
N <- 7
files <- vector("list", N)
# Make a list of two column data
for (i in 1:N) {
files[[i]] = i,i+1 # PSEUDOCODE
}
str(files)
# access one signal single columns by files[[1]][,1] and files[[1]][,2]
tcrossprod( files[[]][, 2], files[[]][, 2] ) # PSEUDOCODE
Assume Vector 1 has dimensions N x 1 and Vector 2 has dimensions 1 x M.
Each cell, accessed for instance for Signal 2 (Column 2) by files[[1]][,2], contains a 1D signal.
Multiply all such Column 2 signals with tcrossprod and you should get the expected result: an N x M matrix.
Mathematical description
Data: a list of two-column elements, where the first column is a 1D signal and the second column is the improved 1D signal. I want to compare all of those improved 1D signals with each other in one matrix.
Expected output
cor          Improved 1   Improved 2   ...
Improved 1   1            0.55
Improved 2   0.111        1
...
I am not tied to any particular R data structure.
"Column" and "cell" are just my descriptions of the items in the data units, so they may not be precise because I am a newbie in R.
Output of tchakravarty's graphics code on my system: the x-axis is correct but the y-axis is not.
OS: Debian 8.5
R: 3.1.1
I am still not sure about your question, so I will first try to pin down the data structure that you have in mind.
I have created a list of length M (= 100), each element of which is an N x 2 matrix (where N = 1000) representing a 2D signal.
library(dplyr)
library(ggplot2)
N = 1000
li_matrices = setNames(
  lapply(paste("Improved", 1:100), function(x) matrix(rnorm(N*2), nrow = N, ncol = 2, byrow = TRUE)),
  paste("Improved", 1:100))
> str(li_matrices, list.len = 5, max.level = 1)
List of 100
$ Improved 1 : num [1:1000, 1:2] 0.228 -0.44 0.713 -0.118 -0.918 ...
$ Improved 2 : num [1:1000, 1:2] 0.928 0.362 -0.105 -0.1 0.165 ...
$ Improved 3 : num [1:1000, 1:2] 0.0881 -0.1466 1.8549 -0.3376 -1.1626 ...
$ Improved 4 : num [1:1000, 1:2] 0.0575 -0.7809 0.4221 0.5378 -0.7882 ...
$ Improved 5 : num [1:1000, 1:2] 0.6739 1.4515 -0.0704 -0.1596 0.2157 ...
[list output truncated]
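For the smaller case in the question's Code 2, the same kind of structure could be built as sketched below. This is only an illustration: n_signals, n_samples, and the column names signal and improved are assumptions, and random numbers stand in for the real file contents.

```r
# Hypothetical stand-in for the question's Code 2: each list element is a
# two-column matrix (column 1 = signal, column 2 = improved signal).
n_signals <- 7    # number of list elements (the question's N)
n_samples <- 50   # samples per signal; an assumed value
set.seed(1)
files <- lapply(seq_len(n_signals), function(i) {
  cbind(signal = rnorm(n_samples), improved = rnorm(n_samples))
})
names(files) <- paste("Improved", seq_len(n_signals))

# Column 2 of a single element, as in the question
head(files[[1]][, 2])

# Column 2 of every element, bound column-wise into an
# n_samples x n_signals matrix
improved <- sapply(files, function(x) x[, 2])
dim(improved)
```

The sapply() call is what replaces the pseudocode files[[]][, 2]: it visits every list element and collects the second column of each.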
Then, I extracted the second column of the signal matrix from each of the M list elements and computed the correlations between the resulting M signals.
> cor(sapply(li_matrices, function(x) x[, 2]))
Improved 1 Improved 2 Improved 3 Improved 4 Improved 5 Improved 6 Improved 7
Improved 1 1.0000000000 -0.0181724914 0.0307864778 -0.0235266506 0.0681155904 -0.0654758679 -0.0416660418
Improved 2 -0.0181724914 1.0000000000 0.0837086793 -0.0310760562 0.0035757641 -0.0303866471 -0.0345608009
Improved 3 0.0307864778 0.0837086793 1.0000000000 -0.0093528744 0.0282039040 -0.0525328267 0.0410787784
Improved 4 -0.0235266506 -0.0310760562 -0.0093528744 1.0000000000 -0.0139707732 -0.0145970712 -0.0022037703
Improved 5 0.0681155904 0.0035757641 0.0282039040 -0.0139707732 1.0000000000 -0.0406468255 0.0381800143
Improved 6 -0.0654758679 -0.0303866471 -0.0525328267 -0.0145970712 -0.0406468255 1.0000000000 -0.0534592829
Improved 7 -0.0416660418 -0.0345608009 0.0410787784 -0.0022037703 0.0381800143 -0.0534592829 1.0000000000
Improved 8 -0.0320972342 -0.0344929079 -0.0204718584 -0.0007383034 0.0223386392 -0.0361548831 0.0090484961
Improved 9 0.0068743021 -0.0109232340 0.0071627901 0.0102613137 0.0265829001 -0.0443782611 0.0266421500
Improved 10 -0.0228804070 -0.0163596866 0.0066448268 0.0137962914 0.0357421845 0.0403325013 -0.0391002841
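Since the question asks about tcrossprod, it may help to see how cor() relates to a cross-product. The sketch below uses made-up data (X is a hypothetical matrix with one signal per column) and checks that the correlation matrix equals the cross-product of the standardized columns divided by the number of samples minus one.

```r
# Made-up data: 200 samples of 5 signals, one signal per column
set.seed(42)
X <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)

# Centre each column and divide it by its sample standard deviation
Z <- scale(X)

# crossprod(Z) is t(Z) %*% Z; dividing by (n - 1) gives Pearson correlations
r_manual <- crossprod(Z) / (nrow(X) - 1)

# tcrossprod(A) is A %*% t(A), so it needs the signals in rows instead
r_manual_t <- tcrossprod(t(Z)) / (nrow(X) - 1)

# Both agree with cor() up to floating-point error
all.equal(r_manual, cor(X), check.attributes = FALSE)
all.equal(r_manual_t, cor(X), check.attributes = FALSE)
```

In practice cor() is the simpler choice; the cross-product form only shows why a plain tcrossprod of raw signals gives unnormalized products rather than correlations.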
Edit:
Here is the plotting code requested by OP:
library(dplyr)    # %>%
library(tidyr)    # gather()
library(tibble)   # rownames_to_column(), as_tibble()
library(ggplot2)
m_corr = cor(sapply(li_matrices, function(x) x[, 2]))
m_corr %>%
  as.data.frame() %>%
  rownames_to_column(var = "Var1") %>%
  as_tibble() %>%
  # reshape to long format: one row per (Var1, Var2) pair
  gather(key = Var2, value = Value, -Var1) %>%
  ggplot(
    aes(
      # order both axes by the number in the "Improved k" label
      x = reorder(Var1, as.numeric(gsub("Improved ", "", Var1))),
      y = reorder(Var2, as.numeric(gsub("Improved ", "", Var2))),
      fill = Value
    )
  ) +
  geom_tile() +
  theme_bw() +
  theme(
    axis.text.x = element_text(angle = 90, size = 5, hjust = 1),
    axis.text.y = element_text(size = 5)
  ) +
  xlab("Variable 1") +
  ylab("Variable 2")
This gives a tiled heatmap of the correlation matrix.
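If the axis order comes out wrong on a particular setup, one option is to fix the order explicitly with factor levels instead of relying on reorder(). The following is a sketch under assumptions, not the answer's method: signals and m_small are made-up names and data, the reshaping uses only base R, and the plot is drawn only if ggplot2 is installed.

```r
# Made-up data: 12 signals of 100 samples, named like the answer's list
set.seed(1)
signals <- matrix(rnorm(100 * 12), nrow = 100,
                  dimnames = list(NULL, paste("Improved", 1:12)))
m_small <- cor(signals)

# Base R reshape to long format: columns Var1, Var2, Value
long <- as.data.frame(as.table(m_small), responseName = "Value",
                      stringsAsFactors = FALSE)

# Pin the order of both axes to the column order of the matrix
ord <- colnames(m_small)
long$Var1 <- factor(long$Var1, levels = ord)
long$Var2 <- factor(long$Var2, levels = ord)

# Draw the heatmap only when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(long, ggplot2::aes(x = Var1, y = Var2, fill = Value)) +
    ggplot2::geom_tile()
  print(p)
}
```

Because the levels are taken from colnames(m_small), both axes follow the matrix order (Improved 1, Improved 2, ...) rather than alphabetical order.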