How to calculate correlation in R
Question
I want to calculate the correlation coefficient between columns of a subset of a data set x in R. The data has 40 models with 200 simulations each, 8000 rows in total. I want to calculate the correlation coefficient between columns for each simulation (40 rows).
cor(x[c(3,5)])
calculates the correlation from all 8000 rows. I need
cor(x[c(3,5)])
but only for the rows where x$nsimul == 1, then for x$nsimul == 2, and so on.
Could you help me with this?
San
Recommended answer
I'm not sure exactly what you're doing with x[c(3,5)], but it looks like you want to do something like the following. You have a data frame X like this:
set.seed(123)
X <- data.frame(nsimul = rep(1:2, each=5), a = sample(1:10), b = sample(1:10))
> X
   nsimul  a  b
1       1  1  6
2       1  8  2
3       1  9  1
4       1 10  4
5       1  3  9
6       2  4  8
7       2  6  5
8       2  7  7
9       2  2 10
10      2  5  3
And you want to split this data frame by the nsimul column and calculate the correlation between a and b in each group. This is a classic split-apply-combine problem, for which the plyr package is very well suited:
library(plyr)
> ddply(X, .(nsimul), summarize, cor_a_b = cor(a, b))
  nsimul    cor_a_b
1      1 -0.7549232
2      2 -0.5964848
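If you would rather not depend on plyr, the same split-apply-combine idea can be sketched in base R. This is only an illustration, not part of the original answer. It assumes the toy data frame X shown above, typed out by hand here so the result does not depend on the random number generator. The names cor_sim1 and cor_by_sim are made up for this example.

```r
# Toy data from the answer, entered explicitly (assumption: same values as the printed X)
X <- data.frame(
  nsimul = rep(1:2, each = 5),
  a = c(1, 8, 9, 10, 3, 4, 6, 7, 2, 5),
  b = c(6, 2, 1, 4, 9, 8, 5, 7, 10, 3)
)

# Correlation for one simulation only: subset the rows first, then call cor()
cor_sim1 <- cor(X$a[X$nsimul == 1], X$b[X$nsimul == 1])

# Correlation for every simulation: split X by nsimul, then apply cor() to each piece
cor_by_sim <- sapply(split(X, X$nsimul), function(d) cor(d$a, d$b))

print(cor_sim1)
print(cor_by_sim)
```

Here cor_by_sim is a named numeric vector with one entry per value of nsimul, and its values should agree with the cor_a_b column from ddply above. For the data in the question, the same pattern would presumably be applied to x with the two columns of interest in place of a and b.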