Extracting and formatting results of cor.test on multiple pairs of columns


Question

I am trying to generate a table output of a correlation matrix. Specifically, I am using a for loop to compute the correlation between each of columns 4:40 and column 1. The resulting table is decent, but it does not identify what is being compared with what. Checking the attributes of the cor.test result, I find that data.name is given as x[[1]] and y[[i]], which is not enough to trace back which column is being compared with which. Here is my code:

input <- read.delim(file="InputData.txt", header=TRUE)
x<-input[,41, drop=FALSE]
y=input[,4:40]
corr.values <- vector("list", 37)
for (i in 1:length(y) ){
  corr.values[[i]] <- cor.test(x[[1]], y[[i]], method="pearson")
}
lres <- sapply(corr.values, `[`, c("statistic","p.value","estimate","method", "data.name"))
lres<-t(lres)
write.table(lres, file="output.xls", sep="	",row.names=TRUE)

The output file looks like this:

       statistic        p.value     estimate                                  method            data.name   
1   -2.030111981    0.042938137 -0.095687495    Pearson's product-moment correlation    x[[1]] and y[[i]]
2   -2.795786248    0.005400938 -0.131239287    Pearson's product-moment correlation    x[[1]] and y[[i]]
3   -2.099114632    0.036368337 -0.098908573    Pearson's product-moment correlation    x[[1]] and y[[i]]
4   -1.920649487    0.055413178 -0.090571599    Pearson's product-moment correlation    x[[1]] and y[[i]]
5   -1.981326962    0.048168291 -0.093408365    Pearson's product-moment correlation    x[[1]] and y[[i]]
6   -2.80390736      0.00526909 -0.131613912    Pearson's product-moment correlation    x[[1]] and y[[i]]
7   -1.265138839    0.206482153 -0.059798855    Pearson's product-moment correlation    x[[1]] and y[[i]]
8   -2.861448156    0.004415411 -0.134266636    Pearson's product-moment correlation    x[[1]] and y[[i]]
9   -2.103403363    0.035990039 -0.099108672    Pearson's product-moment correlation    x[[1]] and y[[i]]
10  -3.610094985    0.000340807 -0.168498786    Pearson's product-moment correlation    x[[1]] and y[[i]]

Clearly, this is not ideal: the rows are only numbered, so I can't tell which correlation belongs to which column. Is there a way to fix this? I tried many solutions, but none worked. I suspect the trick is to edit the data.name attribute, but I couldn't figure out how to do that.

Recommended answer

Here's a way to return a data frame with all the cor.test results that also includes the names of the variables for which each correlation was calculated. We create a function that extracts the relevant results of cor.test, then use mapply to apply that function to each pair of variables for which we want a correlation. With SIMPLIFY=FALSE, mapply returns a list of one-row data frames, so we use do.call(rbind, ...) to combine them into a single data frame.

# Function to extract correlation coefficient and p-values
corrFunc <- function(var1, var2, data) {
  result = cor.test(data[,var1], data[,var2])
  data.frame(var1, var2, result[c("estimate","p.value","statistic","method")], 
             stringsAsFactors=FALSE)
}

## Pairs of variables for which we want correlations
## (stringsAsFactors=FALSE keeps the names as character strings; as factors,
## data[,var2] would pick columns by factor code instead of by name)
vars = data.frame(v1=names(mtcars)[1], v2=names(mtcars)[-1], stringsAsFactors=FALSE)

# Apply corrFunc to all rows of vars
# (USE.NAMES=FALSE leaves the list unnamed, so the row names come from cor.test)
corrs = do.call(rbind, mapply(corrFunc, vars[,1], vars[,2], MoreArgs=list(data=mtcars), 
                              SIMPLIFY=FALSE, USE.NAMES=FALSE))

     var1 var2   estimate      p.value statistic                               method
cor   mpg  cyl -0.8521620 6.112687e-10 -8.919699 Pearson's product-moment correlation
cor1  mpg disp -0.8475514 9.380327e-10 -8.747152 Pearson's product-moment correlation
cor2  mpg   hp -0.7761684 1.787835e-07 -6.742389 Pearson's product-moment correlation
cor3  mpg drat  0.6811719 1.776240e-05  5.096042 Pearson's product-moment correlation
cor4  mpg   wt -0.8676594 1.293959e-10 -9.559044 Pearson's product-moment correlation
cor5  mpg qsec  0.4186840 1.708199e-02  2.525213 Pearson's product-moment correlation
cor6  mpg   vs  0.6640389 3.415937e-05  4.864385 Pearson's product-moment correlation
cor7  mpg   am  0.5998324 2.850207e-04  4.106127 Pearson's product-moment correlation
cor8  mpg gear  0.4802848 5.400948e-03  2.999191 Pearson's product-moment correlation
cor9  mpg carb -0.5509251 1.084446e-03 -3.615750 Pearson's product-moment correlation
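
To connect this back to the layout in the question, here is a minimal sketch of the same idea. It assumes all columns are numeric and uses simulated data in place of InputData.txt; the data frame `input`, the helper `corr_row`, and the output path are hypothetical names chosen for illustration. It follows the question's code in taking column 41 as x and columns 4:40 as y.

```r
# Simulated stand-in for InputData.txt (hypothetical data): 41 numeric columns
set.seed(1)
input <- as.data.frame(matrix(rnorm(50 * 41), ncol = 41))

target <- names(input)[41]     # column that every other column is compared with
others <- names(input)[4:40]   # columns 4 to 40

# Run cor.test on one pair of column names and return a one-row data frame
corr_row <- function(var1, var2, data) {
  res <- cor.test(data[[var1]], data[[var2]], method = "pearson")
  data.frame(var1 = var1, var2 = var2,
             estimate  = unname(res$estimate),
             statistic = unname(res$statistic),
             p.value   = res$p.value,
             method    = res$method,
             stringsAsFactors = FALSE)
}

# One row per pair, stacked into a single data frame
lres <- do.call(rbind, lapply(others, function(v) corr_row(target, v, input)))

# Tab-delimited output; var1/var2 identify each pair, so row names are dropped
out_file <- file.path(tempdir(), "output.txt")
write.table(lres, file = out_file, sep = "\t", row.names = FALSE, quote = FALSE)
```

Because each row carries the two column names, there is no need to edit data.name on the cor.test objects.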
