Extracting and formatting results of cor.test on multiple pairs of columns
Question
I am trying to generate a table of correlation results. Specifically, I use a for loop to compute the correlation between each of columns 4:40 and column 1. The resulting table is decent, but it does not identify what is being compared to what. Inspecting the result of cor.test, I find that data.name is given as x[[1]] and y[[i]], which is not enough to trace back which column is being compared to which. Here is my code:
input <- read.delim(file = "InputData.txt", header = TRUE)
x <- input[, 41, drop = FALSE]
y <- input[, 4:40]
corr.values <- vector("list", 37)
for (i in seq_along(y)) {
  corr.values[[i]] <- cor.test(x[[1]], y[[i]], method = "pearson")
}
lres <- sapply(corr.values, `[`, c("statistic", "p.value", "estimate", "method", "data.name"))
lres <- t(lres)
write.table(lres, file = "output.xls", sep = "\t", row.names = TRUE)
The output file looks like this:
statistic p.value estimate method data.name
1 -2.030111981 0.042938137 -0.095687495 Pearson's product-moment correlation x[[1]] and y[[i]]
2 -2.795786248 0.005400938 -0.131239287 Pearson's product-moment correlation x[[1]] and y[[i]]
3 -2.099114632 0.036368337 -0.098908573 Pearson's product-moment correlation x[[1]] and y[[i]]
4 -1.920649487 0.055413178 -0.090571599 Pearson's product-moment correlation x[[1]] and y[[i]]
5 -1.981326962 0.048168291 -0.093408365 Pearson's product-moment correlation x[[1]] and y[[i]]
6 -2.80390736 0.00526909 -0.131613912 Pearson's product-moment correlation x[[1]] and y[[i]]
7 -1.265138839 0.206482153 -0.059798855 Pearson's product-moment correlation x[[1]] and y[[i]]
8 -2.861448156 0.004415411 -0.134266636 Pearson's product-moment correlation x[[1]] and y[[i]]
9 -2.103403363 0.035990039 -0.099108672 Pearson's product-moment correlation x[[1]] and y[[i]]
10 -3.610094985 0.000340807 -0.168498786 Pearson's product-moment correlation x[[1]] and y[[i]]
Clearly, this is not ideal: the rows are only numbered, so there is no way to tell which correlation belongs to which column. Is there a way to fix this? I tried many solutions, but none worked. I know the trick must be in editing the data.name element, but I couldn't figure out how to do that.
Recommended answer
Here's a way to return a data frame with all the cor.test results that also includes the names of the variables for which each correlation was calculated. We create a function to extract the relevant results of cor.test, then use mapply to apply the function to each pair of variables for which we want the correlations. With SIMPLIFY=FALSE, mapply returns a list, so we use do.call(rbind, ...) to turn it into a data frame.
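# Function to extract the correlation coefficient, p-value, test statistic and method
corrFunc <- function(var1, var2, data) {
  result <- cor.test(data[[var1]], data[[var2]])
  data.frame(var1, var2, result[c("estimate", "p.value", "statistic", "method")],
             stringsAsFactors = FALSE)
}

# Pairs of variables for which we want correlations.
# stringsAsFactors = FALSE keeps the names as character strings. Before R 4.0
# they would become factors, and a factor used as an index selects columns by
# its integer code rather than by name.
vars <- data.frame(v1 = names(mtcars)[1], v2 = names(mtcars)[-1],
                   stringsAsFactors = FALSE)

# Apply corrFunc to all rows of vars
corrs <- do.call(rbind, mapply(corrFunc, vars[, 1], vars[, 2],
                               MoreArgs = list(data = mtcars), SIMPLIFY = FALSE))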
With the variable names held as character strings, corrs contains the following (p-values and test statistics shown to four significant figures):
var1  var2    estimate    p.value  statistic  method
mpg   cyl   -0.8521620  6.113e-10     -8.920  Pearson's product-moment correlation
mpg   disp  -0.8475514  9.380e-10     -8.747  Pearson's product-moment correlation
mpg   hp    -0.7761684  1.788e-07     -6.742  Pearson's product-moment correlation
mpg   drat   0.6811719  1.776e-05      5.096  Pearson's product-moment correlation
mpg   wt    -0.8676594  1.294e-10     -9.559  Pearson's product-moment correlation
mpg   qsec   0.4186840  1.708e-02      2.525  Pearson's product-moment correlation
mpg   vs     0.6640389  3.416e-05      4.864  Pearson's product-moment correlation
mpg   am     0.5998324  2.850e-04      4.106  Pearson's product-moment correlation
mpg   gear   0.4802848  5.401e-03      2.999  Pearson's product-moment correlation
mpg   carb  -0.5509251  1.084e-03     -3.616  Pearson's product-moment correlation
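The loop from the question can be adapted along the same lines. The sketch below is an illustration rather than part of the original answer. It assumes mtcars as a stand-in for InputData.txt (column 1 compared against columns 2:11), and the names x_name, y_names and out_file are made up for the example. It overwrites data.name on each result, which works because cor.test returns a list, and then builds one labelled row per test.

```r
# Sketch only: mtcars stands in for the question's InputData.txt, with column 1
# as the reference column and columns 2:11 as the columns compared against it.
input   <- mtcars
x_name  <- names(input)[1]
y_names <- names(input)[2:11]

corr.values <- lapply(y_names, function(y_name) {
  res <- cor.test(input[[x_name]], input[[y_name]], method = "pearson")
  # data.name is an ordinary list element, so it can simply be overwritten
  res$data.name <- paste(x_name, "and", y_name)
  res
})
names(corr.values) <- y_names

# One row per test; unname() drops the "cor" and "t" element names
lres <- do.call(rbind, lapply(corr.values, function(res) {
  data.frame(data.name = res$data.name,
             estimate  = unname(res$estimate),
             statistic = unname(res$statistic),
             p.value   = res$p.value,
             method    = res$method,
             stringsAsFactors = FALSE)
}))

# Tab-delimited output; the data.name column identifies each row,
# so the row names are not written (out_file is a hypothetical path)
out_file <- file.path(tempdir(), "cor_results.txt")
write.table(lres, file = out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Looping over column names instead of positions means each result carries its own label, so nothing has to be reconstructed from row numbers afterwards. For the question's data, the assumed column ranges would be replaced with the actual reference column and columns 4:40.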