for loop with cor.test over many categories


Question

I am trying to write a loop in R that cycles through 3 different species and calculates the correlation between two continuous variables (Redness and VarNormAbund) for each one.

My loop runs, but the output is identical for all 3 species, which makes me think the loop is getting stuck on the first species.

 cor.test.redness<-lapply(unique(test$Species), function(x){cor.test(test$Redness, test$VarNormAbund)})

Data structure: Species is the first column. I would like the loop to extract each species and run cor.test between Redness and VarNormAbund. There should be one output per category, so 3 outputs in the list.

Am I missing an argument that tells the loop to handle each species separately?

Also, is there any way to get the output as a data.frame instead of a list?

Any advice would be appreciated; I do not have much experience with loops.

Species<-c("A","B","C","A","B","C","A","B","C")
Redness<-c(1,1,1,2,2,2,3,3,3)
VarNormAbund<-c(1.6, 0,0,12.5,0,1,1.37, 2.74, 0)
test<-data.frame(Species, Redness, VarNormAbund)

Cheers.

Recommended answer

You get the same result for all three species because, although you loop over the unique species, the function never uses x to subset the data, so every cor.test still runs on the whole data set. A simple fix would be:

cor.test.redness<-lapply(unique(test$Species), function(x){
                         cor.test(test[test$Species == x, ]$Redness, 
                                  test[test$Species == x, ]$VarNormAbund)})
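As an alternative sketch of the same fix, base R's split() can cut the data into one data frame per species first, so the function passed to lapply only ever sees a single species. The names by_species and cor_by_species are made up for illustration, and the data is the toy example from the question.

```r
# Toy data from the question
test <- data.frame(
  Species      = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
  Redness      = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  VarNormAbund = c(1.6, 0, 0, 12.5, 0, 1, 1.37, 2.74, 0)
)

# split() returns a named list holding one data frame per species
by_species <- split(test, test$Species)

# Run cor.test on each piece; the result list keeps the species names
cor_by_species <- lapply(by_species, function(d) {
  cor.test(d$Redness, d$VarNormAbund)
})

names(cor_by_species)       # one entry per species
cor_by_species$B$estimate   # correlation for species B only
```

Because the list is named by species, individual results can be looked up by name rather than by position.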

If you want the output to be a data frame, you can extract the values you need from each correlation test result, put them in a one-row data frame, and then rbind the rows together. For example, if you want a data frame of the p-value and the correlation coefficient, you can do:

cor.test.redness<-do.call(rbind, lapply(unique(test$Species), function(x){
                    cor.result <- cor.test(test[test$Species == x, ]$Redness, 
                                           test[test$Species == x, ]$VarNormAbund); 
                    data.frame(p.Value = cor.result$p.value, cor = cor.result$estimate)
                   }))

cor.test.redness
#        p.Value         cor
# cor  0.9884892 -0.01808019
# cor1 0.3333333  0.86602540
# cor2 1.0000000  0.00000000

You can also add a column that identifies the species in the result data frame. I believe you can figure that part out, so I'll leave it to you.
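For readers who want that last step spelled out, here is one possible sketch, not necessarily what the answerer had in mind. It stores x in a Species column and uses unname() on the estimate so the rows are not labelled cor, cor1, cor2. The name cor_table is an assumption for illustration.

```r
# Toy data from the question
test <- data.frame(
  Species      = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
  Redness      = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  VarNormAbund = c(1.6, 0, 0, 12.5, 0, 1, 1.37, 2.74, 0)
)

cor_table <- do.call(rbind, lapply(unique(test$Species), function(x) {
  d   <- test[test$Species == x, ]          # rows for this species only
  res <- cor.test(d$Redness, d$VarNormAbund)
  data.frame(
    Species = x,                            # label the row with its species
    cor     = unname(res$estimate),         # drop the "cor" name from the estimate
    p.value = res$p.value
  )
}))

cor_table
```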

Note: this type of subsetting can be slow. If your data set is large and performance is an issue, you can try running the test with the fast grouping features of data.table or dplyr.
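A minimal sketch of the dplyr route, assuming the dplyr package is installed: group_by() and summarise() run the test once per group and return one row per species. The name cor_table is made up for illustration, and no timing claim is made for data this small.

```r
library(dplyr)

# Toy data from the question
test <- data.frame(
  Species      = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
  Redness      = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  VarNormAbund = c(1.6, 0, 0, 12.5, 0, 1, 1.37, 2.74, 0)
)

cor_table <- test %>%
  group_by(Species) %>%
  summarise(
    # Inside summarise(), column names refer to the current group only
    cor     = unname(cor.test(Redness, VarNormAbund)$estimate),
    p.value = cor.test(Redness, VarNormAbund)$p.value
  )

cor_table
```

This version calls cor.test twice per group for brevity; storing the result once per group would avoid the repeated work.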
