R: Calculating Pearson correlation and R-squared by group

Views: 249  Published: 2017/3/26 0:24:45  Tags: r dataframe correlation

Problem description

I am trying to extend the answer to the question "R: filtering data and calculating correlation".

To obtain the correlation between temperature and humidity for each month of the year (1 = January), we would have to repeat the same call for each month (12 times):

cor(airquality[airquality$Month == 1, c("Temp", "Humidity")])

Is there any way to do each month automatically?

In my case I have more than 30 groups (not months but species) for which I would like to test correlations. I just wanted to know whether there is a faster way than doing it one by one.

Thank you!

Solution

cor(airquality[airquality$Month == 1, c("Temp", "Humidity")])

gives you a 2 x 2 correlation matrix rather than a single number. I bet you want a single number for each Month, so use

## cor(Temp, Humidity | Month)
with(airquality, mapply(cor, split(Temp, Month), split(Humidity, Month)))

and you will obtain a vector.

Have a read of ?split and ?mapply; they are very useful for "by group" operations, although they are not the only option. Also read ?cor, and compare the results of

a <- rnorm(10)
b <- rnorm(10)
cor(a, b)
cor(cbind(a, b))

The answer you linked in your question is doing something similar to cor(cbind(a, b)).
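
The comparison above can be sketched as follows. The seed and the names r and m are assumptions added here for illustration only, so that the run is repeatable; they are not part of the original answer.

```r
set.seed(1)            # assumed seed, only so the run is repeatable
a <- rnorm(10)
b <- rnorm(10)

r <- cor(a, b)         # two vectors in: a single number out
m <- cor(cbind(a, b))  # one matrix in: a 2 x 2 correlation matrix out

dim(m)                 # 2 2
diag(m)                # both 1: each variable correlates perfectly with itself
all.equal(m["a", "b"], r)  # the off-diagonal entry is the single number we want
```

So the matrix form is not wrong, it just carries the wanted value in its off-diagonal cells alongside two redundant 1s.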

Reproducible example

The airquality dataset in R does not have a Humidity column, so I will use Wind for testing:

## cor(Temp, Wind | Month)
x <- with(airquality, mapply(cor, split(Temp, Month), split(Wind, Month)))

#         5          6          7          8          9 
#-0.3732760 -0.1210353 -0.3052355 -0.5076146 -0.5704701

We get a named vector, where names(x) gives the months and unname(x) gives the correlations.
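
As one of the other "by group" options, a rough sketch is to split the whole data frame and call cor() inside each piece. The names cor_by_month and res are illustrative and do not come from the original answer.

```r
## split the data frame by Month, then compute one correlation per piece
cor_by_month <- sapply(split(airquality, airquality$Month),
                       function(d) cor(d$Temp, d$Wind))

## same values as the mapply() version
x <- with(airquality, mapply(cor, split(Temp, Month), split(Wind, Month)))
all.equal(cor_by_month, x)

## turn the named vector into a data frame, one row per group
res <- data.frame(Month = as.integer(names(x)), cor = unname(x))
res
```

Temp and Wind have no missing values in airquality. With columns that do contain NA, cor() returns NA unless a use argument such as use = "complete.obs" is supplied.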

Thank you very much! It worked just perfectly! I was trying to figure out how to obtain a vector with the R^2 for each correlation too, but I can't... Any ideas?

cor(x, y) is like fitting a standardised linear regression model:

coef(lm(scale(y) ~ scale(x) - 1))  ## remember to drop intercept

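The R-squared of this simple linear regression is just the square of the slope, that is, the square of the correlation. Earlier we stored the per-group correlations in the vector x (not to be confused with the x in cor(x, y) above), so the per-group R-squared is simply x ^ 2.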
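
A small check of this claim on the airquality example, offered as a sketch rather than part of the original answer. The names r2, r2_lm, d5 and slope are illustrative.

```r
## per-group correlations, as in the reproducible example
x <- with(airquality, mapply(cor, split(Temp, Month), split(Wind, Month)))
r2 <- x ^ 2

## cross-check: R-squared reported by lm() within each month
r2_lm <- sapply(split(airquality, airquality$Month),
                function(d) summary(lm(Wind ~ Temp, data = d))$r.squared)
all.equal(r2, r2_lm)

## the standardised slope (no intercept) equals the correlation itself
d5 <- subset(airquality, Month == 5)
slope <- coef(lm(scale(Wind) ~ scale(Temp) - 1, data = d5))
all.equal(unname(slope), unname(x["5"]))
```

Squaring x is far cheaper than fitting one model per group, which matters more as the number of groups grows.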
