Run a custom function on a data frame in R, by group

Views: 274 | Posted: 2017/7/13 20:40:55 | Tags: r, function, aggregate, dplyr


Question


I'm having some trouble getting a custom function to run over each group in a data frame.

Here is some sample data:

set.seed(42)
tm <- as.numeric(c("1", "2", "3", "3", "2", "1", "2", "3", "1", "1"))
d <- as.numeric(sample(0:2, size = 10, replace = TRUE))
t <- as.numeric(sample(0:2, size = 10, replace = TRUE))
h <- as.numeric(sample(0:2, size = 10, replace = TRUE))

df <- as.data.frame(cbind(tm, d, t, h))
df$p <- rowSums(df[2:4])  # p is the row-wise sum of d, t and h

I created a custom function to calculate the value w:

calc <- function(x) {
  data <- x
  w <- (1.27 * sum(data$d) + 1.62 * sum(data$t) + 2.10 * sum(data$h)) / sum(data$p)
  w
}

When I run the function on the entire data set, I get the following answer:

> calc(df)
[1] 1.664474

Ideally, I want to return results that are grouped by tm, e.g.:

tm   w
1    result of calc
2    result of calc
3    result of calc

So far I have tried using aggregate with my function, but I get the following error:

> aggregate(df, by = list(tm), FUN = calc)
Error in data$d : $ operator is invalid for atomic vectors

I feel like I have stared at this for too long and that there is an obvious answer. Any advice would be appreciated.

Solution

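Using dplyr, group by tm and call calc inside do(), which hands each group's rows to calc as a data frame through the "." placeholder:

library(dplyr)
df %>%
  group_by(tm) %>%
  do(data.frame(val = calc(.)))
#   tm      val
# 1  1 1.665882
# 2  2 1.504545
# 3  3 1.838000
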
If we change the function slightly to take the columns as separate arguments, the same grouped calculation also works with summarise:

calc1 <- function(d1, t1, h1, p1) {
  (1.27 * sum(d1) + 1.62 * sum(t1) + 2.10 * sum(h1)) / sum(p1)
}
df %>%
  group_by(tm) %>%
  summarise(val = calc1(d, t, h, p))
#   tm      val
# 1  1 1.665882
# 2  2 1.504545
# 3  3 1.838000
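
As for the error in the question: aggregate() applies FUN to each column of each group separately, so calc receives a plain numeric vector rather than a data frame, and data$d fails on it. The check below is a small sketch of that behaviour, not part of the original answer; the data is made up, and probe, seen, out and msg are hypothetical names used only here.

```r
# Made-up data: a grouping column and two value columns
df <- data.frame(tm = c(1, 2, 1, 2), d = c(1, 0, 2, 1), t = c(0, 2, 1, 1))

# probe records the class of whatever aggregate() passes to FUN
seen <- character(0)
probe <- function(x) {
  seen <<- c(seen, class(x))
  sum(x)
}

out <- aggregate(df[c("d", "t")], by = list(tm = df$tm), FUN = probe)
unique(seen)  # FUN only ever sees atomic vectors, one column of one group

# Using $ on such a vector raises the error from the question
msg <- tryCatch(c(1, 2)$d, error = function(e) conditionMessage(e))
msg
```

This is why a function that needs several columns at once fits do() or summarise better than aggregate().
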


                        