Summarize with dplyr "other than" groups

Views: 118 · Published: 2017/7/13 22:04:33 · Tags: r, dplyr, group-summaries

Question

I need to summarize, in a grouped data frame, both something on each group (the simple part) and the same something on the "other" groups. (Note: a solution with dplyr would be very much appreciated, but it isn't mandatory.)

Minimal example

if (!require(pacman)) install.packages("pacman")  # package name must be quoted
pacman::p_load(dplyr)

df <- tibble(  # tibble() replaces the deprecated data_frame()
    group = c('a', 'a', 'b', 'b', 'c', 'c'),
    value = c(1, 2, 3, 4, 5, 6)
)

res <- df %>%
    group_by(group) %>%
    summarize(
        median        = median(value)
#        median_other  = ... ??? ... # I need the median of all "other"
                                     # groups
#        median_before = ... ??? ... # I need the median of the groups
                                     # "before" this one (e.g. in
                                     # alphabetic order, but clearly any
                                     # rule that is a "selection function"
                                     # depending on the actual group is fine)
    )

My expected result is the following:

group    median    median_other    median_before
  a        1.5         4.5               NA
  b        3.5         3.5               1.5
  c        5.5         2.5               2.5

I've searched Google for strings such as "dplyr summarize excluding groups" and "dplyr summarize other then group", and I've searched the dplyr documentation, but I wasn't able to find a solution.

The question "How to summarize value not matching the group using dplyr" (https://stackoverflow.com/questions/34327780/how-to-summarize-value-not-matching-the-group-using-dplyr) does not apply here because its solution works only for sum, i.e. it is "function-specific" (and relies on a simple arithmetic function that does not take the variability within each group into account). What about more complex functions (e.g. mean, sd, or a user-defined function)? :-)

Thanks to all.

PS: summarize() is just an example; the same question applies to mutate() and other dplyr functions that work on groups.

Recommended answer

Here is my solution:

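res <- df %>%
  group_by(group) %>%
  summarise(med_group = median(value),
            # df$... is the full, ungrouped data; `group` holds the
            # label of the group currently being summarised
            med_other = median(df$value[df$group != group])) %>%
  # median of the previous group (rows come out ordered by group)
  mutate(med_before = lag(med_group))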

> res
Source: local data frame [3 x 4]

      group med_group med_other med_before
  (chr)     (dbl)     (dbl)      (dbl)
1     a       1.5       4.5         NA
2     b       3.5       3.5        1.5
3     c       5.5       2.5        3.5

I was trying to come up with an all-dplyr solution, but base R subsetting works just fine: median(df$value[df$group != group]) returns the median of all observations that are not in the current group.
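The same subsetting idea is not tied to median. As a minimal sketch, assuming dplyr 1.0 or later and the example data from the question, the snippet below wraps it in a small helper so that any summary function (mean, sd, or a user-defined one) can be applied to the "other" groups. The helper name other_groups_stat is hypothetical and is not part of dplyr; group[1] is used instead of group so that a single label is compared against df$group.

```r
library(dplyr)

df <- tibble(
  group = c("a", "a", "b", "b", "c", "c"),
  value = c(1, 2, 3, 4, 5, 6)
)

# Hypothetical helper (not a dplyr function): apply `fn` to the values of
# all rows whose group label differs from `current`.
other_groups_stat <- function(values, groups, current, fn) {
  fn(values[groups != current])
}

res_other <- df %>%
  group_by(group) %>%
  summarise(
    mean_group = mean(value),
    # group[1] is the single label of the group being summarised
    mean_other = other_groups_stat(df$value, df$group, group[1], mean),
    sd_other   = other_groups_stat(df$value, df$group, group[1], sd)
  )

# The same helper inside mutate(): the one-value result is repeated
# for every row of the current group.
df_other <- df %>%
  group_by(group) %>%
  mutate(median_other = other_groups_stat(df$value, df$group, group[1], median)) %>%
  ungroup()
```

Here mean_other comes out as 4.5, 3.5 and 2.5 for groups a, b and c. Because the helper reaches back into the ungrouped df, it should be called on the same data frame that was grouped; it will not see filters applied earlier in the pipe.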

I hope this helps you solve your problem.
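One point worth noting about the result above: lag(med_group) gives the median of the immediately preceding group, which is why group c shows 3.5, while the expected table in the question lists 2.5 for c (the median over all values in groups a and b). If the latter is what is wanted, a possible sketch, again assuming dplyr 1.0 or later and the question's example data, is to select every row whose label sorts before the current one:

```r
library(dplyr)

df <- tibble(
  group = c("a", "a", "b", "b", "c", "c"),
  value = c(1, 2, 3, 4, 5, 6)
)

res_before <- df %>%
  group_by(group) %>%
  summarise(
    median = median(value),
    # All rows whose label sorts strictly before the current label.
    # median() of an empty vector is NA, which covers the first group.
    median_before = median(df$value[df$group < group[1]])
  )
```

This yields NA, 1.5 and 2.5 for groups a, b and c. The comparison df$group < group[1] is only one possible "selection function"; any logical condition on df$group that depends on the current label could be substituted.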
