Explain ungroup() in dplyr

Problem description

If I'm working with a dataset and want to group the data (e.g. by country), compute a summary statistic (mean()), and then ungroup() the data.frame so that it keeps its original dimensions (country-year) and gains a new column holding each country's mean (repeated over the n years), how would I do that with dplyr? The ungroup() function doesn't return a data.frame with the original dimensions:

library(dplyr)
library(gapminder)

gapminder %>%
    group_by(country) %>%
    summarize(mn = mean(pop)) %>%
    ungroup() # returns one row per country: nrow == length(unique(gapminder$country))


Recommended answer

ungroup() is useful if you want to do something like

gapminder %>%
    group_by(country) %>%
    mutate(mn = pop / mean(pop)) %>%
    ungroup()

where you want to do some sort of transformation that uses an entire group's statistics. In the above example, mn is the ratio of each row's population to its country's average population. Once the data is ungrouped, any further mutate() calls compute aggregate statistics over all rows rather than within each group.
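
To make the effect of ungroup() on later steps concrete, here is a minimal sketch. It assumes a small made-up tibble named toy (not the gapminder data), and the column name overall is only illustrative:

```r
library(dplyr)

# Hypothetical toy data, not part of the original question.
toy <- tibble(
  country = c("A", "A", "B", "B"),
  pop     = c(10, 30, 100, 300)
)

grouped <- toy %>% group_by(country)

# Still grouped: mean(pop) is computed within each country.
still_grouped <- grouped %>%
  mutate(overall = mean(pop))

# Ungrouped first: mean(pop) is computed over all four rows.
ungrouped <- grouped %>%
  ungroup() %>%
  mutate(overall = mean(pop))

still_grouped$overall  # 20 20 200 200
ungrouped$overall      # 110 110 110 110
```

The only difference between the two pipelines is the ungroup() call, which is why it is usually placed at the end of a grouped step.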

summarize() collapses each group to a single row, and ungroup() cannot bring the original rows back: it only removes the grouping, it does not change the number of rows. Perhaps you wanted to do

gapminder %>%
    group_by(country) %>%
    mutate(mn = mean(pop)) %>%
    ungroup()

This creates mn as the mean for each group, repeated on every row within that group, so the result keeps the original country-year dimensions.
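
The difference in dimensions between the two approaches can be checked directly. The following is a rough sketch on a made-up tibble (toy, collapsed and kept are hypothetical names, and the values are invented for illustration):

```r
library(dplyr)

# Hypothetical toy data standing in for gapminder (country-year rows).
toy <- tibble(
  country = c("A", "A", "B", "B", "B"),
  year    = c(2000, 2005, 2000, 2005, 2010),
  pop     = c(10, 20, 100, 200, 300)
)

# summarize() collapses each group to one row.
collapsed <- toy %>%
  group_by(country) %>%
  summarize(mn = mean(pop))

# mutate() keeps every row and repeats the group mean within each group.
kept <- toy %>%
  group_by(country) %>%
  mutate(mn = mean(pop)) %>%
  ungroup()

nrow(collapsed)  # 2, one row per country
nrow(kept)       # 5, same as nrow(toy)
kept$mn          # 15 15 200 200 200
```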
