How to sum the values of a numeric variable based on a string variable

Problem description

Consider the following data frame:

df <- data.frame(numeric=c(1,2,3,4,5,6,7,8,9,10), string=c("a", "a", "b", "b", "c", "d", "d", "e", "d", "f"))
print(df)
   numeric string
1        1      a
2        2      a
3        3      b
4        4      b
5        5      c
6        6      d
7        7      d
8        8      e
9        9      d
10      10      f

It has a numeric variable and a string variable. I would like to create another data frame in which the string variable lists only the unique values "a", "b", "c", "d", "e", "f", and the numeric variable holds the sum of the numeric values for each of those strings in the original data frame, like this:

print(new_df)
   numeric string
1        3      a
2        7      b
3        5      c
4       22      d
5        8      e
6       10      f

This can be done with a for loop, but that would be rather inefficient on large datasets, so I would prefer another option. I tried the dplyr package, but instead of one sum per group I got a single total:

library(dplyr)
df %>% group_by(string) %>% summarize(result = sum(numeric))
  result
1     55

Recommended answer

This is probably a function-masking issue with plyr, which also exports summarise and mutate. If plyr is loaded after dplyr, a call to summarise resolves to plyr::summarise, which ignores the grouping and returns the sum over the whole data frame (55 here). Calling the dplyr version explicitly avoids this:

library(dplyr)
df %>% 
    group_by(string) %>%
    dplyr::summarise(numeric = sum(numeric))
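
If masking between packages is a concern, the same grouped sum can also be computed without either package. The following is a minimal sketch using base R's aggregate(), assuming the df from the question; it is an alternative to the dplyr call above, not part of the original answer.

```r
# Same data as in the question
df <- data.frame(
  numeric = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  string  = c("a", "a", "b", "b", "c", "d", "d", "e", "d", "f")
)

# Sum `numeric` within each distinct value of `string` (base R only)
new_df <- aggregate(numeric ~ string, data = df, FUN = sum)

# aggregate() puts the grouping column first; reorder the columns
# to match the layout shown in the question
new_df <- new_df[, c("numeric", "string")]
print(new_df)
```

Because aggregate() lives in the stats package, which is attached by default, it is not affected by the load order of plyr and dplyr.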
