Summing in R with multiple conditions


Question

I'm trying to sum columns 4 (child), 5 (adult) and 6 (elderly) and return the totals for each country by year, disregarding column 3 (sex). After reading through various forums, I still can't work out how to combine these:

      country year    sex child adult elderly
1 Afghanistan 1995   male    -1    -1      -1
2 Afghanistan 1996 female    -1    -1      -1
3 Afghanistan 1996   male    -1    -1      -1
4 Afghanistan 1997 female     5    96       1
5 Afghanistan 1997   male     0    26       0
6 Afghanistan 1998 female    45  1142      20

I was able to sum the three columns row by row and store the result in a separate column with the following, but I still need to combine the male and female rows for each country and year:

tuberculosiscases <-tuberculosis$child + tuberculosis$adult + tuberculosis$elderly
names(tuberculosiscases) <- c("tuberculosiscases")
tuberculosis <- data.frame(tuberculosis,tuberculosiscases)
head(tuberculosis)

      country year    sex child adult elderly tuberculosiscases
1 Afghanistan 1995   male    -1    -1      -1                -3
2 Afghanistan 1996 female    -1    -1      -1                -3
3 Afghanistan 1996   male    -1    -1      -1                -3
4 Afghanistan 1997 female     5    96       1               102
5 Afghanistan 1997   male     0    26       0                26
6 Afghanistan 1998 female    45  1142      20              1207


Recommended answer

If you want to add the sum to your data frame, you have several options:

# with base R (1): ave() returns the group sum for every row
dat <- transform(dat, tuber.sum = ave(tuberculosiscases, country, year, FUN = sum))

# with base R (2)
dat$tuber.sum <- ave(dat$tuberculosiscases, dat$country, dat$year, FUN = sum)

# with the data.table package
library(data.table)
setDT(dat)[, tuber.sum := sum(tuberculosiscases), by = .(country, year)]

# with the plyr package
library(plyr)
dat <- ddply(dat, .(country, year), transform, tuber.sum=sum(tuberculosiscases))

# with the dplyr package
library(dplyr)
dat <- dat %>%
  group_by(country, year) %>%
  mutate(tuber.sum = sum(tuberculosiscases)) %>%
  ungroup()  # drop the grouping so later operations act on the whole table

All of these give the same result (printed here in data.table format):

> dat
       country year    sex child adult elderly tuberculosiscases tuber.sum
1: Afghanistan 1995   male    -1    -1      -1                -3        -3
2: Afghanistan 1996 female    -1    -1      -1                -3        -6
3: Afghanistan 1996   male    -1    -1      -1                -3        -6
4: Afghanistan 1997 female     5    96       1               102       128
5: Afghanistan 1997   male     0    26       0                26       128
6: Afghanistan 1998 female    45  1142      20              1207      1207
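
The options above keep one row per sex and repeat the group total on each row. If a single row per country and year is wanted instead, as the question seems to ask, one possible approach is base R's aggregate(). This is a minimal sketch rather than part of the original answer: it assumes the data frame is called dat, as in the answer, rebuilds only the six sample rows shown in the question, and the name totals is made up for illustration.

```r
# Sample rows copied from the question; `dat` is the name used in the answer.
dat <- data.frame(
  country = rep("Afghanistan", 6),
  year    = c(1995, 1996, 1996, 1997, 1997, 1998),
  sex     = c("male", "female", "male", "female", "male", "female"),
  child   = c(-1, -1, -1, 5, 0, 45),
  adult   = c(-1, -1, -1, 96, 26, 1142),
  elderly = c(-1, -1, -1, 1, 0, 20)
)

# Row-wise total of the three age-group columns.
dat$tuberculosiscases <- rowSums(dat[, c("child", "adult", "elderly")])

# Collapse the male and female rows: one total per country and year.
# `totals` is a hypothetical name chosen for this example.
totals <- aggregate(tuberculosiscases ~ country + year, data = dat, FUN = sum)
totals
```

For the sample rows this returns four rows (1995 to 1998) whose totals match the tuber.sum column above: -3, -6, 128 and 1207.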
