Rowsums conditional on column name in a loop

Question

This is a follow-up question to: Rowsums conditional on column name

My data frame is called wiod and looks like this:

VAR1 VAR2 AUS1 ... AUS56 BEL1 ... BEL56 NLD1 ... NLD56
A    D    23   ... 99    0    ... 444   123  ... 675
B    D    55   ... 6456  0    ... 557   567  ... 4345

I'd like to calculate the row sums for each of the variable groups AUS, BEL and NLD, and then drop the original columns. Like this:

wiod <- wiod %>% 
  mutate(AUS = rowSums(.[grep("AUS", names(.))])) %>% 
  mutate(BEL = rowSums(.[grep("BEL", names(.))])) %>% 
  mutate(NLD = rowSums(.[grep("NLD", names(.))])) %>% 
  select(VAR1, VAR2, AUS, BEL, NLD)

Of course, there are many more variable groups than these three (43, to be precise). Is there a convenient way to do this without writing 43 mutate commands?

Recommended answer

It is easier to convert from wide to long format (gather), then summarise, and, if needed, convert back to wide format (spread):

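library(dplyr)
library(tidyr)

# dataframe from @989 http://stackoverflow.com/a/43519062
df1 %>% 
  # wide to long: every column except VAR1 and VAR2 becomes a key/value pair
  gather(key = myKey, value = myValue, -c(VAR1, VAR2)) %>% 
  # strip the digits so that e.g. "AUS56" becomes the group "AUS"
  mutate(myGroup = gsub("\\d", "", myKey)) %>% 
  group_by(VAR1, VAR2, myGroup) %>% 
  summarise(mySum = sum(myValue)) %>% 
  # long back to wide: one column per group
  spread(key = myGroup, value = mySum)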

# Source: local data frame [2 x 5]
# Groups: VAR1, VAR2 [2]
# 
#     VAR1   VAR2   AUS   BEL   NLD
# * <fctr> <fctr> <int> <int> <int>
# 1      A      D   122   444   798
# 2      B      D  6511   557  4912
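If reshaping is not wanted, the same grouping idea can also be sketched in base R by looping over the column prefixes. This is only an illustration under assumptions: the sample data frame below is made up to mirror the layout in the question (two numbered columns per group instead of 56), and the names id_cols, value_cols, prefixes, sums and result are hypothetical, not part of the original answer.

```r
# Hypothetical sample data mirroring the layout in the question:
# two id columns, then numbered columns for each country prefix.
wiod <- data.frame(
  VAR1 = c("A", "B"),
  VAR2 = c("D", "D"),
  AUS1 = c(23, 55),   AUS2 = c(99, 6456),
  BEL1 = c(0, 0),     BEL2 = c(444, 557),
  NLD1 = c(123, 567), NLD2 = c(675, 4345)
)

id_cols    <- c("VAR1", "VAR2")
value_cols <- setdiff(names(wiod), id_cols)

# Strip the trailing digits to get each column's group, e.g. "AUS56" -> "AUS".
prefixes <- sub("\\d+$", "", value_cols)

# Split the column names by group, then sum each group's columns row by row.
sums <- lapply(split(value_cols, prefixes), function(cols) rowSums(wiod[cols]))

# Keep the id columns and append one summed column per group.
result <- cbind(wiod[id_cols], as.data.frame(sums))
result
```

Because the groups are derived from the column names themselves, nothing needs to be written out per country. The sketch assumes every non-id column is numeric and that group names contain no trailing digits of their own.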
