Rowwise summation for Tibble datatype
Question
I have a tibble, and I have noticed that a combination of dplyr::rowwise() and sum() doesn't work. I know there are many threads on this topic, and I have 2 to 3 solutions, but I am not quite sure why the combination of rowwise() and sum() doesn't work.

So, my question is: why doesn't a combination of rowwise() and sum() work, and what can we do to make it work? I am a beginner, so I believe that I am doing something wrong in the code below.

Data:
dput(data)
structure(list(
  Fiscal.Year = c(2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L),
  col1 = c(0, 26613797.764311, 0, 12717073.587292, 0, 0, 0, 0, 0, 0),
  col2 = c(0, 0, 0, 0, 8969417.89721166, 0, 11483606.8417117, 0, 0, 0),
  col3 = c(0, 0, 33251606.347943, 0, 25082683.4492186, 0, 17337191.3014127, 0, 0, 0),
  col4 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  col5 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 9796823.229998),
  col6 = c(35822181.695755, 17475066.870565, 0, 0, 0, 0, 4040695.327278, 0, 13117249.623068, 0),
  col7 = c(0, 0, 0, 0, 0, 18347258.910001, 0, 0, 7002205.087399, 0),
  No.Trans = c(2987L, 1292L, 1002L, 796L, 691L, 677L, 400L, 388L, 381L, 366L)),
  .Names = c("Fiscal.Year", "col1", "col2", "col3", "col4", "col5", "col6", "col7", "No.Trans"),
  row.names = c(NA, -10L),
  class = c("tbl_df", "tbl", "data.frame"))
This code doesn't work:
data %>% # No
  dplyr::rowwise() %>%
  dplyr::mutate(sum = sum(.[2:8]))
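A likely reason, offered as a sketch and not a definitive diagnosis: inside a magrittr pipe, `.` stands for the whole data frame on the left-hand side, not for the current row, so `sum(.[2:8])` adds up every value in those columns. The tibble `df` and its columns below are hypothetical stand-ins for `data`, and the example omits `rowwise()` to isolate what the placeholder does.

```r
library(dplyr)

# Hypothetical three-row tibble standing in for `data`
df <- tibble(id = 1:3, a = c(1, 2, 3), b = c(10, 20, 30))

# `.` is the entire tibble, so sum(.[2:3]) is the grand total of columns a and b.
# mutate() then recycles that single number to every row.
out <- df %>% mutate(total = sum(.[2:3]))

out$total
# 66 66 66
```

Every row receives the same grand total instead of its own row sum.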
Just for reference, I have tried the following set of code, and they work. I am specifically looking for a solution that uses rowwise() and sum().

Option 1: Discussed at: Summarise over all columns
data %>%
  dplyr::rowwise() %>%
  do(data.frame(., res = sum(unlist(.)[2:8])))

Option 2:
rowSums(data[, 2:8])

Option 3: Discussed at: How to do rowwise summation over selected columns using column index with dplyr?
data %>%
  mutate(sum = Reduce("+", .[2:8]))

Option 4:
data %>%
  select(2:8) %>%
  dplyr::mutate(sum = rowSums(.))
Solution
Those columns look suspiciously like observations...

If so, tidying that data frame up would make the data wrangling significantly easier. Does this get you the answers you are seeking?

library(dplyr)
library(tidyr)

data %>%
  gather(key = col, value = revenue, `col1`:`col7`) %>%
  group_by(Fiscal.Year, No.Trans) %>%
  summarise(res = sum(revenue))

Source: local data frame [10 x 3]
Groups: Fiscal.Year [?]

   Fiscal.Year No.Trans      res
         <int>    <int>    <dbl>
1         2016      366  9796823
2         2016      381 20119455
3         2016      388        0
4         2016      400 32861493
5         2016      677 18347259
6         2016      691 34052101
7         2016      796 12717074
8         2016     1002 33251606
9         2016     1292 44088865
10        2016     2987 35822182
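In current tidyr, `gather()` is superseded by `pivot_longer()`. A minimal sketch of the same reshape-then-summarise idea, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later (for the `.groups` argument); the two-row tibble `df` is a hypothetical stand-in for `data`:

```r
library(dplyr)
library(tidyr)

# Hypothetical two-row tibble with the same shape as `data`, only fewer columns
df <- tibble(
  Fiscal.Year = c(2016L, 2016L),
  col1        = c(0, 5),
  col2        = c(1.5, 0),
  No.Trans    = c(10L, 20L)
)

res <- df %>%
  # Stack the revenue columns into one long `revenue` column
  pivot_longer(col1:col2, names_to = "col", values_to = "revenue") %>%
  # One group per original row, identified by year and transaction count
  group_by(Fiscal.Year, No.Trans) %>%
  summarise(res = sum(revenue), .groups = "drop")
```

Note that this only yields one result per original row if the grouping columns identify each row uniquely, as `No.Trans` does in the data above.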
For a really smooth introduction to thinking tidily please try here. The functions he discusses in the presentation have been updated, but Hadley does a great job teaching the subject: through pedagogical chaining, as it were.
The updated functions can be found in his ggplot2 book here.
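For the rowwise() and sum() combination the question asks about, here is a hedged sketch of two ways to make it work. The first assumes dplyr 1.0.0 or later, which provides `c_across()`; the second names the columns explicitly and should also work in older versions. The tibble `df` and its columns are hypothetical stand-ins for `data`.

```r
library(dplyr)

# Hypothetical three-row tibble standing in for `data`
df <- tibble(id = 1:3, a = c(1, 2, 3), b = c(10, 20, 30))

# Variant 1: c_across() collects the selected columns of the current row
res <- df %>%
  rowwise() %>%
  mutate(total = sum(c_across(a:b))) %>%
  ungroup()

# Variant 2: refer to the columns by name; under rowwise() each name is one value
res2 <- df %>%
  rowwise() %>%
  mutate(total = sum(a, b)) %>%
  ungroup()

res$total
# 11 22 33
```

For the original data, the selection would presumably be `c_across(col1:col7)`. On larger data frames, `rowSums()` as in Option 2 is likely to be faster, since `rowwise()` evaluates the expression once per row.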