Rowwise summation for Tibble datatype

Problem description

I have a Tibble, and I have noticed that a combination of dplyr::rowwise() and sum() doesn't work. I know there are many threads on this topic, and I have found 2 to 3 solutions, but I am not quite sure why the combination of rowwise() and sum() doesn't work.

So, my question is: why doesn't the combination of rowwise() and sum() work, and what can we do to make it work? I am a beginner, so I believe I am doing something wrong in the code below.

Data:

dput(data)
structure(list(Fiscal.Year = c(2016L, 2016L, 2016L, 2016L, 2016L, 
2016L, 2016L, 2016L, 2016L, 2016L), col1 = c(0, 26613797.764311, 
0, 12717073.587292, 0, 0, 0, 0, 0, 0), col2 = c(0, 0, 0, 0, 8969417.89721166, 
0, 11483606.8417117, 0, 0, 0), col3 = c(0, 0, 33251606.347943, 
0, 25082683.4492186, 0, 17337191.3014127, 0, 0, 0), col4 = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0), col5 = c(0, 0, 0, 0, 0, 0, 0, 0, 
0, 9796823.229998), col6 = c(35822181.695755, 17475066.870565, 
0, 0, 0, 0, 4040695.327278, 0, 13117249.623068, 0), col7 = c(0, 
0, 0, 0, 0, 18347258.910001, 0, 0, 7002205.087399, 0), No.Trans = c(2987L, 
1292L, 1002L, 796L, 691L, 677L, 400L, 388L, 381L, 366L)), .Names = c("Fiscal.Year", 
"col1", "col2", "col3", "col4", "col5", "col6", "col7", "No.Trans"
), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))

This code doesn't work:

data %>%  #No
        dplyr::rowwise() %>%
        dplyr::mutate(sum = sum(.[2:8]))


Just for reference, I have tried the following options, and they all work. I am specifically looking for a solution that uses rowwise() and sum().

Option 1: Discussed at: Summarise over all columns

  data %>%
    dplyr::rowwise() %>%
    do(data.frame(., res = sum(unlist(.)[2:8])))

Option 2:

  rowSums(data[,2:8])

Option 3: Discussed at: How to do rowwise summation over selected columns using column index with dplyr?

  data %>% mutate(sum=Reduce("+",.[2:8]))

Option 4:

data %>%
        select(2:8)%>%
        dplyr::mutate(sum=rowSums(.))

Solution

Those columns look suspiciously like observations....
If so, tidying that dataframe up would make the data wrangling significantly easier.

Does this get you the answers you are seeking?

data %>%
    gather(key = col, value = revenue, `col1`:`col7`) %>%
    group_by(Fiscal.Year, No.Trans) %>%
    summarise(res = sum(revenue))

Source: local data frame [10 x 3]
Groups: Fiscal.Year [?]

   Fiscal.Year No.Trans      res
         <int>    <int>    <dbl>
1         2016      366  9796823
2         2016      381 20119455
3         2016      388        0
4         2016      400 32861493
5         2016      677 18347259
6         2016      691 34052101
7         2016      796 12717074
8         2016     1002 33251606
9         2016     1292 44088865
10        2016     2987 35822182
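
As for the literal rowwise() and sum() question, here is a minimal sketch of what I believe is going on. Inside mutate(), the `.` comes from the %>% pipe and stands for the whole table that was piped in, not for the current row, so sum(.[2:8]) adds up every cell in those columns and repeats that grand total on each row. Naming the columns (or, assuming dplyr 1.0.0 or later, using c_across()) gives sum() the values of one row at a time. The `toy` tibble and its values are invented for illustration and are not the data from the question.

```r
library(dplyr)

# Hypothetical stand-in for the question's tibble (made-up values)
toy <- tibble(
  Fiscal.Year = c(2016L, 2016L, 2016L),
  col1 = c(0, 10, 0),
  col2 = c(5, 0, 0),
  col3 = c(1, 2, 3),
  No.Trans = c(30L, 20L, 10L)
)

# `.` is the whole piped-in table, so this sums all cells of columns 2:4
# and writes the same grand total on every row
whole_table <- toy %>%
  rowwise() %>%
  mutate(sum = sum(.[2:4])) %>%
  ungroup()

# Referring to the columns by name lets rowwise() evaluate sum() per row
by_name <- toy %>%
  rowwise() %>%
  mutate(sum = sum(col1, col2, col3)) %>%
  ungroup()

# Assumes dplyr >= 1.0.0: c_across() gathers the selected columns
# of the current row into one vector
by_range <- toy %>%
  rowwise() %>%
  mutate(sum = sum(c_across(col1:col3))) %>%
  ungroup()
```

On the real data, the same pattern would be sum(col1, col2, col3, col4, col5, col6, col7) or sum(c_across(col1:col7)). For wide tables, rowSums() (Option 2 in the question) is likely still the faster choice, since rowwise() works one row at a time.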

For a really smooth introduction to thinking tidily please try here. The functions he discusses in the presentation have been updated, but Hadley does a great job teaching the subject: through pedagogical chaining, as it were.

The updated functions can be found in his ggplot2 book here.
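
On the subject of updated functions: gather() itself has since been superseded in tidyr by pivot_longer(). A rough equivalent of the pipeline above might look like the sketch below, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later (for the `.groups` argument). The `toy` tibble is again made-up data, not the tibble from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the question's tibble (made-up values)
toy <- tibble(
  Fiscal.Year = c(2016L, 2016L, 2016L),
  col1 = c(0, 10, 0),
  col2 = c(5, 0, 0),
  col3 = c(1, 2, 3),
  No.Trans = c(30L, 20L, 10L)
)

tidy_res <- toy %>%
  # Stack col1..col3 into a name column ("col") and a value column ("revenue")
  pivot_longer(col1:col3, names_to = "col", values_to = "revenue") %>%
  # One group per original row, identified by year and transaction count
  group_by(Fiscal.Year, No.Trans) %>%
  summarise(res = sum(revenue), .groups = "drop")
```

Note that this only reproduces per-row totals when Fiscal.Year and No.Trans together identify each row uniquely, as they happen to do in the question's data.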
