R - dataframe - sum on group by columns

Problem description

I tried the code below, and it correctly sums amount grouped by the columns month and variable.

library(dplyr)

test_dplyr = read.csv("test_dplyr.csv", header=TRUE)
test_dplyr

test_dplyr %>%
  group_by(month, variable) %>%
  summarise(a_sum=sum(amount))

> test_dplyr = read.csv("test_dplyr.csv", header=TRUE)
> test_dplyr
      month variable amount
1 1/11/2018        x   1000
2 1/11/2018        x   3000
3 1/12/2018        y   5000
4 1/12/2018        y   3000
> 
> test_dplyr %>%
+   group_by(month, variable) %>%
+   summarise(a_sum=sum(amount))
# A tibble: 2 x 3
# Groups: month [?]
  month     variable a_sum
  <fctr>    <fctr>   <int>
1 1/11/2018 x         4000
2 1/12/2018 y         8000

However, when I tried to do the same with my trades_test data, I couldn't get the expected output the way I did with the working code above. I would appreciate it if anyone could advise what is going wrong.

Thanks.

trades_test = read.csv("trades_test.csv", header=TRUE)
trades_test

trades_test %>%
  group_by(Trade_date, Country_code) %>%
  summarise(a_sum=sum(Trade_value_local))

> trades_test = read.csv("trades_test.csv", header=TRUE)
> trades_test
     Sedol                   Description Trans_type Trade_date  Quantity Price_local CCY_local Trade_value_local Trade_type Country_code
1  B01NPJ1 TATA CONSULTANCY SERVICES LTD        BUY  11-Jan-18    38,164       40.88       INR         1,560,044    Buy New           IN
2  B012W42                 PUBLIC BK BHD        BUY  11-Jan-18   221,400        4.92       MYR         1,089,969   Buy More           MY
3  6288190            AU OPTRONICS CORP.        BUY  11-Jan-18 2,210,000        0.42       TWD           923,639    Buy New           TW
4  6491318            KINGBOARD CHEMICAL        BUY  11-Jan-18   138,500        5.54       HKD           767,200    Buy New           HK
5  6205122                   INFOSYS LTD        BUY  12-Jan-18    48,855       15.30       INR           747,548    Buy New           IN
6  6196152                 CITIC LIMITED       SELL  12-Jan-18   -81,000        1.41       HKD          -113,985  Sell Some           HK
7  6451055              HYUNDAI MOTOR CO       SELL  11-Jan-18      -786      147.42       KRW          -115,870   Sell All           KR
8  6868398              TELEKOM MALAYSIA       SELL  12-Jan-18   -83,100        1.47       MYR          -122,119  Sell Some           MY
9  6243586                      SATS LTD       SELL  11-Jan-18   -33,500        3.90       SGD          -130,632  Sell Some           SG
10 6253767               INDIAN OIL CORP       SELL  13-Jan-18   -21,571        6.06       INR          -130,824   Sell All           IN
> 
> trades_test %>%
+   group_by(Trade_date, Country_code) %>%
+   summarise(a_sum=sum(Trade_value_local))
Error in summarise_impl(.data, dots) : 
  Evaluation error: 'sum' not meaningful for factors.

Recommended answer

Type class(trades_test$Trade_value_local) and you'll see that the column is a factor, not a numeric, and summing a factor does not make sense. The values contain thousands separators (for example 1,560,044), so read.csv could not parse them as numbers. You will have to convert the column to a number first, by removing the commas and then parsing the result as numeric. You could do that as follows:

trades_test %>%
  mutate(Trade_value_local = as.numeric(gsub(',','',Trade_value_local))) %>%
  group_by(Trade_date, Country_code) %>%
  summarise(a_sum=sum(Trade_value_local))
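
If you want to try the conversion without the CSV file, here is a minimal self-contained sketch. The data frame trades_sample and its values are made up for illustration (they are not rows from trades_test); only the column names follow the question. It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical sample data: the amounts are stored as factors with
# thousands separators, mimicking what read.csv() produced in the question.
trades_sample <- data.frame(
  Trade_date        = factor(c("11-Jan-18", "11-Jan-18", "12-Jan-18", "12-Jan-18")),
  Country_code      = factor(c("IN", "IN", "HK", "HK")),
  Trade_value_local = factor(c("1,000", "2,500", "4,000", "-300"))
)

# Confirm the column type before summing.
class(trades_sample$Trade_value_local)

# Pitfall: as.numeric() applied directly to a factor returns the internal
# level codes, not the values shown, so strip the commas from the labels first.
result <- trades_sample %>%
  mutate(Trade_value_local = as.numeric(gsub(",", "", Trade_value_local))) %>%
  group_by(Trade_date, Country_code) %>%
  summarise(a_sum = sum(Trade_value_local))

result
```

With this sample, the 11-Jan-18/IN group sums to 3500 and the 12-Jan-18/HK group to 3700. Note that from R 4.0.0 onward read.csv no longer converts strings to factors by default, so the column would arrive as character instead; sum() still fails on it, and the same gsub plus as.numeric conversion applies.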
