R data.table: divide values in a column based on another column


Question

I have a main data.table with 364 rows and 3 columns:

Date        Weekday     Weight
2012-01-01  Monday      100
2013-01-02  Tuesday     200
...

and a helper data.table, help, with 7 rows and 2 columns:

Weekday   Coefficient
Monday    0.91
Tuesday   0.84
Wednesday 0.99
...

Now I would like to create a 4th column in the main data.table with the "Weight/Coefficient" value, looked up by Weekday. My attempt multiplies Weight by the Coefficient of the matching Weekday:

Weight_divided <- main[, Weight * help[Weekday==main$Weekday]$Coefficient]

The result looks like this:

Date        Weekday     Weight   Weight_divided
2012-01-01  Monday      100      91
2013-01-02  Tuesday     200      168
2012-01-03  Wednesday   300      297
2012-01-04  Thursday    400      256
2012-01-05  Friday      500      399
2012-01-06  Saturday    600      410
2012-01-07  Sunday      700      680
2012-01-08  Monday      300      NA     <--
2012-01-09  Tuesday     600      NA     <--
...

I guess the issue is that the two data.tables have different lengths. Is there a way to make this operation on the main data.table work with the shorter data.table?

Recommended answer

Using data.table:

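library(data.table)
# Key main by Weekday, join it to help on that key, and add Weight * Coefficient
# by reference; [order(Date)] returns the rows in chronological order.
setkey(main, Weekday)[help, Weight_Coef := Weight*Coefficient][order(Date)]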
  #      Weekday       Date Weight Weight_Coef
  # 1:    Monday 2012-01-01     59       53.69
  # 2:   Tuesday 2012-01-02     45       37.80
  # 3: Wednesday 2012-01-03    141      139.59
  # 4:  Thursday 2012-01-04    104       97.76
  # 5:    Friday 2012-01-05    133      109.06
  #---                                        
  #360: Wednesday 2012-12-25    192      190.08
  #361:  Thursday 2012-12-26     79       74.26
  #362:    Friday 2012-12-27     39       31.98
  #363:  Saturday 2012-12-28    175      148.75
  #364:    Sunday 2012-12-29    134      116.58
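
The keyed join above can also be written without setting a key, by naming the join column in the `on` argument. The following is only a minimal sketch on made-up miniature tables; `main_small`, `help_small` and their values are hypothetical and are not the asker's data.

```r
library(data.table)

# Hypothetical miniature versions of the two tables (illustration only)
main_small <- data.table(
  Date    = as.Date("2012-01-01") + 0:3,
  Weekday = c("Monday", "Tuesday", "Monday", "Tuesday"),
  Weight  = c(100, 200, 300, 600)
)
help_small <- data.table(
  Weekday     = c("Monday", "Tuesday"),
  Coefficient = c(0.91, 0.84)
)

# Update join: match rows of main_small to help_small on Weekday and add the
# product by reference. The i. prefix refers to the column from help_small.
main_small[help_small, on = "Weekday", Weight_Coef := Weight * i.Coefficient]

print(main_small)
```

Because no key is set here, `main_small` keeps its original row order, so the trailing `[order(Date)]` step is not needed in this variant.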

Data

set.seed(24)
main <- data.table(Weekday=rep(c('Monday', 'Tuesday', 'Wednesday',
'Thursday', 'Friday', 'Saturday', 'Sunday'), length.out=364),
Date=seq(as.Date('2012-01-01'), length.out=364, by='day'), 
Weight=sample(200, 364, replace=TRUE))

help <- data.table(Weekday=c('Monday', 'Tuesday', 'Wednesday',
'Thursday', 'Friday', 'Saturday', 'Sunday'), Coefficient=c(0.91, 0.84, 
 0.99, 0.94, 0.82, 0.85, 0.87))
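
The length mismatch described in the question can also be handled without a join, by using base R's `match()` to find, for every row of the long table, the position of its Weekday in the short table. This is a sketch under stated assumptions, not the answer's method; `main_demo`, `help_demo` and their values are hypothetical.

```r
library(data.table)

# Hypothetical small tables (illustration only)
main_demo <- data.table(
  Weekday = c("Monday", "Tuesday", "Wednesday", "Monday"),
  Weight  = c(100, 200, 300, 300)
)
help_demo <- data.table(
  Weekday     = c("Monday", "Tuesday", "Wednesday"),
  Coefficient = c(0.91, 0.84, 0.99)
)

# For each row of main_demo, the row index in help_demo with the same Weekday
idx <- match(main_demo$Weekday, help_demo$Weekday)

# Index the coefficients by idx so the vector has one value per row of
# main_demo, then multiply and add the column by reference
main_demo[, Weight_Coef := Weight * help_demo$Coefficient[idx]]

print(main_demo)
```

A Weekday that is missing from the lookup table yields `NA` in `idx`, and therefore `NA` in the new column, which makes unmatched rows easy to spot.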
