R data.table: divide values in a column based on another column
Question
I have a main data.table with 364 rows and 3 columns:
Date Weekday Weight
2012-01-01 Monday 100
2013-01-02 Tuesday 200
...
and a helper data.table with 7 rows and 2 columns:
Weekday Coefficient
Monday 0.91
Tuesday 0.84
Wednesday 0.99
...
Now I would like to create a 4th column in the main data.table that combines each row's Weight with the Coefficient for its Weekday. This is what I tried:
Weight_divided <- main[, Weight * help[Weekday==main$Weekday]$Coefficient]
The result looks like this:
Date Weekday Weight Weight_divided
2012-01-01 Monday 100 91
2013-01-02 Tuesday 200 168
2012-01-03 Wednesday 300 297
2012-01-04 Thursday 400 256
2012-01-05 Friday 500 399
2012-01-06 Saturday 600 410
2012-01-07 Sunday 700 680
2012-01-08 Monday 300 NA <--
2012-01-09 Tuesday 600 NA <--
...
I guess the issue is that the two data.tables have different lengths. Is there a way to write the operation on the main data.table so that it works with the shorter data.table?
Recommended answer
Using a data.table join on Weekday:
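library(data.table)
# Key main by Weekday, join it with help, and add the product by reference.
# setkey() reorders main by Weekday, so sort by Date for display.
setkey(main, Weekday)[help, Weight_Coef := Weight*Coefficient][order(Date)]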
# Weekday Date Weight Weight_Coef
# 1: Monday 2012-01-01 59 53.69
# 2: Tuesday 2012-01-02 45 37.80
# 3: Wednesday 2012-01-03 141 139.59
# 4: Thursday 2012-01-04 104 97.76
# 5: Friday 2012-01-05 133 109.06
#---
#360: Wednesday 2012-12-25 192 190.08
#361: Thursday 2012-12-26 79 74.26
#362: Friday 2012-12-27 39 31.98
#363: Saturday 2012-12-28 175 148.75
#364: Sunday 2012-12-29 134 116.58
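The same update join can probably also be written without setting a key, using the `on=` argument that data.table has offered since version 1.9.6. The following is only a minimal sketch on small made-up tables: the names `main_small` and `help_small` and their values are illustrative assumptions, not the asker's data.

```r
library(data.table)

# Small illustrative tables (hypothetical values, not the original data)
main_small <- data.table(
  Date    = as.Date("2012-01-01") + 0:8,
  Weekday = rep(c("Monday", "Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"), length.out = 9),
  Weight  = c(100, 200, 300, 400, 500, 600, 700, 300, 600)
)
help_small <- data.table(
  Weekday     = c("Monday", "Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"),
  Coefficient = c(0.91, 0.84, 0.99, 0.94, 0.82, 0.85, 0.87)
)

# Update join: every row of main_small whose Weekday matches a row of
# help_small gets Weight times that row's Coefficient, added by reference.
# The i. prefix marks a column taken from help_small.
main_small[help_small, on = "Weekday", Weight_Coef := Weight * i.Coefficient]

print(main_small)
```

Because no key is set in this variant, `main_small` keeps its original row order, so the trailing `[order(Date)]` should not be needed.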
Data
set.seed(24)
main <- data.table(Weekday=rep(c('Monday', 'Tuesday', 'Wednesday',
'Thursday', 'Friday', 'Saturday', 'Sunday'), length.out=364),
Date=seq(as.Date('2012-01-01'), length.out=364, by='day'),
Weight=sample(200, 364, replace=TRUE))
help <- data.table(Weekday=c('Monday', 'Tuesday', 'Wednesday',
'Thursday', 'Friday', 'Saturday', 'Sunday'), Coefficient=c(0.91, 0.84,
0.99, 0.94, 0.82, 0.85, 0.87))
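Since the helper table is effectively a lookup from weekday name to coefficient, the join result can be cross-checked with base R's `match()`, which expands the 7 coefficients to the full length of the main table. A rough sketch on miniature tables follows; `main_chk`, `help_chk` and `idx` are hypothetical names introduced only for this illustration.

```r
library(data.table)

# Hypothetical miniature versions of the two tables
main_chk <- data.table(
  Weekday = rep(c("Monday", "Tuesday", "Wednesday"), length.out = 6),
  Weight  = c(10, 20, 30, 40, 50, 60)
)
help_chk <- data.table(
  Weekday     = c("Monday", "Tuesday", "Wednesday"),
  Coefficient = c(0.91, 0.84, 0.99)
)

# match() gives, for each row of main_chk, the position of its weekday
# in help_chk, so indexing Coefficient by idx yields one value per row
idx <- match(main_chk$Weekday, help_chk$Weekday)
main_chk[, Weight_Coef := Weight * help_chk$Coefficient[idx]]

print(main_chk)
```

A weekday that is missing from the helper table would produce `NA` in `idx` and therefore `NA` in the new column, which makes gaps in the lookup table easy to spot.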