Complex algorithm in R with data.tables using previous rows' values


Problem description

My data is a data.table, shown below:

structure(list(atp = c(1, 0, 1, 0, 0, 1), len = c(2, NA, 3, NA, 
NA, 1), inv = c(593, 823, 668, 640, 593, 745), GU = c(36, 94, 
57, 105, 48, 67), RUTL = c(100, NA, 173, NA, NA, 7)), .Names = c("atp", 
"len", "inv", "GU", "RUTL"), row.names = c(NA, -6L), class = c("data.table", 
"data.frame"), .internal.selfref = <pointer: 0x0000000000320788>)

I need to add four new columns: csi_begin, csi_end, IRQ and csi_order. When atp == 1, the values of csi_begin and csi_end depend directly on the inv and GU values.

When atp is not equal to 1, csi_begin and csi_end depend on the inv and GU values and on the IRQ value of the previous row. The value of IRQ depends on the csi_order of the same row if atp == 1, otherwise it is 0. The value of csi_order depends on the csi_begin value from two rows earlier.

I have written these conditions with the help of a for loop. The code is given below:

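lostsales<-function(transit)
{
  # Create the result columns up front so they can be filled row by row
  transit$csi_begin<-NA_real_
  transit$csi_end<-NA_real_
  transit$csi_order<-NA_real_
  transit$IRQ<-NA_real_

  for (i in seq_len(nrow(transit)))
  {
    if (transit$atp[i]==1)
    {
      transit$csi_begin[i]<-transit$inv[i]
    }
    else
    {
      # IRQ of the previous row; assumed to be 0 when there is no previous row
      prev_irq<-if (i>1) transit$IRQ[i-1] else 0
      transit$csi_begin[i]<-transit$inv[i]+prev_irq
    }
    transit$csi_end[i]<-transit$csi_begin[i]-transit$GU[i]

    # A comparison with NA is itself NA, so test with is.na() instead
    if (i>2 && !is.na(transit$csi_begin[i-2]))
    {
      transit$csi_order[i]<-transit$csi_begin[i-2]
    }
    else
    {
      transit$csi_order[i]<-0
    }

    if (transit$atp[i]==1)
    {
      transit$IRQ[i]<-transit$csi_order[i]-transit$RUTL[i]
    }
    else
    {
      transit$IRQ[i]<-0
    }
  }
  transit
}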

Can anyone help me do this kind of looping efficiently with data.tables, for example using setkey? My data set is very large, so a plain for loop would take far too long.

Recommended answer

Adding the desired outcome to your example would be very helpful, as I'm having trouble following the if/then logic. But I took a stab at it anyway:

library(data.table)

# Example data:
dt <- data.table(
  atp  = c(1, 0, 1, 0, 0, 1),
  len  = c(2, NA, 3, NA, NA, 1),
  inv  = c(593, 823, 668, 640, 593, 745),
  GU   = c(36, 94, 57, 105, 48, 67),
  RUTL = c(100, NA, 173, NA, NA, 7)
)

# Add a row number:
dt[,rn:=.I]

# Use this function to get the value from a previous (shiftLen is negative) or future (shiftLen is positive) row:
rowShift <- function(x, shiftLen = 1L) {
  r <- (1L + shiftLen):(length(x) + shiftLen)
  r[r<1] <- NA
  return(x[r])
}

# My attempt to follow the seemingly circular if/then rules:
lostsales2 <- function(transit) {
  # If atp==1, set csi_begin to inv and csi_end to csi_begin - GU:
  transit[atp==1, `:=`(csi_begin=inv, csi_end=inv-GU)]

  # Set csi_order to the value of csi_begin from two rows prior:
  transit[, csi_order:=rowShift(csi_begin,-2)]

  # Set csi_order to 0 if csi_begin from two rows prior was NA
  transit[is.na(csi_order), csi_order:=0]

  # Initialize IRQ to 0
  transit[, IRQ:=0]

  # If ATP==1, set IRQ to csi_order - RUTL
  transit[atp==1, IRQ:=csi_order-RUTL]

  # If ATP!=1, set csi_begin to inv + IRQ value from previous row, and csi_end to csi_begin - GU
  transit[atp!=1, `:=`(csi_begin=inv+rowShift(IRQ,-1), csi_end=inv+rowShift(IRQ,-1)-GU)]
  return(transit)
}

lostsales2(dt)
##    atp len inv  GU RUTL rn csi_begin csi_end csi_order  IRQ
## 1:   1   2 593  36  100  1       593     557         0 -100
## 2:   0  NA 823  94   NA  2        NA      NA         0    0
## 3:   1   3 668  57  173  3       668     611       593  420
## 4:   0  NA 640 105   NA  4       640     535         0    0
## 5:   0  NA 593  48   NA  5       593     545       668    0
## 6:   1   1 745  67    7  6       745     678         0   -7
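
Row 2 of the output has NA in csi_begin and csi_end even though row 1 has an IRQ value. A likely explanation is that data.table evaluates j only on the rows selected by i, so rowShift(IRQ,-1) inside transit[atp!=1, ...] lags within rows 2, 4 and 5 rather than across the whole table. The sketch below illustrates the difference on a five-row table that reuses the atp, inv and IRQ values of the first five output rows. It uses data.table's built-in shift() in place of rowShift, and the names demo, inside, outside and IRQ_prev are introduced here purely for illustration.

```r
library(data.table)

# Illustrative table: atp, inv and IRQ as in rows 1 to 5 of the output above
demo <- data.table(
  atp = c(1, 0, 1, 0, 0),
  inv = c(593, 823, 668, 640, 593),
  IRQ = c(-100, 0, 420, 0, 0)
)

# Shift inside the filter: the lag is taken over rows 2, 4 and 5 only,
# so the first selected row has no predecessor and becomes NA
inside <- demo[atp != 1, inv + shift(IRQ, 1L)]

# Shift over the whole column first, then filter:
# each selected row now sees the IRQ of the row directly before it
demo[, IRQ_prev := shift(IRQ, 1L)]
outside <- demo[atp != 1, inv + IRQ_prev]
```

Here inside is NA, 640, 593, which matches csi_begin in rows 2, 4 and 5 of the output, while outside is 723, 1060, 593. Which of the two is wanted depends on what "previous row" is supposed to mean in the question.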

Is this output close to what you were expecting?
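
If the rules in the question are meant to apply strictly row by row (each csi_begin may depend on the previous row's IRQ, which in turn depends on an earlier csi_begin), column-wise updates cannot reproduce them in a single pass, and a loop may still be the simplest option. It does not have to be slow, though: the expensive part of a row-wise loop is usually the repeated assignment into the table, not the loop itself. The sketch below is one reading of the question's logic, not a confirmed specification. It loops over plain vectors and writes the results back once with `:=`. The function name lostsales_seq, the local variable names, and the choice to treat a missing previous IRQ as 0 are assumptions made here.

```r
library(data.table)

lostsales_seq <- function(transit) {
  n    <- nrow(transit)
  atp  <- transit$atp
  inv  <- transit$inv
  RUTL <- transit$RUTL

  # Plain numeric vectors are cheap to update element by element
  begin <- numeric(n)
  ord   <- numeric(n)
  irq   <- numeric(n)

  for (i in seq_len(n)) {
    # Assumption: with no previous row, the previous IRQ counts as 0
    prev_irq <- if (i > 1) irq[i - 1] else 0
    begin[i] <- if (atp[i] == 1) inv[i] else inv[i] + prev_irq

    # csi_order is csi_begin from two rows earlier, or 0 if unavailable
    ord[i] <- if (i > 2 && !is.na(begin[i - 2])) begin[i - 2] else 0

    irq[i] <- if (atp[i] == 1) ord[i] - RUTL[i] else 0
  }

  # Work on a copy so the caller's table is not modified by reference
  out <- copy(transit)
  out[, `:=`(csi_begin = begin, csi_end = begin - GU,
             csi_order = ord, IRQ = irq)]
  out[]
}

dt <- data.table(
  atp  = c(1, 0, 1, 0, 0, 1),
  len  = c(2, NA, 3, NA, NA, 1),
  inv  = c(593, 823, 668, 640, 593, 745),
  GU   = c(36, 94, 57, 105, 48, 67),
  RUTL = c(100, NA, 173, NA, NA, 7)
)
res <- lostsales_seq(dt)
```

On the example data this gives csi_begin = 593, 723, 668, 1060, 593, 745 and IRQ = -100, 0, 420, 0, 0, 1053. These differ from the column-wise result above because each row sees the values already computed for the rows before it.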
