Creating new columns based on conditions of the existing columns

Problem description

I have a data.table that contains the following values:

library(data.table)

# dput() output of the sample data. The non-parseable .internal.selfref
# pointer is dropped and the table is rebuilt with as.data.table().
temp1NoOrders <- as.data.table(structure(list(Item = c("10040", "10040", "10110", "10190", "10190", 
"12020", "12970", "7010040"), Customer = c("CLUB RETAILER - 106", 
"WHOLESALER - 112", "NATIONAL RETAILER - 102", "MID WEST 1 - 120", 
"WHOLESALER - 112", "WHOLESALER - 112", "WHOLESALER - 112", "DIAMLER BENZ - 108"
), DemandID = c("NetBaseForecast", "NetBaseForecast", "NetBaseForecast", 
"NetBaseForecast", "NetBaseForecast", "NetBaseForecast", "NetBaseForecast", 
"NetBaseForecast"), Forecasts = c(5, 158, 212, 12, 3, 3, 3, 2776
), RTF = c(1, 98, 139, 8, 0, 0, 0, 2356)), row.names = c(NA, 
-8L), class = "data.frame"))

# Restore the key recorded in the original sorted = c("Item", "Customer") attribute
setkey(temp1NoOrders, Item, Customer)

I need to create two new columns, New_F and Com_F, from the Forecasts and RTF columns. Which values they take depends on one of two conditions, evaluated for each row. I am using a data.table.

  1. If Forecasts > RTF
    • New_F = Forecasts - RTF
    • Com_F = RTF
  2. If Forecasts < RTF
    • New_F = 0
    • Com_F = Forecasts

I am currently using a for loop like this:

for (i in 1:nrow(temp1NoOrders)) {
  if (temp1NoOrders$Forecasts[i] >= temp1NoOrders$RTF[i]) {
    # Forecasts is at least RTF: Com_F takes RTF, New_F takes the difference
    temp1NoOrders$Com_F[i] <- temp1NoOrders$RTF[i]
    temp1NoOrders$New_F[i] <- temp1NoOrders$Forecasts[i] - temp1NoOrders$RTF[i]
  } else if (temp1NoOrders$Forecasts[i] < temp1NoOrders$RTF[i]) {
    # Forecasts is below RTF: Com_F takes Forecasts, New_F is zero
    temp1NoOrders$Com_F[i] <- temp1NoOrders$Forecasts[i]
    temp1NoOrders$New_F[i] <- 0
  }
}

Is there a faster way of doing this with data.table? I assume that on a 100k-row dataset this would become slow because of the loop and the if-else statements.

Recommended answer

You don't need to do this row by row. Use fifelse from data.table to evaluate the condition on whole columns at once.

library(data.table)

temp1NoOrders[, c("New_F","Com_F") := 
              .(fifelse(Forecasts > RTF, Forecasts - RTF, 0), 
                fifelse(Forecasts > RTF, RTF, Forecasts))] 

temp1NoOrders
#      Item                Customer        DemandID Forecasts  RTF New_F Com_F
#1:   10040     CLUB RETAILER - 106 NetBaseForecast         5    1     4     1
#2:   10040        WHOLESALER - 112 NetBaseForecast       158   98    60    98
#3:   10110 NATIONAL RETAILER - 102 NetBaseForecast       212  139    73   139
#4:   10190        MID WEST 1 - 120 NetBaseForecast        12    8     4     8
#5:   10190        WHOLESALER - 112 NetBaseForecast         3    0     3     0
#6:   12020        WHOLESALER - 112 NetBaseForecast         3    0     3     0
#7:   12970        WHOLESALER - 112 NetBaseForecast         3    0     3     0
#8: 7010040      DIAMLER BENZ - 108 NetBaseForecast      2776 2356   420  2356

fifelse is easy to understand, but a more efficient approach is to use pmax and pmin:

temp1NoOrders[, c("New_F","Com_F") := .(pmax(Forecasts - RTF, 0), 
                                        pmin(Forecasts, RTF))]
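
To see that the two approaches agree, here is a minimal sketch on a synthetic table of 100k rows, the size mentioned in the question. The table dt, its random values, the seed, and the comparison columns New_F2 and Com_F2 are assumptions made for illustration and are not part of the original data. The sample includes ties (Forecasts == RTF), for which both branches give the same values, so it does not matter whether the condition is written with > or >=.

```r
library(data.table)

set.seed(42)   # arbitrary seed, only so the sketch is reproducible
n <- 1e5       # 100k rows, as mentioned in the question

# Synthetic stand-in for temp1NoOrders; the values are made up
dt <- data.table(
  Forecasts = as.numeric(sample(0:3000, n, replace = TRUE)),
  RTF       = as.numeric(sample(0:3000, n, replace = TRUE))
)

# Approach 1: fifelse, as in the answer
dt[, c("New_F", "Com_F") := .(fifelse(Forecasts > RTF, Forecasts - RTF, 0),
                              fifelse(Forecasts > RTF, RTF, Forecasts))]

# Approach 2: pmax/pmin, written to separate columns so the two can be compared
dt[, c("New_F2", "Com_F2") := .(pmax(Forecasts - RTF, 0),
                                pmin(Forecasts, RTF))]

# Both approaches should agree on every row, including ties
same_new <- identical(dt$New_F, dt$New_F2)
same_com <- identical(dt$Com_F, dt$Com_F2)

# The two new columns always add back up to the original forecast
sums_ok <- all(dt$New_F + dt$Com_F == dt$Forecasts)
```

Wrapping each assignment in system.time() is one way to compare the speed of the two approaches on your own machine; no timings are claimed here.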
