根据条件和groupby更新列 [英] Update a column based on a condition and groupby
问题描述
我的资料是:
Prod Vend Capac Dema Price
p1 v2 2 6 1
p1 v1 3 6 2
p1 v3 3 6 2
p2 v1 1 1 1
p2 v3 2 1 2
p2 v2 5 1 2
p3 v1 5 3 3
p3 v2 3 3 4
p3 v3 1 3 5
我需要一些东西如:
Prod Vend Capac Dema Price Source
p1 v2 2 6 1 2
p1 v1 3 6 2 3
p1 v3 3 6 2 1
p2 v1 1 1 1 1
p2 v3 2 1 2 0
p2 v2 5 1 2 0
p3 v1 5 3 3 3
p3 v2 3 3 4 0
p3 v3 1 3 5 0
$ b $我有产品,供应商,供应商的能力(产品的需求),价格(从供应商的产品)。我以最低价格采购产品,但供应商的能力是一个限制。也就是说,供应商的选择是按照引用的价格完成的,价格是按照产品分组的供应商排序的。
i have product, vendor, capacity (of vendor), demand (of product), price (of product from vendor). I am sourcing the products based on least price but vendor capacity is a constraint. That is, the selection of a vendor is done by price it has quoted, Price is sorted for the vendors grouped by products.
我正在尝试使用for循环,如果其他条件,代码越来越乱。有没有一个干净的方法来解决它,可能使用plyr?
I am trying it with for loop and if else conditions, the code is getting messier. Is there a clean way to solve it, probably using plyr?
推荐答案
这是我会做的:
library(data.table)
setDT(DT)
DT[order(Price), src := pmin(Capac, pmax(Dema - shift(cumsum(Capac), fill=0), 0)), by=Prod]
我们可以看到它匹配:
we can see it matches:
Prod Vend Capac Dema Price Source src
1: p1 v2 2 6 1 2 2
2: p1 v1 3 6 2 3 3
3: p1 v3 3 6 2 1 1
4: p2 v1 1 1 1 1 1
5: p2 v3 2 1 2 0 0
6: p2 v2 5 1 2 0 0
7: p3 v1 5 3 3 3 3
8: p3 v2 3 3 4 0 0
9: p3 v3 1 3 5 0 0
逻辑,部分是伪代码:
-
shift(cumsum(Capac),fill = 0)
是便宜的供应商的容量
shift(cumsum(Capac), fill=0)
is capacity from cheaper vendors
max(需求 - 上限)
max(demand - capacity from cheaper, 0)
is residual demand for the vendor
min(容量,剩余需求量)
是从供应商那里获取多少
min(capacity, residual demand)
is how much to source from the vendor
。
dplyr模拟:
DT %>% arrange(Price) %>% group_by(Prod) %>%
mutate(src = pmin(Capac, pmax(Dema - lag(cumsum(Capac), default=0), 0)))
这篇关于根据条件和groupby更新列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!