How to find customers who placed their next order before an earlier order was delivered/received, in R?

Problem description

I have a large database with two date columns. For example, take the 'Orders' sheet of the Superstore data (http://www.tableau.com/sites/default/files/training/global_superstore.zip).

One date is the order date and the other is the shipping/delivery date (assume it is the delivery date). I want the details of all orders from customers who placed a new order without waiting for one of their previous orders to be shipped/delivered.

For example, the customer with ID 'ZC-21910' placed the order with ID 'CA-2014-133928' on 12 June 2014, and it was shipped on 18 June 2014. The same customer, however, placed their next order, with ID 'IT-2014-3511710', on 13 June 2014, i.e. before 18 June 2014 (the shipping date of one of the earlier orders).

Ideally, all such orders (order IDs) would be collected into a separate vector.

How can I do this in R, or alternatively in Tableau?

Sample dataset

> dput(df)
structure(list(customer_id = c("A", "A", "A", "B", "B", "C", 
"C"), order_id = structure(1:7, .Label = c("1", "2", "3", "4", 
"5", "6", "7"), class = "factor"), order_date = structure(c(17897, 
17901, 17912, 17901, 17902, 17903, 17905), class = "Date"), ship_date = structure(c(17926, 
17906, 17914, 17904, 17904, 17904, 17906), class = "Date")), row.names = c(NA, 
-7L), class = c("tbl_df", "tbl", "data.frame"))

Recommended answer

My earlier answer did not properly handle the case where Order Date == Ship Date.

I assume you have already loaded your data into an object called df. You can use the first part of @hello_friend's code to do this.

library(tidyverse)
df %>% 
  # One row per order, so multi-item orders are not counted several times
  distinct(`Customer ID`, `Order ID`, `Order Date`, `Ship Date`) %>% 
  arrange(`Customer ID`, `Order Date`, `Ship Date`) %>% 
  mutate(sort_key = row_number()) %>% 
  # Two rows per order: one "Order" event and one "Ship" event
  pivot_longer(c(`Order Date`, `Ship Date`), names_to = "Activity", names_pattern = "(.*) Date", values_to = "Date") %>% 
  # Ordering opens an order (+1), shipping closes it (-1)
  mutate(Activity = factor(Activity, ordered = TRUE, levels = c("Order", "Ship")), 
         Open = if_else(Activity == "Order", 1, -1)) %>% 
  group_by(`Customer ID`) %>% 
  arrange(Date, sort_key, Activity, .by_group = TRUE) %>% 
  # Running count of open orders per customer
  mutate(Open = cumsum(Open)) %>% 
  ungroup() %>% 
  # Keep orders that were placed while another order was still open
  filter(Open > 1, Activity == "Order") %>% 
  select(`Customer ID`, `Order ID`)

First, take only the distinct order and customer IDs; otherwise multiple items from the same order will confuse things and produce an incorrect result. Then pivot the data so that each order becomes two rows, each representing a distinct activity: either ordering or shipping. We then create a running total of the number of open orders. You're looking for the points where this reaches two or more.
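As a rough illustration of these steps, here is a sketch of the same pipeline applied to the small sample dataset from the question. The column names (customer_id, order_id, order_date, ship_date) come from that dput output; the object names orders and overlapping_ids are made up for this example, and order_id is stored as character rather than factor for simplicity.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (dates are days since 1970-01-01)
orders <- data.frame(
  customer_id = c("A", "A", "A", "B", "B", "C", "C"),
  order_id    = as.character(1:7),
  order_date  = as.Date(c(17897, 17901, 17912, 17901, 17902, 17903, 17905),
                        origin = "1970-01-01"),
  ship_date   = as.Date(c(17926, 17906, 17914, 17904, 17904, 17904, 17906),
                        origin = "1970-01-01"),
  stringsAsFactors = FALSE
)

overlapping_ids <- orders %>%
  distinct(customer_id, order_id, order_date, ship_date) %>%
  arrange(customer_id, order_date, ship_date) %>%
  mutate(sort_key = row_number()) %>%
  # Two rows per order: an "order" event and a "ship" event
  pivot_longer(c(order_date, ship_date), names_to = "activity",
               names_pattern = "(.*)_date", values_to = "date") %>%
  mutate(activity = factor(activity, ordered = TRUE, levels = c("order", "ship")),
         open = if_else(activity == "order", 1, -1)) %>%
  group_by(customer_id) %>%
  arrange(date, sort_key, activity, .by_group = TRUE) %>%
  # Running count of open orders per customer
  mutate(open = cumsum(open)) %>%
  ungroup() %>%
  filter(open > 1, activity == "order") %>%
  # The question asked for a plain vector of order IDs
  pull(order_id)

overlapping_ids
# "2" "3" "5"
```

Customer A's orders 2 and 3 are both placed while order 1 is still unshipped, and customer B's order 5 is placed before order 4 ships; customer C never has two orders open at once.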

I use an ordered factor for Activity to make sure that an order is always opened before it is closed. This matters when the order date and the ship date are the same.

I use a special sort_key column to make sure that the old order is closed before a new one is opened, for cases where the customer places an order on the same day that an earlier one was shipped. You may want the reverse logic.
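To see what the tie-breaking changes in practice, here is a small sketch with invented data. The helper flag_overlaps() and its orders_first argument are hypothetical names for this illustration only; the helper assumes snake_case columns and one row per order. With the default sort (date, sort_key, activity), a same-day shipment is processed before the new order; swapping the last two sort columns is one possible way to get the reverse logic.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: returns IDs of orders placed while another was open.
flag_overlaps <- function(orders, orders_first = FALSE) {
  events <- orders %>%
    arrange(customer_id, order_date, ship_date) %>%
    mutate(sort_key = row_number()) %>%
    pivot_longer(c(order_date, ship_date), names_to = "activity",
                 names_pattern = "(.*)_date", values_to = "date") %>%
    mutate(activity = factor(activity, ordered = TRUE, levels = c("order", "ship")),
           change = if_else(activity == "order", 1, -1)) %>%
    group_by(customer_id)

  events <- if (orders_first) {
    # Reverse logic: on a shared date, new orders open before old ones close
    arrange(events, date, activity, sort_key, .by_group = TRUE)
  } else {
    # Logic from the answer: on a shared date, the earlier order closes first
    arrange(events, date, sort_key, activity, .by_group = TRUE)
  }

  events %>%
    mutate(open = cumsum(change)) %>%
    ungroup() %>%
    filter(open > 1, activity == "order") %>%
    pull(order_id)
}

# Invented data: order 2 is placed on the day order 1 ships
same_day <- data.frame(
  customer_id = c("X", "X"),
  order_id    = c("1", "2"),
  order_date  = as.Date(c("2019-01-01", "2019-01-05")),
  ship_date   = as.Date(c("2019-01-05", "2019-01-07")),
  stringsAsFactors = FALSE
)

flag_overlaps(same_day)                       # character(0)
flag_overlaps(same_day, orders_first = TRUE)  # "2"
```

Which behaviour is right depends on whether a same-day shipment should count as "delivered before the next order"; the data alone cannot settle that.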

All of this assumes that a given combination of Customer ID and Order ID appears only once in the data, which isn't actually true in your dataset, as you can see with:

df %>% group_by(`Customer ID`, `Order ID`) %>% filter(n_distinct(`Ship Date`)> 1) %>% select(1:9)
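The answer does not say how to resolve such orders. One possible approach, shown here only as a sketch on invented data with snake_case column names, is to collapse each order to a single row that spans from its earliest order date to its latest ship date. Whether the latest ship date is the right choice is an assumption you would need to check against your own requirements.

```r
library(dplyr)

# Invented data: order "1" appears twice with different ship dates
orders <- data.frame(
  customer_id = c("A", "A", "B"),
  order_id    = c("1", "1", "2"),
  order_date  = as.Date(c("2019-01-01", "2019-01-01", "2019-01-03")),
  ship_date   = as.Date(c("2019-01-04", "2019-01-06", "2019-01-05")),
  stringsAsFactors = FALSE
)

# Assumption: an order counts as open until its last shipment
one_row_per_order <- orders %>%
  group_by(customer_id, order_id) %>%
  summarise(order_date = min(order_date),
            ship_date  = max(ship_date),
            .groups = "drop")

one_row_per_order
```

The collapsed table could then be fed into the pipeline in place of the distinct() step.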
