Counting events between sequential stages in a process using R

Question

I've been trying to solve an exercise from a textbook in which I have to count different events between sequential stages of an industrial process.

Information related to the process: a test subject goes through a 3-stage process with stages A, B and C, in that order. A test subject may abandon the process at stage A or B and then start again from stage A. Each time a stage takes place, a record is created with the IDENTIFICATION of the test subject, the TIMESTAMP at which the stage took place and a unique VISIT_CODE. During any stage, a test subject may trigger an "ALERT", which is recorded with its TIMESTAMP, its ALERT_CODE and the test subject's IDENTIFICATION.

What is to be calculated: I have to write code in R that counts how many ALERTS were generated by the test subjects between stages A and B, how many between stages B and C, and finally how many after stage C. Please note that a test subject may at some point abandon the process and later start again from stage A.

The textbook gives a HINT: "Take a closer look at the stage a test subject is currently in to determine whether the ALERT was generated after stage A and before stage B, or after stage B and before stage C. Keep in mind that if a test subject abandons at stage A and triggers an ALERT whose TIMESTAMP is smaller than that of their next attempt at stage A, it should be credited as an ALERT after stage A."
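The rule in the hint can be read as "credit each alert to the most recent stage event of the same test subject at or before the alert's timestamp". Below is a minimal base R sketch of that reading. The miniature tables and the helper name `last_stage` are hypothetical and only illustrate the idea; they are not the textbook's data or method.

```r
# Hypothetical miniature data (not the textbook's full tables)
stages <- data.frame(
  TEST_SUBJECT = c("S1", "S1", "S1", "S2"),
  STAGE        = c("A", "A", "B", "A"),
  TIMESTAMP    = c(10, 15, 20, 50),
  stringsAsFactors = FALSE
)
alerts <- data.frame(
  TEST_SUBJECT = c("S1", "S1", "S2"),
  TIMESTAMP    = c(11, 28, 67),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stage of the latest stage event at or before `time`
last_stage <- function(subject, time) {
  prior <- stages[stages$TEST_SUBJECT == subject & stages$TIMESTAMP <= time, ]
  if (nrow(prior) == 0) return(NA_character_)  # alert precedes any stage
  prior$STAGE[which.max(prior$TIMESTAMP)]
}

# Credit every alert to a stage, then count alerts per stage
alerts$STAGE <- mapply(last_stage, alerts$TEST_SUBJECT, alerts$TIMESTAMP,
                       USE.NAMES = FALSE)
stage_counts <- table(alerts$STAGE)
```

Here the alert at time 11 falls between the two stage A attempts of S1 and is credited to A, which is the case the hint singles out.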

As another hint, the textbook reveals that there was only 1 ALERT after stage C and that it was triggered by test subject W-6 with ALERT_CODE AYUJ-3915716168. The datasets are:

Stage process dataset

library(tibble)  # provides tribble()

TableA <- tribble(~STAGE, ~TEST_SUBJECT, ~TIMESTAMP, ~VISIT_CODE,
"A",    "XYU-1",    "10",   "BKO",
"A",    "XYU-1",    "15",   "JUJD",
"B",    "XYU-1",    "20",   "DUDH",
"A",    "FF-09",    "25",   "KSIWJD",
"B",    "FF-09",    "30",   "AJAKAM",
"C",    "FF-09",    "35",   "ZISKS",
"A",    "UU-89",    "40",   "NNXJD",
"B",    "UU-89",    "45",   "DDUWO",
"A",    "I-44", "50",   "JIWIW",
"A",    "W-6",  "55",   "SHDN",
"B",    "W-6",  "60",   "IWOLS",
"C",    "W-6",  "65",   "JDDD",
"A",    "U-90", "70",   "DJDKSMS",
"B",    "U-90", "75",   "NDJSM",
"A",    "T-87", "80",   "DNDJDK")

Alerts dataset

TableB<-tribble(~TEST_SUBJECT,~TIMESTAMP,~ALERT_CODE,
"XYU-1",    "11",   "AYUJ-151571406",
"XYU-1",    "12",   "AYUJ-487008829",
"XYU-1",    "28",   "AYUJ-211990388",
"FF-09",    "32",   "AYUJ-4177221842",
"W-6",  "56",   "AYUJ-1300211351",
"W-6",  "63",   "AYUJ-3014305494",
"I-44", "67",   "AYUJ-4454800551",
"U-90", "73",   "AYUJ-1079921935",
"U-90", "76",   "AYUJ-3348911727",
"U-90", "79",   "AYUJ-2381219626",
"T-87", "82",   "AYUJ-4778326278",
"W-6",  "89",   "AYUJ-3915716168")

Solution:

The textbook states that the proper solution for this problem is:

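| Alerts between Stage A and Stage B (including alerts from test subjects who abandoned the process on their nth attempt at stage A) | Alerts between Stage B and Stage C (including alerts from test subjects who abandoned the process on their nth attempt at stage B) | Alerts after Stage C |
| --- | --- | --- |
| AYUJ-151571406 | AYUJ-211990388 | AYUJ-3915716168 |
| AYUJ-487008829 | AYUJ-3014305494 | |
| AYUJ-1300211351 | AYUJ-3348911727 | |
| AYUJ-1079921935 | AYUJ-4177221842 | |
| AYUJ-4778326278 | AYUJ-2381219626 | |
| AYUJ-4454800551 | | |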

What have I done? I have tried grouping and joining the data by TEST_SUBJECT, stage and timestamp to make the counts, but I am finding it really hard to set the conditions that make the code count events between stages, because I find it difficult to correlate which events took place before and after each time span by test subject and stage.

Thank you so much to all of you talented and cool people for your help or recommendations.

Recommended answer

Here is a data.table approach, which results in a list of alerts raised after each of stages A, B and C.

library(data.table)
# Make tables data.table format
setDT(TableA)
setDT(TableB)
# Convert TIMESTAMP from character to numeric
TableA[, TIMESTAMP := as.numeric(TIMESTAMP)]
TableB[, TIMESTAMP := as.numeric(TIMESTAMP)]
# Create data.table with Stage intervals by test subject
DT.interval <- TableA[, .(start = min(TIMESTAMP)), by = .(TEST_SUBJECT, STAGE)]
# Perform rolling join
TableB[, Stage := DT.interval[TableB, 
                              STAGE, 
                              on = .(TEST_SUBJECT, start = TIMESTAMP), 
                              roll = Inf]][]
# Split alerts by stage
split(TableB[,3:4], by = "Stage")
# $A
#         ALERT_CODE Stage
# 1:  AYUJ-151571406     A
# 2:  AYUJ-487008829     A
# 3: AYUJ-1300211351     A
# 4: AYUJ-4454800551     A
# 5: AYUJ-1079921935     A
# 6: AYUJ-4778326278     A
# 
# $B
#         ALERT_CODE Stage
# 1:  AYUJ-211990388     B
# 2: AYUJ-4177221842     B
# 3: AYUJ-3014305494     B
# 4: AYUJ-3348911727     B
# 5: AYUJ-2381219626     B
# 
# $C
#         ALERT_CODE Stage
# 1: AYUJ-3915716168     C
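
One caveat with taking `min(TIMESTAMP)` per stage: if a subject reaches stage B, abandons, and restarts at A, an alert raised after the restart would still roll back to B, because the earliest B comes after the earliest A. A possible variant, sketched below on hypothetical data (the subject `S1` and the `X-*` alert codes are made up), is to run the same rolling join against every stage row instead of the per-stage minimum. This is an assumption about how restarts should be treated, not something the textbook data exercises.

```r
library(data.table)

# Hypothetical data: S1 reaches B, abandons, then restarts at A
stages <- data.table(
  TEST_SUBJECT = c("S1", "S1", "S1"),
  STAGE        = c("A", "B", "A"),
  TIMESTAMP    = c(10, 20, 30)
)
alerts <- data.table(
  TEST_SUBJECT = c("S1", "S1", "S1"),
  TIMESTAMP    = c(12, 25, 35),
  ALERT_CODE   = c("X-1", "X-2", "X-3")
)

# Rolling join on all stage rows: each alert takes the STAGE of the
# latest stage row of the same subject at or before the alert
alerts[, STAGE := stages[alerts, STAGE,
                         on = .(TEST_SUBJECT, TIMESTAMP),
                         roll = Inf]]

# Number of alerts credited to each stage
counts <- alerts[, .N, by = STAGE]
```

With this data the alert at time 35 is credited to the second stage A attempt, whereas the per-stage minimum would place it under B. The final line also gives the counts the question asks for.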
