R column filtration
Problem description
I have been working with some data at work and I am trying to filter records based on their dates, but I have been unsuccessful so far. Can anyone please help me out? Let me explain what I am trying to achieve. I have a dataset that contains the following information:
person_id|custody_start|custody_end|contact_month|month_start |month_end |contact_date
13126321 |02/23/2020 |07/17/2020 |February 20 |02/01/2020 |02/28/2020|26/02/2020
13126321 |02/23/2020 |07/17/2020 |March 20 |03/01/2020 |03/31/2020|12/03/2020
13126321 |02/23/2020 |07/17/2020 |April 20 |04/01/2020 |04/30/2020|11/04/2020
13126321 |02/23/2020 |07/17/2020 |May 20 |05/01/2020 |05/31/2020|12/05/2020
13126321 |02/23/2020 |07/17/2020 |June 20 |06/01/2020 |06/30/2020|11/06/2020
13126321 |02/23/2020 |07/17/2020 |July 20 |07/01/2020 |07/31/2020|12/07/2020
The data shows the same custody record multiple times, once for each month in which contact was made, together with the date of that contact. What I am trying to achieve is to filter out the rows where the person was not in custody for the whole calendar month (from the 1st until the 30th or 31st of each month). We are not looking at 30 days from an arbitrary date, but at the calendar month. In February 20 and July 20 the person was not in custody for the whole month: they entered custody on the 23rd of February and left custody on the 17th of July. So the first and the last month in this case don't count. There are multiple records like this for each person_id, so I cannot just remove the first and the last row for each child. I just need to keep the records where the person stayed in custody for the whole calendar month.
My final result should look something like this:
person_id|custody_start|custody_end|contact_month|month_start |month_end |contact_date
13126321 |02/23/2020 |07/17/2020 |March 20 |03/01/2020 |03/31/2020|12/03/2020
13126321 |02/23/2020 |07/17/2020 |April 20 |04/01/2020 |04/30/2020|11/04/2020
13126321 |02/23/2020 |07/17/2020 |May 20 |05/01/2020 |05/31/2020|12/05/2020
13126321 |02/23/2020 |07/17/2020 |June 20 |06/01/2020 |06/30/2020|11/06/2020
Any help would be appreciated. Thank you.
Recommended answer
Something like this?
library(dplyr)      # %>%, mutate_at(), mutate(), filter()
library(lubridate)  # mdy(), dmy()

# Sample data from the question, read as a pipe-separated table
dat <- read.table(text='person_id|custody_start|custody_end|contact_month|month_start |month_end |contact_date
13126321 |02/23/2020 |07/17/2020 |February 20 |02/01/2020 |02/28/2020|26/02/2020
13126321 |02/23/2020 |07/17/2020 |March 20 |03/01/2020 |03/31/2020|12/03/2020
13126321 |02/23/2020 |07/17/2020 |April 20 |04/01/2020 |04/30/2020|11/04/2020
13126321 |02/23/2020 |07/17/2020 |May 20 |05/01/2020 |05/31/2020|12/05/2020
13126321 |02/23/2020 |07/17/2020 |June 20 |06/01/2020 |06/30/2020|11/06/2020
13126321 |02/23/2020 |07/17/2020 |July 20 |07/01/2020 |07/31/2020|12/07/2020',sep="|",header = TRUE)

dat %>%
  # custody_* and month_* columns are stored as month/day/year
  mutate_at(vars(contains("custody"), contains("month_")),
            function(x) mdy(as.character(x))) %>%
  # contact_date is stored as day/month/year
  mutate(contact_date = dmy(as.character(contact_date))) %>%
  # keep only rows whose whole calendar month lies inside the custody period
  dplyr::filter(month_start >= custody_start & month_end <= custody_end)
person_id custody_start custody_end contact_month month_start month_end contact_date
1 13126321 2020-02-23 2020-07-17 March 20 2020-03-01 2020-03-31 2020-03-12
2 13126321 2020-02-23 2020-07-17 April 20 2020-04-01 2020-04-30 2020-04-11
3 13126321 2020-02-23 2020-07-17 May 20 2020-05-01 2020-05-31 2020-05-12
4 13126321 2020-02-23 2020-07-17 June 20 2020-06-01 2020-06-30 2020-06-11
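For readers who would rather avoid extra packages, the same condition can probably be written in base R as well. The following is only a sketch under a few assumptions: the sample data is retyped without the padding spaces, the helper names mdy_cols, full_month and result are made up for illustration, and the month_start and month_end columns are trusted as given rather than recomputed.

```r
# Base-R sketch of the same filter (no dplyr or lubridate required).
# The sample rows are retyped from the question without padding spaces.
dat <- read.table(text = "person_id|custody_start|custody_end|contact_month|month_start|month_end|contact_date
13126321|02/23/2020|07/17/2020|February 20|02/01/2020|02/28/2020|26/02/2020
13126321|02/23/2020|07/17/2020|March 20|03/01/2020|03/31/2020|12/03/2020
13126321|02/23/2020|07/17/2020|April 20|04/01/2020|04/30/2020|11/04/2020
13126321|02/23/2020|07/17/2020|May 20|05/01/2020|05/31/2020|12/05/2020
13126321|02/23/2020|07/17/2020|June 20|06/01/2020|06/30/2020|11/06/2020
13126321|02/23/2020|07/17/2020|July 20|07/01/2020|07/31/2020|12/07/2020",
                  sep = "|", header = TRUE, stringsAsFactors = FALSE)

# These four columns are stored as month/day/year (hypothetical helper name)
mdy_cols <- c("custody_start", "custody_end", "month_start", "month_end")
dat[mdy_cols] <- lapply(dat[mdy_cols], as.Date, format = "%m/%d/%Y")

# contact_date is stored as day/month/year
dat$contact_date <- as.Date(dat$contact_date, format = "%d/%m/%Y")

# TRUE only when the whole calendar month lies inside the custody period
full_month <- dat$month_start >= dat$custody_start &
  dat$month_end <= dat$custody_end

# Keep the fully covered months; February and July are dropped here
result <- dat[full_month, ]
print(result)
```

Because the comparison is done row by row, it should carry over unchanged to data with several person_id values and several custody periods per person, as long as each row carries its own custody_start and custody_end.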