Complex sequence based on a condition

Problem description

I am trying to transform data into data2, and I am looking for a base R or dplyr solution. Each policy has an ID, a start date, and an end date; these are all given. A policy year starts on the start date and ends one year later, and a policy may run for several years. The first part of a policy year needs a PolYr value of 0. When the policy year crosses into the next calendar year, PolYr takes the value 1. I was able to figure that part out via the earlier question "Numeric sequence with condition".

For each contract, there is a row for each PolYr and CaldrYr combination, and I also need to determine CaldrYr. Looking at ID = 103, the contract starts in 2011, so its first row has PolYr = 0 and CaldrYr = 2011. The second part of that policy year falls in 2012, so the second row for ID = 103 has PolYr = 1 and CaldrYr = 2012. This policy is more than 2 years long and finishes in late 2013, so it spans five rows.

Below are before and after data frames. I did some research, but did not find anything that I perceived as corresponding to my problem.

library(dplyr)

# Input: one row per policy part, with the policy's start and end dates
ID = c(101, rep(102, 2), rep(103, 5))
start = as.Date(c('2/1/2010', rep('5/17/2011', 2), rep('5/17/2011', 5)), '%m/%d/%Y')
end = as.Date(c('3/5/2010', rep('1/4/2012', 2), rep('8/4/2013', 5)), '%m/%d/%Y')
data = data.frame(ID = ID, start = start, end = end)

# PolYr alternates 0, 1, 0, 1, ... within each ID
v = c(0, 1)
data = data %>% group_by(ID) %>% mutate(PolYr = rep_len(v, length(ID)))
data

# Desired output: the same rows plus the calendar year of each row
data2 = data
data2$CaldrYr = c(2010, 2011, 2012, 2011, 2012, 2012, 2013, 2013)
data2

Solution

With data.table, we can do

library(data.table)
library(lubridate)
setDT(data)[, CaldrYr := year(start) + cumsum(PolYr), ID]
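Since the question asks for base R or dplyr, the same idea (start year plus a running sum of PolYr within each ID) can be sketched without data.table. This is only an illustration under the question's own sample data: `start_year` is a helper name introduced here, and `ave()` stands in for the grouped operation.

```r
# Rebuild the question's input.
ID <- c(101, rep(102, 2), rep(103, 5))
start <- as.Date(c("2/1/2010", rep("5/17/2011", 2), rep("5/17/2011", 5)), "%m/%d/%Y")
end <- as.Date(c("3/5/2010", rep("1/4/2012", 2), rep("8/4/2013", 5)), "%m/%d/%Y")
data <- data.frame(ID = ID, start = start, end = end)

# Alternating 0/1 flag within each ID, as in the question.
data$PolYr <- ave(data$ID, data$ID, FUN = function(x) rep_len(c(0, 1), length(x)))

# Calendar year = year of the start date + running sum of PolYr within each ID.
# start_year is a helper introduced for this sketch.
start_year <- as.integer(format(data$start, "%Y"))
data$CaldrYr <- start_year + ave(data$PolYr, data$ID, FUN = cumsum)

data
```

With dplyr, the same expression should fit inside the grouped pipeline from the question, for example `group_by(ID) %>% mutate(CaldrYr = as.integer(format(start, "%Y")) + cumsum(PolYr))`.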
