Get the range and random days within that range using R


Problem description

I have a data frame like the one shown below:

library(dplyr)      # mutate(), select(), %>%
library(lubridate)  # mdy(), floor_date(), ceiling_date()

test_df <- data.frame("subbject_id" = c(1,2,3,4,5), 
                      "date_1" = c("01/01/2003","12/31/2007","12/30/2008","01/02/2007","01/01/2007"))
test_df = test_df %>%
  mutate(date_1 = mdy(date_1), 
         previous_year = floor_date(date_1, 'year'), 
         next_year = ceiling_date(date_1, 'year') - 1, 
         days_to_previous_year = as.integer(date_1 - previous_year), 
         days_to_next_year = as.integer(next_year - date_1),
         rand_days_prev_year = sample.int(days_to_previous_year, 1),
         rand_days_next_year = sample.int(days_to_next_year, 1)) %>%
  select(-previous_year, -next_year)

Thanks to this post, which helped me arrive at part of the solution.

I would like to do two things:

a) Get the range of values using days_to_previous_year and days_to_next_year. Note that days_to_previous_year must have a minus sign in front of it, as shown in the output.

b) Pick a random value within that range, excluding 0. For example, if the range is [0,364], the random value should fall in [1,364] inclusive. Similarly, if the range is [-11,21], 0 must not be chosen, but the endpoints -11 and 21 are valid values.

I tried the statements below, but they don't work:

range = paste0("[-",days_to_previous_year,",+",days_to_next_year,"]")
test_df$rand_days = sample.int(test_df$range, 1) # error as non-numeric

So I tried using the two numeric columns instead:

test_df$rand_days_prev_year = sample.int(test_df$days_to_previous_year, 1) # this doesn't work
test_df$rand_days_next_year = sample.int(test_df$days_to_next_year, 1) # but this works

I get the error message shown below:

Error in if (useHash) .Internal(sample2(n, size)) else .Internal(sample(n,  : 
  missing value where TRUE/FALSE needed

I expect my output to contain, for each row, the range and a random non-zero value drawn from it.

Solution

Here is one way:

library(dplyr)

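test_df %>%
  # human-readable range, e.g. "-364, 1"
  mutate(range = sprintf("%d, %d", -days_to_previous_year, days_to_next_year)) %>%
  # evaluate the next mutate() one row at a time
  rowwise() %>%
  mutate(rand_days = {
    # unary minus binds tighter than `:`, so this is (-prev):next
    days <- -days_to_previous_year:days_to_next_year
    # drop 0 so it can never be picked
    days <- days[days != 0]
    if (length(days)) sample(days, 1) else NA
  })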

#   subbject_id date_1     days_to_previous_year days_to_next_year range   rand_days
#        <dbl> <date>                     <int>             <int> <chr>       <int>
#1           1 2003-01-01                     0               364 0, 364        206
#2           2 2007-12-31                   364                 0 -364, 0      -220
#3           3 2008-12-30                   364                 1 -364, 1      -274
#4           4 2007-01-02                     1               363 -1, 363       228
#5           5 2007-01-01                     0               364 0, 364         72
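
As a complement to the rowwise() approach above, here is a minimal base R sketch of the same idea that avoids building the full vector of candidate days. The helper name sample_nonzero is hypothetical (it is not part of the original answer or of any package), and the two input vectors simply repeat the days_to_previous_year and days_to_next_year values from the output shown above. It draws a position among the non-zero candidates with sample.int() and maps that position back to a day offset.

```r
# Hypothetical helper (not from the original answer): draw one non-zero
# integer from [-prev, nxt]; returns NA when the range contains only 0.
sample_nonzero <- function(prev, nxt) {
  n_neg <- prev                  # candidates below zero: -prev, ..., -1
  n_pos <- nxt                   # candidates above zero: 1, ..., nxt
  total <- n_neg + n_pos
  if (total == 0L) return(NA_integer_)
  k <- sample.int(total, 1L)     # position among the non-zero candidates
  if (k <= n_neg) as.integer(k - n_neg - 1L) else as.integer(k - n_neg)
}

set.seed(42)  # assumption: fixed seed only so the illustration is repeatable

# Values copied from the days_to_previous_year / days_to_next_year columns
prev <- c(0L, 364L, 364L, 1L, 0L)
nxt  <- c(364L, 0L, 1L, 363L, 364L)

# Apply the helper pairwise, one (prev, nxt) pair per row
rand_days <- mapply(sample_nonzero, prev, nxt)
```

The result could then be assigned back with test_df$rand_days <- rand_days. Because sample.int() takes a single n, the helper is called once per row through mapply() rather than being passed a whole column.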
