Efficiently generate a random sample of times and dates between two dates


Problem description

I have written a (fairly naive) function to randomly select a date/time between two specified days

# set start and end dates to sample between
day.start <- "2012/01/01"
day.end <- "2012/12/31"

# define a random date/time selection function
rand.day.time <- function(day.start,day.end,size) {
  dayseq <- seq.Date(as.Date(day.start),as.Date(day.end),by="day")
  dayselect <- sample(dayseq,size,replace=TRUE)
  hourselect <- sample(0:23,size,replace=TRUE)   # valid hours are 0-23, not 1-24
  minselect <- sample(0:59,size,replace=TRUE)
  # paste date, hour and minute into one string per draw, then parse it
  as.POSIXlt(paste(dayselect, " ", hourselect, ":", minselect, sep=""))
}

For example:

> rand.day.time(day.start,day.end,size=3)
[1] "2012-02-07 21:42:00" "2012-09-02 07:27:00" "2012-06-15 01:13:00"

But this seems to be slowing down considerably as the sample size ramps up.

# some benchmarking
> system.time(rand.day.time(day.start,day.end,size=100000))
   user  system elapsed 
   4.68    0.03    4.70 
> system.time(rand.day.time(day.start,day.end,size=200000))
   user  system elapsed 
   9.42    0.06    9.49 

Is anyone able to suggest how to do something like this in a more efficient manner?

Recommended answer

Ahh, another date/time problem we can reduce to working in floats :)

Try this function:

latemail <- function(N, st="2012/01/01", et="2012/12/31") {
    st <- as.POSIXct(as.Date(st))
    et <- as.POSIXct(as.Date(et))
    dt <- as.numeric(difftime(et, st, units="secs"))  # interval length in seconds
    ev <- sort(runif(N, 0, dt))                       # N sorted uniform offsets
    st + ev                                           # offsets added to the start time
}

We compute the difftime in seconds, and then "merely" draw uniforms over it, sorting the result. Add that to the start and you're done:

R> set.seed(42); print(latemail(5))     ## round to date, or hour, or ...
[1] "2012-04-14 05:34:56.369022 CDT" "2012-08-22 00:41:26.683809 CDT" 
[3] "2012-10-29 21:43:16.335659 CDT" "2012-11-29 15:42:03.387701 CST"
[5] "2012-12-07 18:46:50.233761 CST"
R> system.time(latemail(100000))
   user  system elapsed 
  0.024   0.000   0.021 
R> system.time(latemail(200000))
   user  system elapsed 
  0.044   0.000   0.045 
R> system.time(latemail(10000000))   ## a few more than in your example :)
   user  system elapsed 
  3.240   0.172   3.428 
R> 
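The question's output is at whole-minute resolution, so here is a minimal sketch of the same "work in numbers" idea that draws whole minutes directly instead of rounding afterwards. The name `rand_minutes` and the `tz` argument are hypothetical, not part of the original answer. The sketch assumes the start and end parse as midnight in the given time zone, so, as with `latemail`, times on the end date itself are never drawn.

```r
# Hypothetical helper (not from the original answer): N random timestamps
# between two dates, each falling exactly on a whole minute.
rand_minutes <- function(N, st = "2012/01/01", et = "2012/12/31", tz = "UTC") {
  st <- as.POSIXct(st, tz = tz)   # midnight at the start date
  et <- as.POSIXct(et, tz = tz)   # midnight at the end date
  # number of whole minutes in the interval
  n_min <- as.numeric(difftime(et, st, units = "mins"))
  # integer minute offsets in 0 .. n_min - 1, drawn with replacement
  offsets <- sample.int(n_min, N, replace = TRUE) - 1L
  # POSIXct arithmetic is in seconds, so convert minutes to seconds
  st + 60 * sort(offsets)
}

set.seed(42)
x <- rand_minutes(5)
```

If sub-second draws already exist, `trunc(latemail(5), units = "mins")` should give a similar result, although it returns a `POSIXlt` object rather than `POSIXct`.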
