Speed improvement for sapply along a POSIX sequence

Question

I am iterating along a POSIX sequence to count the number of concurrent events at a given time, using exactly the method described in this question and its answer:

How to count the number of concurrent users using time interval data?

My problem is that my tinterval sequence in minutes covers a year, which means it has 523,025 entries. In addition, I am also thinking about a resolution in seconds, which would make things even worse.

Is there anything I can do to improve this code (e.g., does the order of the date intervals in the input data (tdata) matter?), or do I have to accept the performance if I want a solution in R?

Recommended answer

You could try data.table's new foverlaps function. With the data from the other question:

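library(data.table)
setDT(tdata)                      # convert tdata to a data.table by reference
setkey(tdata, start, end)
# one row per minute between the earliest start and the latest end
minutes <- data.table(start = seq(trunc(min(tdata[["start"]]), "mins"), 
                                  round(max(tdata[["end"]]), "mins"), by="min"))
minutes[, end := start+59]        # each bucket spans start .. start + 59 seconds
setkey(minutes, start, end)       # foverlaps requires a key on its second argument
# overlap join: one row per (event, minute) pair whose intervals intersect
DT <- foverlaps(tdata, minutes, type="any")
counts <- DT[, .N, by=start]      # number of events overlapping each minute
plot(N~start, data=counts, type="s")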

I haven't timed this on large data; try it yourself.
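
To make the overlap join easier to check by hand, here is a minimal self-contained sketch of the same approach on toy data. The three events and their timestamps are invented for illustration and are not the data from the linked question; only the column names `start` and `end` follow the answer above.

```r
library(data.table)

# Hypothetical toy data (an assumption, not the original tdata):
# three events with POSIXct start/end times in UTC.
tdata <- data.table(
  start = as.POSIXct(c("2014-01-01 00:00:10",
                       "2014-01-01 00:00:40",
                       "2014-01-01 00:02:05"), tz = "UTC"),
  end   = as.POSIXct(c("2014-01-01 00:01:20",
                       "2014-01-01 00:00:50",
                       "2014-01-01 00:02:30"), tz = "UTC")
)

# One row per minute bucket, each spanning start .. start + 59 seconds.
minutes <- data.table(
  start = seq(as.POSIXct("2014-01-01 00:00:00", tz = "UTC"),
              as.POSIXct("2014-01-01 00:02:00", tz = "UTC"),
              by = "min")
)
minutes[, end := start + 59]
setkey(minutes, start, end)  # the second argument of foverlaps must be keyed

# Overlap join: one row per (event, minute) pair whose intervals intersect.
DT <- foverlaps(tdata, minutes, type = "any")

# Count the events that touch each minute bucket.
counts <- DT[, .N, by = start]
print(counts)
```

In this toy case the first event spans two minute buckets, so it is counted in both. Minutes that no event touches do not appear in `counts` at all rather than showing up with a count of zero, which is worth keeping in mind when plotting.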
