Help me replace a for loop with an "apply" function
Question
...if possible
My task is to find the longest streak of consecutive days on which a user participated in a game.
Instead of writing an SQL function, I chose to use R's rle function to get the longest streaks and then update my db table with the results.
The (attached) data frame looks something like this:
day user_id
2008/11/01 2001
2008/11/01 2002
2008/11/01 2003
2008/11/01 2004
2008/11/01 2005
2008/11/02 2001
2008/11/02 2005
2008/11/03 2001
2008/11/03 2003
2008/11/03 2004
2008/11/03 2005
2008/11/04 2001
2008/11/04 2003
2008/11/04 2004
2008/11/04 2005
I tried the following to get the longest streak per user:
# turn it to a contingency table
my_table <- table(user_id, day)
# get the streaks
rle_table <- apply(my_table,1,rle)
# verify the longest streak of "1"s for user 2001
# as.vector(tapply(rle_table$'2001'$lengths, rle_table$'2001'$values, max)["1"])
# loop to get the results
# initiate results matrix
res<-matrix(nrow=dim(my_table)[1], ncol=2)
for (i in 1:dim(my_table)[1]) {
  string <- paste("as.vector(tapply(rle_table$'", rownames(my_table)[i], "'$lengths, rle_table$'", rownames(my_table)[i], "'$values, max)['1'])", sep="")
  res[i,] <- c(as.integer(rownames(my_table)[i]), eval(parse(text=string)))
}
Unfortunately this for loop takes too long, and I'm wondering if there is a way to produce the res matrix using a function from the "apply" family.
Thanks in advance.
Recommended answer
Another option:
# convert to Date
day_table$day <- as.Date(day_table$day, format="%Y/%m/%d")
# split by user and then look for contiguous days
contig <- sapply(split(day_table$day, day_table$user_id), function(.days){
  .diff <- cumsum(c(TRUE, diff(.days) != 1))
  max(table(.diff))
})
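To make this easier to try out, here is a minimal self-contained sketch that runs the same idea on the sample rows from the question and builds a two-column result like the res matrix. The helper name longest_streak and the sort(unique(...)) guard are my own additions, not part of the answer above; the guard assumes rows may arrive unsorted or duplicated. The last part is an rle-based variant closer to the question's code, included only as a cross-check.

```r
# Sample data copied from the question
day_table <- data.frame(
  day = c(rep("2008/11/01", 5), rep("2008/11/02", 2),
          rep("2008/11/03", 4), rep("2008/11/04", 4)),
  user_id = c(2001, 2002, 2003, 2004, 2005,
              2001, 2005,
              2001, 2003, 2004, 2005,
              2001, 2003, 2004, 2005)
)
day_table$day <- as.Date(day_table$day, format = "%Y/%m/%d")

# Hypothetical helper: length of the longest run of consecutive calendar days
longest_streak <- function(days) {
  days <- sort(unique(days))                   # added guard: unsorted or duplicated rows
  run_id <- cumsum(c(TRUE, diff(days) != 1))   # start a new run id after every gap
  max(table(run_id))                           # size of the largest run
}

# One value per user, named by user_id
streaks <- sapply(split(day_table$day, day_table$user_id), longest_streak)

# Two-column matrix similar to the question's res
res <- cbind(user_id = as.integer(names(streaks)),
             streak  = as.integer(streaks))
print(res)

# rle-based variant: one row per user, one column per day present in the data,
# then the longest run of 1s in each row (0 if a user has no 1s at all)
my_table <- table(day_table$user_id, day_table$day)
rle_streaks <- apply(my_table, 1, function(row) {
  runs <- rle(as.vector(row))
  max(runs$lengths[runs$values == 1], 0)
})
```

One design note: the rle variant only sees days that appear as columns of the table, so a calendar day on which nobody played would be silently skipped and two separate runs could be merged. The diff-based version compares actual dates and does not have that problem.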