data.table not recognising logical in filter

Problem description

In the following snippet, data.table does not seem to recognise logical values when they are used in i.

All my attempts to reproduce the problem in a minimal example failed, which is why I am posting the complete section here. I suspect it is related to the part "as.logical(cumsum(CURRENT_TRIP))", but that is just a gut feeling...
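
As a side note, the suspected expression can be looked at in isolation in base R. This is only a minimal sketch with a made-up vector (`flags` is a hypothetical name that does not appear in the code of this question); it shows what the expression is meant to compute within one trip, not the cause of the problem.

```r
# Hypothetical logical vector standing in for CURRENT_TRIP within one trip
flags <- c(FALSE, FALSE, TRUE, FALSE, FALSE)

# cumsum() coerces FALSE/TRUE to 0/1 and returns a running total
running <- cumsum(flags)

# as.logical() maps 0 to FALSE and any non-zero value to TRUE,
# so every position from the first TRUE onwards becomes TRUE
filled <- as.logical(running)

print(running)
print(filled)
```

In other words, the expression marks the boarding stop and every later stop of the same trip, which is why it is applied per TRIP.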

library(data.table)

# Testdata
timetable <- data.table(rbind(
    c("r1", "t1_1", "p1", 10, 10),
    c("r1", "t1_1", "p2", 11, 11),
    c("r1", "t1_1", "p3", 12, 12),
    c("r1", "t1_1", "p4", 13, 13),
    c("r1", "t1_1", "p5", 14, 14),
    c("r1", "t1_1", "p6", 15, 15),
    c("r1", "t1_1", "p7", 16, 16),
    c("r1", "t1_1", "p8", 17, 17),
    c("r1", "t1_1", "p9", 18, 18),
    c("r1", "t1_1", "p10", 19, 19),

    c("r2", "t2", "p11", 9, 9),
    c("r2", "t2", "p12", 10, 10),
    c("r2", "t2", "p3", 11, 11),
    c("r2", "t2", "p13", 12, 12),
    c("r2", "t2", "p14", 13, 13),
    c("r2", "t2", "p15", 14, 14),
    c("r2", "t2", "p16", 15, 15),
    c("r2", "t2", "p17", 16, 16),
    c("r2", "t2", "p18", 17, 17)
  ))
setnames(timetable, c("ROUTE", "TRIP", "STOP", "ARRIVAL", "DEPARTURE"))
timetable[, ':='(ARRIVAL = as.integer(ARRIVAL), DEPARTURE = as.integer(DEPARTURE))]


# Input
startStation <- "p3"
startTime <- 8

setorder(timetable, TRIP, ARRIVAL)
timetable[, ID := .I]

timetable[,':='(ARR_ROUND_PREV = Inf, ARR_ROUND = Inf, ARR_BEST = Inf, MARKED = F, CURRENT_TRIP = F)]
timetable[STOP == startStation, ':='(ARR_ROUND_PREV = startTime, ARR_ROUND = startTime, ARR_BEST = startTime, MARKED = T)]

routes <- timetable[MARKED == T, unique(ROUTE)] 
ids <- timetable[MARKED == T & DEPARTURE > ARR_ROUND, .(ID = ID[DEPARTURE == min(DEPARTURE)]), by = ROUTE][, ID]

timetable[ID %in% ids, CURRENT_TRIP := T]
timetable[, MARKED := F]

trips <- timetable[CURRENT_TRIP == T, unique(TRIP)]
timetable[TRIP %in% trips, CURRENT_TRIP := as.logical(cumsum(CURRENT_TRIP)), by = TRIP]

# ?
timetable
nrow(timetable[CURRENT_TRIP == T]) #8
sum(timetable$CURRENT_TRIP == T) #15

# but 
nrow(timetable[CURRENT_TRIP > 0]) #15
nrow(timetable[CURRENT_TRIP == 1L]) #15

Any ideas?

The problem shows up with data.table 1.9.6 and the newest 1.9.7, using R 3.2.3 on 64-bit Windows.

Fab

Solution

You have exactly the same bug that I have! See my question:

Strange issue with data.table row search

I could not reproduce it with minimal code either!

My workaround for your code is to change how you set the column CURRENT_TRIP.

timetable[ID %in% ids]$CURRENT_TRIP <- T
timetable[, MARKED := F]

trips <- timetable[CURRENT_TRIP == T, unique(TRIP)]
timetable[TRIP %in% trips]$CURRENT_TRIP <- timetable[,as.logical(cumsum(CURRENT_TRIP)), by = TRIP]$V1

# ?
timetable
nrow(timetable[CURRENT_TRIP == T]) #8
sum(timetable$CURRENT_TRIP == T) #15

# but 
nrow(timetable[CURRENT_TRIP > 0]) #15
nrow(timetable[CURRENT_TRIP == 1L]) #15

Using the dT[,Column:=T] notation to set columns also caused the same issue for me! I am not sure why, and I am in touch with the creator of data.table to get this fixed.
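
Since neither post pins down the cause, a small consistency check may help when narrowing it down. The sketch below is an assumption-laden illustration, not a confirmed diagnosis: it supposes that automatic secondary indexing might be involved (nothing in this thread confirms that) and switches it off through the `datatable.auto.index` option before comparing the filter result with a plain vector scan. The table `dt` and the columns `grp` and `flag` are hypothetical stand-ins for timetable, TRIP and CURRENT_TRIP.

```r
library(data.table)

# Assumption: auto-indexing could be related; disable it for this check
old_opts <- options(datatable.auto.index = FALSE)

# Hypothetical stand-in for the timetable: two groups, one TRUE flag each
dt <- data.table(
  grp  = c("a", "a", "a", "b", "b", "b"),
  flag = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE)
)

# Same update pattern as in the question: grouped := with cumsum
dt[, flag := as.logical(cumsum(flag)), by = grp]

# Count the TRUE rows in two ways: data.table filter vs. plain vector scan
n_filter <- nrow(dt[flag == TRUE])
n_scan   <- sum(dt$flag == TRUE)

print(c(filter = n_filter, scan = n_scan))

# Restore the previous option value
options(old_opts)
```

If the two counts agree with the option switched off but disagree with it left at its default, that would point towards the index; if they disagree either way, the cause lies elsewhere.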
