R sqldf not being selective in date range criteria
Problem description
I am trying to select rows whose date is earlier than a given date. It doesn't seem to be working: I am getting all dates, not just those earlier than the given one.
Here is the df structure:
str(sawdf)
'data.frame': 83597 obs. of 10 variables:
$ actiondate : Date, format: "2016-05-08" "2016-05-08" "2016-05-09" ...
Here is some sample data:
head(sawdf)
actiondate
2016-05-14
2016-05-15
2016-05-16
2016-05-17
2016-05-18
Here is my SQL:
sqldf("select distinct actiondate from sawdf where actiondate < '2016-05-18'")
Here are some of the results:
...
6 2016-05-13
7 2016-05-14
8 2016-05-15
9 2016-05-16
10 2016-05-17
11 2016-05-18
12 2016-05-19
As you can see, dates on or after 2016-05-18 are being selected.
I've tried several approaches but am getting the same results.
Thanks
Recommended answer
1) sqlite Assuming you are using the default SQLite backend, SQLite does not have a date type, so the dates are transferred to SQLite as the number of days since the UNIX Epoch. That is, on the SQLite side actiondate is a column of numbers. (If x were a "Date" class R variable then as.numeric(x) gives the number(s) transferred to SQLite.) We need to compare these numbers to an appropriate number, not to a character string; in SQLite any numeric value compares as less than any text value, which is why the query in the question returns every row. The following works because it converts the comparison date in the same way (i.e. it replaces $date0 with 16939, which is the number of days from the UNIX Epoch to that date):
library(sqldf)
date0 <- as.Date("2016-05-18")
fn$sqldf("select distinct actiondate from sawdf where actiondate < $date0")
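The number that gets substituted for $date0 can be checked in plain R. The following is a minimal sketch using only base R; the variable names days and roundtrip are illustrative and not part of the original answer.

```r
# A "Date" object is stored as the number of days since 1970-01-01.
date0 <- as.Date("2016-05-18")
days <- as.numeric(date0)
print(days)

# Converting the number back with the UNIX Epoch as origin recovers the date.
roundtrip <- as.Date(days, origin = "1970-01-01")
print(roundtrip)
```

Here days is 16939, the value that the actiondate column is compared against on the SQLite side.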
There is more information on date processing in sqldf with SQLite on the sqldf home page on github: https://github.com/ggrothendieck/sqldf
1a) This would also work since all dates get transferred in the same way:
library(sqldf)
Date0 <- data.frame(date0 = as.Date("2016-05-18"))
sqldf("select distinct actiondate from sawdf where actiondate < (select date0 from Date0)")
1b) Although it is a bit messy, rather than converting the comparison date to numeric, one could convert the actiondate column to character using an SQLite builtin function:
sqldf("select distinct actiondate from sawdf
where strftime('%Y-%m-%d', actiondate * 3600 * 24, 'unixepoch') < '2016-05-18'")
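The arithmetic in that query can be mirrored in base R to see what the converted column looks like. This is only a sketch of the same conversion done on the R side, not what sqldf runs internally; the names days, as_text and keep are illustrative.

```r
# Days since the epoch, as SQLite would see the actiondate column.
days <- as.numeric(as.Date(c("2016-05-17", "2016-05-18")))

# days * 3600 * 24 is seconds since the epoch; format that as a UTC date string.
secs <- as.POSIXct(days * 3600 * 24, origin = "1970-01-01", tz = "UTC")
as_text <- format(secs, format = "%Y-%m-%d", tz = "UTC")
print(as_text)

# Dates written as YYYY-MM-DD sort chronologically, so a string comparison works.
keep <- as_text < "2016-05-18"
print(keep)
```

The string comparison is only safe because the year-month-day format puts the most significant field first.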
2) H2 Alternatively, use the H2 backend, which does have a date type. In that case the code in the question works as written. Install RH2 (which includes H2) and make sure Java is installed on your machine. Then:
library(RH2)
library(sqldf)
sqldf("select distinct actiondate from sawdf where actiondate < '2016-05-18'")
Note: The input we assumed, in reproducible form, is:
Lines <- "actiondate
2016-05-14
2016-05-15
2016-05-16
2016-05-17
2016-05-18"
sawdf <- read.csv(text = Lines)
sawdf$actiondate <- as.Date(sawdf$actiondate)
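As a cross-check that does not involve SQL at all, the expected result for this sample input can be computed in base R and compared with whatever the queries return. This is a sketch under the assumption that the input is the five dates shown; cutoff and expected are illustrative names.

```r
# The same five sample dates, built directly as a Date column.
sawdf <- data.frame(actiondate = as.Date("2016-05-14") + 0:4)

# Base R counterpart of:
#   select distinct actiondate from sawdf where actiondate < '2016-05-18'
cutoff <- as.Date("2016-05-18")
expected <- unique(sawdf$actiondate[sawdf$actiondate < cutoff])
print(expected)
```

With this input the result should contain the four dates from 2016-05-14 to 2016-05-17 and exclude 2016-05-18.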