R call variable inside sqldf
Question
I need to loop over sqldf statements, and for that I need to use the loop variable inside the sqldf query:
My table "data" could be:
data <- read.table(text ="
loaddate DaysRange DaysRangeNext
1 2014-03-16 30 30
2 2014-03-16 0 0
3 2014-03-16 0 0
4 2014-03-16 60 NA
5 2014-04-16 30 30
6 2014-04-16 0 30
"
,header = TRUE)
Then I format loaddate as a date:
data$loaddate<-as.Date(as.character(data$loaddate), format='%Y-%m-%d')
Let's say I have a vector "loaddates":
loaddates<- unique(sort(data$loaddate))
And I need to run the following code for each loaddate:
for (i in loaddates) {
sqldf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext,
COUNT(*) AS clientes
FROM data AS D
WHERE D.loaddate = i
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
") }
But I get the following error:
Error in sqliteSendQuery(con, statement, bind.data) : error in statement: no such column: i
Is there a way to keep the variable's value and use it inside the loop?
Thanks.
Edit:
I tried:
sqldf(
strwrap(sprintf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext,
COUNT(*) AS clientes
FROM data AS D
WHERE D.LoadDate = '%s'
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
",i),simplify=TRUE,width=1000000))
But I get:
> [1] loaddate DaysRange DaysRangeNext clientes <0 rows>
> (or 0-length row.names)
Recommended answer
The variable i will not be replaced in the query as written: SQLite sees the literal text "i" and looks for a column with that name. You need sprintf to substitute its value into the query string. (I don't know whether you need to account for line breaks, but to be safe I handle them below with strwrap. sqldf may not need it, in which case just remove strwrap.)
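As a minimal sketch of the substitution on its own, using base R only: the value of i below is a hypothetical stand-in for one element of loaddates, and the query is shortened for illustration.

```r
# Hypothetical value standing in for one element of loaddates
i <- "2014-03-16"

# %s is replaced by the value of i; the surrounding single quotes
# make it a string literal on the SQL side
query <- sprintf(
  "SELECT COUNT(*) AS clientes FROM data WHERE loaddate = '%s'",
  i
)
print(query)
```

The resulting string is what gets handed to sqldf, so printing it first is a quick way to check that the value landed where you expected.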
#let's assume loaddates is the following:
loaddates <- 'something'
One way to get the query you want, i.e. with no line breaks and with i taking the values of loaddates you need:
strwrap(sprintf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext,
COUNT(*) AS clientes
FROM deuda AS D
WHERE D.loaddate = '%s'
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
",i),simplify=TRUE,width=1000000)
With i set to that value, it outputs:
[1] "SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, COUNT(*) AS clientes FROM deuda AS D WHERE D.loaddate = 'something' GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext ORDER BY D.DaysRange, D.DaysRangeNext"
This is the query you need: on a single line, with no line breaks, and with the value of i substituted in.
In your loop it should be:
for (i in loaddates) {
strwrap(sprintf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext,
COUNT(*) AS clientes
FROM deuda AS D
WHERE D.loaddate = '%s'
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
",i),simplify=TRUE,width=1000000)
}
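Note that a for loop does not auto-print, so the strings built in the loop above are discarded unless they are printed or stored. A minimal sketch of storing them, assuming character dates like those in the question's data (the names template and queries are illustrative, not part of the original answer):

```r
# Assumed character dates taken from the question's data
loaddates <- c("2014-03-16", "2014-04-16")

template <- "
SELECT D.LoadDate, D.DaysRange, D.DaysRangeNext,
COUNT(*) AS clientes
FROM data AS D
WHERE D.loaddate = '%s'
GROUP BY D.LoadDate, D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
"

# Build one single-line query per load date; strwrap collapses the line breaks
queries <- vapply(
  loaddates,
  function(d) strwrap(sprintf(template, d), width = 1000000, simplify = TRUE),
  character(1)
)
print(queries)
```

Each element of queries can then be passed to sqldf in turn.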
Using your dataset:
library(sqldf)
data <- read.table(text ="
loaddate DaysRange DaysRangeNext
1 2014-03-16 30 30
2 2014-03-16 0 0
3 2014-03-16 0 0
4 2014-03-16 60 NA
5 2014-04-16 30 30
6 2014-04-16 0 30
"
,header = TRUE, stringsAsFactors = FALSE)
loaddates<- unique(sort(data$loaddate))
for (i in loaddates) {
print(sqldf(
strwrap(sprintf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext,
COUNT(*) AS clientes
FROM data AS D
WHERE D.loaddate = '%s'
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
",i),simplify=TRUE,width=1000000) ))
}
Output:
loaddate DaysRange DaysRangeNext clientes
1 2014-03-16 0 0 2
2 2014-03-16 30 30 1
3 2014-03-16 60 NA 1
loaddate DaysRange DaysRangeNext clientes
1 2014-04-16 0 30 1
2 2014-04-16 30 30 1
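One difference from the question's setup is worth pointing out: here loaddate is left as a character column and is never converted with as.Date. In R, iterating over a Date vector with for drops the Date class, so i becomes a plain number rather than a date string. This may be part of why the attempt in the question behaved differently, though I have not verified that against the asker's exact environment. A small base R sketch of the effect (the names dates and clauses are illustrative):

```r
dates <- as.Date(c("2014-03-16", "2014-04-16"))

# Looping over the Date vector directly: the class is dropped, so i is numeric
for (i in dates) {
  print(class(i))
}

# Looping over formatted strings keeps the text form needed for the SQL literal
clauses <- character(0)
for (d in format(dates, "%Y-%m-%d")) {
  clauses <- c(clauses, sprintf("WHERE D.loaddate = '%s'", d))
}
print(clauses)
```

Whether the string form then matches depends on how loaddate is stored in the table. In the code above it is a character column, so the comparison is text against text.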