R call variable inside sqldf

Question

I need to loop over sqldf statements, and to do that I need to reference the loop variable inside the sqldf query:

My table "data" might look like this:

data <- read.table(text ="
    loaddate DaysRange DaysRangeNext
1 2014-03-16        30            30
2 2014-03-16         0             0
3 2014-03-16         0             0
4 2014-03-16        60            NA
5 2014-04-16        30            30
6 2014-04-16         0            30
"
,header = TRUE)

Then I format loaddate as a date:

data$loaddate<-as.Date(as.character(data$loaddate), format='%Y-%m-%d')

Let's say I have a vector "loaddates":

loaddates<- unique(sort(data$loaddate))

I need to run the following code for each loaddate:

for (i in loaddates) {
  sqldf("
    SELECT D.LoadDate, D.DaysRange, D.DaysRangeNext,
           COUNT(*) AS clientes
    FROM data AS D
    WHERE D.loaddate = i
    GROUP BY D.LoadDate, D.DaysRange, D.DaysRangeNext
    ORDER BY D.DaysRange, D.DaysRangeNext
  ")
}

But I get the following error:

Error in sqliteSendQuery(con, statement, bind.data) : error in statement: no such column: i

Is there a way to keep the variable's value and use it inside the loop?

Thanks.

Edit:

I tried:

sqldf(
strwrap(sprintf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
COUNT(*) AS clientes 
FROM data AS D
WHERE D.LoadDate = '%s'
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
",i),simplify=TRUE,width=1000000))

But I got:

[1] loaddate      DaysRange     DaysRangeNext clientes
<0 rows> (or 0-length row.names)

Recommended answer

The variable i is not substituted into the query as written: sqldf receives the literal text, so SQLite looks for a column named i. You need sprintf to insert its value into the query string. (I don't know whether you need to deal with the line breaks, but I handle them below just to be sure. If sqldf doesn't need that, just remove strwrap.)

# let's assume loaddates is the following:
loaddates <- 'something'
# and that i currently holds that value:
i <- loaddates

One way to get the query you want, i.e. with no line breaks and with i replaced by the loaddates value you need:

strwrap(sprintf("
                SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
                COUNT(*) AS clientes
                FROM deuda AS D
                WHERE D.loaddate = '%s'
                GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
                ORDER BY D.DaysRange, D.DaysRangeNext
                ",i),simplify=TRUE,width=1000000)

This outputs:

[1] "SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, COUNT(*) AS clientes FROM deuda AS D WHERE D.loaddate = 'something' GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext ORDER BY D.DaysRange, D.DaysRangeNext"

That is the query you need: a single line with no line breaks, and with the value of i substituted in rather than the bare name i.
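As an alternative to sprintf, sqldf can be called through the fn$ prefix from the gsubfn package (attached together with sqldf), which interpolates $name references inside the query string. The sketch below assumes loaddate is kept as character, and the names d and results are only illustrative:

```r
library(sqldf)  # attaching sqldf also attaches gsubfn, which provides fn$

data <- read.table(text = "
loaddate DaysRange DaysRangeNext
2014-03-16 30 30
2014-03-16 0 0
2014-03-16 0 0
2014-03-16 60 NA
2014-04-16 30 30
2014-04-16 0 30
", header = TRUE, stringsAsFactors = FALSE)

results <- list()  # one result data frame per load date
for (d in unique(sort(data$loaddate))) {
  # fn$ replaces $d in the string with the current value of d
  results[[d]] <- fn$sqldf("
    SELECT D.loaddate, D.DaysRange, D.DaysRangeNext, COUNT(*) AS clientes
    FROM data AS D
    WHERE D.loaddate = '$d'
    GROUP BY D.loaddate, D.DaysRange, D.DaysRangeNext
    ORDER BY D.DaysRange, D.DaysRangeNext")
}
print(results)
```

Storing each result in a list, rather than only printing it, keeps the per-date tables available after the loop finishes.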

Inside your loop, the sprintf version becomes:

for (i in loaddates) {

strwrap(sprintf("
                SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
                COUNT(*) AS clientes
                FROM deuda AS D
                WHERE D.loaddate = '%s'
                GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
                ORDER BY D.DaysRange, D.DaysRangeNext
                ",i),simplify=TRUE,width=1000000)

}
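The loop above builds each query string but does not keep or run it. A minimal sketch of collecting the strings for later use, assuming a hypothetical helper build_query (not part of sqldf) and the table name deuda used in the answer:

```r
# Hypothetical helper: returns the query for one load date as a single line.
build_query <- function(load_date) {
  sql <- sprintf("
    SELECT D.LoadDate, D.DaysRange, D.DaysRangeNext,
           COUNT(*) AS clientes
    FROM deuda AS D
    WHERE D.loaddate = '%s'
    GROUP BY D.LoadDate, D.DaysRange, D.DaysRangeNext
    ORDER BY D.DaysRange, D.DaysRangeNext
  ", load_date)
  # strwrap collapses runs of whitespace and newlines; paste guarantees
  # that exactly one string comes back
  paste(strwrap(sql, width = 1000000), collapse = " ")
}

loaddates <- c("2014-03-16", "2014-04-16")  # example values as character
queries <- vapply(loaddates, build_query, character(1))
print(unname(queries))
```

Each element of queries can then be passed to sqldf() once the referenced table exists in the session.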

Using your dataset:

library(sqldf)
data <- read.table(text ="
    loaddate DaysRange DaysRangeNext
1 2014-03-16        30            30
2 2014-03-16         0             0
3 2014-03-16         0             0
4 2014-03-16        60            NA
5 2014-04-16        30            30
6 2014-04-16         0            30
"
                   ,header = TRUE,stringsAsFactors=FALSE)

loaddates<- unique(sort(data$loaddate))

for (i in loaddates) {

  print(sqldf(
  strwrap(sprintf("
                SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
                COUNT(*) AS clientes
                FROM data AS D
                WHERE D.loaddate = '%s'
                GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
                ORDER BY D.DaysRange, D.DaysRangeNext
                ",i),simplify=TRUE,width=1000000) ))
}

Output:

    loaddate DaysRange DaysRangeNext clientes
1 2014-03-16         0             0        2
2 2014-03-16        30            30        1
3 2014-03-16        60            NA        1
    loaddate DaysRange DaysRangeNext clientes
1 2014-04-16         0            30        1
2 2014-04-16        30            30        1
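Note that the example above leaves loaddate as character. If the column has been converted with as.Date, as in the question, a for loop over a Date vector yields the underlying day counts rather than Date objects, so sprintf('%s', i) inserts a number instead of a date string. This may be what produced the empty result in the question's second attempt. The base R sketch below shows the effect and one possible workaround (formatting the dates to strings first); whether such a string matches what sqldf stores for a Date column depends on how the column is written to SQLite, which is not checked here. The names seen and keys are illustrative.

```r
loaddates <- as.Date(c("2014-03-16", "2014-04-16"))

# for() drops the Date class, so i is a plain number (days since 1970-01-01)
seen <- character(0)
for (i in loaddates) {
  seen <- c(seen, sprintf("%s", i))
}
print(seen)

# Possible workaround: loop over formatted strings instead of Date values
keys <- format(loaddates, "%Y-%m-%d")
for (d in keys) {
  print(sprintf("WHERE D.loaddate = '%s'", d))
}
```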
