foreach %dopar% + RPostgreSQL


Problem description

I am using RPostgreSQL to connect to a local database. The setup works just fine on my Linux machine. R 2.11.1, Postgres 8.4.

I was experimenting with the foreach package and the multicore (doMC) parallel backend to wrap a few thousand repetitive queries and append the results to a data structure. Curiously, the loop works with %do% but fails when I switch to %dopar%, except when there is only one iteration (as shown below).

I wondered whether it had something to do with sharing a single connection object, so I created 10 connection objects and picked one for each query based on i modulo 10 (shown below with just 2 connection objects). The expression evaluated by eval(expr.01) is the query, which depends on 'i'.

I can't make sense of these particular error messages. I am wondering whether there is any way to make this work.

Thanks.
Vishal Belsare

R snippet follows:

> id.qed2.foreach <- foreach(i = 1588:1588, .inorder=FALSE) %dopar% { 
+ if (i %% 2 == 0) {con <- con0}; 
+ if (i %% 2 == 1) {con <- con1}; 
+ fetch(dbSendQuery(con,eval(expr.01)),n=-1)$idreuters};
> id.qed2.foreach
[[1]]
  [1]   411   414  2140  2406  4490  4507  4519  4570  4571  4572  4703  4731
[109] 48765 84312 91797

> id.qed2.foreach <- foreach(i = 1588:1589, .inorder=FALSE) %dopar% { 
+ if (i %% 2 == 0) {con <- con0}; 
+ if (i %% 2 == 1) {con <- con1}; 
+ fetch(dbSendQuery(con,eval(expr.01)),n=-1)$idreuters};
Error in stop(paste("expired", class(con))) : 
  no function to return from, jumping to top level
Error in stop(paste("expired", class(con))) : 
  no function to return from, jumping to top level
Error in { : 
  task 1 failed - "error in evaluating the argument 'res' in selecting a method for function 'fetch'"
> 

EDIT: I changed a few things (still unsuccessfully), but a few things came to light. Connection objects created in the loop and not closed via dbDisconnect lead to hanging connections, as is evident from the Postgres logs under /var/log. A few new error messages show up when I do this:

> system.time(
+ id.qed2.foreach <- foreach(i = 1588:1590, .inorder=FALSE, 
.packages=c("DBI", "RPostgreSQL")) %dopar% {drv0 <- dbDriver("PostgreSQL"); 
con0 <- dbConnect(drv0, dbname='nseindia');
list(idreuters=fetch(dbSendQuery(con0,eval(expr.01)),n=-1)$idreuters);
dbDisconnect(con0)})
Error in postgresqlExecStatement(conn, statement, ...) : 
  no function to return from, jumping to top level
Error in postgresqlExecStatement(conn, statement, ...) : 
  no function to return from, jumping to top level
Error in postgresqlExecStatement(conn, statement, ...) : 
  no function to return from, jumping to top level
Error in { : 
  task 1 failed - "error in evaluating the argument 'res' in selecting a method for function 'fetch'"

Solution

The following works and is about 1.5x faster than the sequential form. As a next step, I am wondering whether it is possible to attach a connection object to each of the workers spawned by registerDoMC. If so, there would be no need to create and destroy connection objects on every iteration, which would avoid overwhelming the PostgreSQL server with connections.

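pgparquery <- function(i) {
  # Open the connection inside the worker instead of reusing one
  # created in the parent session.
  drv <- dbDriver("PostgreSQL")
  con <- dbConnect(drv, dbname = 'nsdq')
  on.exit(dbDisconnect(con))   # release the connection even if the query fails
  lst <- eval(expr.01)         # contains the SQL query which depends on 'i'
  qry <- dbSendQuery(con, lst)
  tmp <- fetch(qry, n = -1)    # n = -1 fetches all remaining rows
  dbClearResult(qry)           # free the result set before disconnecting
  list(date = dates.qed2[i], idreuters = tmp$idreuters)
}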

id.qed.foreach <- foreach(i = 1588:3638, .inorder=FALSE, .packages=c("DBI", "RPostgreSQL")) %dopar% {pgparquery(i)}
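On the next step mentioned above (one connection per worker rather than one per iteration), a minimal sketch could cache a lazily opened connection per process, keyed by the process ID. This is an illustration under assumptions, not a tested recipe for doMC: get_worker_con, .worker_cache, open_con and pgparquery_cached are hypothetical names introduced here, and the approach only saves work if the backend lets a worker process handle more than one task, which depends on how it schedules tasks. The connection is not closed explicitly; it goes away when the worker process exits.

```r
# Per-process cache for a lazily opened resource (here: a DB connection).
.worker_cache <- new.env(parent = emptyenv())

# open_con is a zero-argument function that opens a connection.
# It is a hypothetical hook introduced for this sketch.
get_worker_con <- function(open_con) {
  pid <- Sys.getpid()
  # Open a new connection when this process has none of its own yet,
  # e.g. in a freshly forked worker that inherited the parent's cache.
  if (!identical(.worker_cache$pid, pid)) {
    .worker_cache$con <- open_con()
    .worker_cache$pid <- pid
  }
  .worker_cache$con
}

# Variant of pgparquery that reuses the worker's cached connection.
# expr.01 and dates.qed2 are the objects from the original post.
pgparquery_cached <- function(i) {
  con <- get_worker_con(function() {
    DBI::dbConnect(RPostgreSQL::PostgreSQL(), dbname = "nsdq")
  })
  sql <- eval(expr.01)             # SQL query which depends on 'i'
  tmp <- DBI::dbGetQuery(con, sql) # send the query and fetch all rows
  list(date = dates.qed2[i], idreuters = tmp$idreuters)
}
```

It would be used in place of pgparquery(i) inside the %dopar% body. Keying the cache on Sys.getpid() is meant to stop a forked worker from reusing a connection object that was opened in the parent process.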

--
Vishal Belsare
