远程ODBC前言 [英] RODBC & foreach

查看:108
本文介绍了远程ODBC前言的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想使用foreach语法查询本地存储的数据库.

I would like to use a foreach syntax for querying a locally stored database.

我的代码如下:

h <- odbcDriverConnect(connection)

cl<-makeCluster(no_cores)
registerDoParallel(cl)

foreach(i = 1:dim(Table)[1], 
        .combine = rbind,
        .export = "h",
        .packages = "RODBC")  %dopar%  {
          cat(i,"\n")
          #h <- odbcDriverConnect(connection)
          sqlQuery(query)
        }

odbcCloseAll()
stopCluster(cl)

当我使用%do%而不是%dopar%时,它可以正常工作,但是当我尝试使其平行时,会出现此错误:

When I use %do% instead of %dopar% it works properly, but when I try to make it a parallel I get this error:

{中的错误:任务1失败-第一个参数不是打开的RODBC频道",另外:警告消息:在e $ fun(obj,replace(ex),parent.frame(),e $ data):已导出变量:h

Error in { : task 1 failed - "first argument is not an open RODBC channel" In addition: Warning message: In e$fun(obj, substitute(ex), parent.frame(), e$data) : already exporting variable(s): h

当我将h放置在foreach循环中时,它可以工作,但是我知道在每个步骤上都创建连接,所以我想避免这种情况.有什么方法可以将该连接导出到foreach(与其他变量进行连接的方式相同)?

When I place h inside the foreach loop it works, however I understand that create connection on every single step and I would like to avoid that. Is there any way I could export that connection to foreach (the same way it is made with other variables)?

推荐答案

您无法将数据库连接导出到工作程序,因为它们无法正确地序列化和反序列化(因为它们包含套接字连接之类的东西).相反,您需要在worker上创建它们,但是为了提高效率,应该创建一次,然后在foreach循环中重复使用它们.

You can't export database connections to the workers because they can't be properly serialized and deserialized (since they contain things like socket connections). Instead, you need to create them on the workers, but for efficiency, you should create them once and then use them repeatedly from your foreach loops.

由于您使用的是 doParallel ,因此可以使用 clusterEvalQ 初始化工作程序:

Since you're using doParallel, you can initialize the workers with clusterEvalQ:

clusterEvalQ(cl, {
  library(RODBC)
  connection <- "???"  # how are you setting this?
  h <- odbcDriverConnect(connection)
})

然后,只要您防止foreach自动将 h 导出到工作程序,就可以在foreach循环中使用 h :

You can then use h from your foreach loops as long as you prevent foreach from auto-exporting h to the workers:

foreach(i = 1:dim(Table)[1],
        .combine = rbind,
        .noexport = "h",  # make sure h is *not* exported!
        .packages = "RODBC")  %dopar%  {
  cat(i, "\n")
  sqlQuery(h, query)
}

使用 .noexport ="h" 很重要,否则,良好的 h 将被屏蔽,并且您会从RODBC中得到一个错误.

It's important to use .noexport="h", otherwise the good h will be masked and you'll get an error from RODBC.

这篇关于远程ODBC前言的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆