Error when trying to interactively load data file saved by paused batch script

Question

In the process of debugging and solving my problem with retrieving attributes (Can I access R data objects' attributes without fully loading objects from file?), based on advice here on SO, I switched from using save() and load() to saveRDS() and readRDS(), correspondingly.

My investigation (via non-interactive debug printing) showed the following:

  1. immediately after the initial saveRDS(), the saved object contains the attribute in question;

  2. an interactive R session, performed after the initial run of the script, shows that the attribute is absent from the saved object;

  3. the findings above explain the failure to retrieve the attribute during the next run of the script, which I initially and incorrectly attributed to save/load and saveRDS/readRDS behavior.

In order to manually confirm the presence of the attribute in the persistent object (saved in an .rds file) immediately after the initial saveRDS, I decided to pause the batch R script running in one terminal window using scan (readline doesn't appear to work for this in batch R scripts):

if (DEBUG) {
  cat("Press [Enter] to continue")
  key <- scan("stdin", character(), n=1)
}

and, in another terminal window, to inspect the saved object via an interactive R session.

However, after the batch script paused as expected, loading the saved object from the .rds file in an interactive session failed with the following message:

> load("../cache/SourceForge/ZGV2TGlua3M=.rds")
Error: bad restore file magic number (file may be corrupted) -- no data loaded
In addition: Warning message:
file ‘ZGV2TGlua3M=.rds’ has magic number 'X'
  Use of save versions prior to 2 is deprecated

The following output describes my R environment at the time of investigation:

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

The only explanation that seems plausible to me is that the batch session (specifically, the pause via scan) somehow locks or modifies the environment, making it impossible to properly access R objects from within the interactive session. There may be other possible reasons for this situation. I would greatly appreciate any help or advice in solving this problem!

UPDATE:

After killing the batch R script's process (which after scan became unresponsive), I again tried to manually load the .rds file, expecting a success due to the absence of the pause in the batch script. However, to my surprise, I was greeted with the exact same error message. This makes me think that the .rds file is really corrupted (potentially due to my practice of stopping a running batch R script by repeatedly pressing Ctrl-C - I will need to come up with something more "gentle"). After figuring out a better way to stop a running script, I will try to reproduce the scenario and report here.

UPDATE 2:

After removing all (potentially corrupted) .rds files from the cache directory and following the scenario described above (loading the R data file interactively with the batch R script paused), the output presented exactly the same error message as before. At this point, I really need advice to figure out what's going on.

UPDATE 3 (saving the object):

assign(dataName, srdaGetData())
data <- as.name(dataName)

# save hash of the request's SQL query as data object's attribute,
# so that we can detect when configuration contains modified query
attr(data, "SQL") <- base64(request)

# save current data frame to RDS file
saveRDS(data, rdataFile)

UPDATE 4 (reproducible example):

library(RCurl)

info <- "Important data"
request <- "SELECT info FROM topSecret"
dataName <- "sf.data.devLinks"
rdataFile <- "/tmp/testAttr.rds"

getData <- function() {
  return (info)
}

requestDigest <- base64(request)

# check if the archive file has already been processed
message("\nProcessing request \"", request, "\" ...\n")

# read back the object with the attribute
if (file.exists(rdataFile)) {
  # now check if request's SQL query hasn't been modified
  data <- readRDS(rdataFile)
  message("Retrieved object '", as.name(data), "', containing:\n")
  message(toString(data))

  requestAttrib <- attr(data, "SQL", exact = TRUE)
  message("\nObject '", data, "' contains attribute:\n\"",
                 base64(requestAttrib), "\"\n")

  if (identical(requestDigest, requestAttrib)) {
    message("Processing skipped: RDS file is up-to-date.\n")
    stop()
  }
  rm(data)
}

message("Saving results of request \"",
        request, "\" as R data object ...\n")

assign(dataName, getData())
data <- as.name(dataName)

# save hash of the request's SQL query as data object's attribute,
# so that we can detect when configuration contains modified query
attr(data, "SQL") <- base64(request)

# save current data frame to RDS file
saveRDS(data, rdataFile)

I expect the value of the variable named by dataName to be saved; however, the code saves the name of the variable.
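The distinction described here can be illustrated with a minimal sketch. It assumes the intent is to save the value bound to the name stored in dataName, and it uses get() for that lookup, which is one possible approach and not necessarily the one the author ended up with. The file path is a temporary stand-in, and the base64 encoding from the question is left out to keep the example free of package dependencies.

```r
# Stand-ins for the names used in the question (values are illustrative)
dataName  <- "sf.data.devLinks"
request   <- "SELECT info FROM topSecret"
rdataFile <- tempfile(fileext = ".rds")   # hypothetical path, not the real cache file

assign(dataName, "Important data")

# as.name() only builds a symbol from the string; it does not look up the value
sym <- as.name(dataName)

# get() looks up the object bound to that name and returns its value
data <- get(dataName)

# attach the query as an attribute (the question encodes it with base64 first)
attr(data, "SQL") <- request

# round trip through an RDS file
saveRDS(data, rdataFile)
restored <- readRDS(rdataFile)
```

In this sketch, restored holds the original value together with its "SQL" attribute, whereas sym is just a symbol that carries no data.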

Recommended answer

If you save something using saveRDS, the equivalent loading function is readRDS. If you save an object into an RData file with save, you should use load to load the object.

readRDS returns the saved object, so you choose the name by assigning the result to whatever variable you like.

load loads the objects in an .RData file, and they will retain the names with which they were saved.

If "../cache/SourceForge/ZGV2TGlua3M=.rds" was saved using saveRDS, then

whatever <- readRDS("../cache/SourceForge/ZGV2TGlua3M=.rds")

will load the object as whatever.

Running load on a file not saved in .RData format will result in the error message you posted.
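A small sketch of the two pairings may help. The object and the temporary file names are made up for illustration; only base R functions are used.

```r
obj <- c(a = 1, b = 2)

rdsFile   <- tempfile(fileext = ".rds")     # illustrative temporary files
rdataFile <- tempfile(fileext = ".RData")

# saveRDS()/readRDS(): a single object, no name stored; assign the result yourself
saveRDS(obj, rdsFile)
whatever <- readRDS(rdsFile)

# save()/load(): objects are stored by name and restored under those names
save(obj, file = rdataFile)
env <- new.env()
loaded <- load(rdataFile, envir = env)      # returns the names of the restored objects

# load() on a file written by saveRDS() fails; capture the error instead of stopping
res <- tryCatch(
  suppressWarnings(load(rdsFile, envir = new.env())),
  error = function(e) e
)
```

Here whatever is a copy of obj under a new name, loaded contains the original name "obj", and res holds the error condition raised by the mismatched load() call.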
