Lock or protect a data.table in R


Question

Are there one or more ways to lock or protect a data.table such that it can no longer be modified in-place?

Say we have a data.table:

library(data.table)
dt <- data.table(id = 1, val = "foo")
dt
#    id val
# 1:  1 foo

Can I then do something to dt so that it behaves as follows afterwards?

dt[, val:="bar"]
# error or warning
dt
#    id val
# 1:  1 foo  ## unmodified



Context

This came up because I author a small R package at work that uses data.table extensively. It has some data.tables in it (translation tables) which, if accidentally modified by a user, would cause issues (improper translations). I had hoped that making the data "internal" (as defined here) would solve this but it does not.

Because this is only an issue with data.table objects, I could just use data.frames, copying + casting to data.table as needed within functions. I will go this route if needed (my tables are small enough that the time/memory overhead won't be noticed), but I'm hopeful there's a more natural solution.
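The data.frame fallback described above might look roughly like the sketch below. The names translation_df and translate are hypothetical, and the table contents are made up for illustration.

```r
library(data.table)

# Hypothetical translation table, stored as a plain data.frame.
translation_df <- data.frame(
  code  = c("a", "b"),
  label = c("Alpha", "Beta"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert to a data.table inside the function,
# then look the codes up with a join.
translate <- function(codes) {
  tt <- as.data.table(translation_df)  # local data.table built from the data.frame
  tt[.(code = codes), on = "code", label]
}

translate(c("b", "a"))
```

Because the stored object is a data.frame, an accidental `:=` on it fails instead of silently changing the table, although ordinary assignment can still replace it.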

Recommended answer

Here are a couple of possible ideas.

You could write your own wrapper object (possibly using the R6 package) that defines all the editing tools to raise an error and leave the underlying data.table unchanged, but uses the standard data.table access functionality for reading the object.
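A much simpler variant of the wrapper idea can be sketched with a plain closure instead of R6. This is not the full wrapper described above: rather than overriding the editing tools, it only ever hands out copies. The name make_locked_table is hypothetical.

```r
library(data.table)

# Hypothetical constructor: keeps a private copy of the table inside a
# closure and exposes only a getter that returns a fresh copy.
make_locked_table <- function(dt) {
  stored <- copy(dt)               # private copy, detached from the caller's object
  list(
    get = function() copy(stored)  # callers only ever receive a copy
  )
}

locked <- make_locked_table(data.table(id = 1, val = "foo"))

tmp <- locked$get()
tmp[, val := "bar"]   # modifies only the copy
locked$get()          # the stored table still has val == "foo"
```

The cost is one copy per access, which should be negligible for small tables. A determined user can still reach the stored object through the closure's environment.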

You could follow the approach of the petals function in the TeachingDemos package.

Neither of the above is perfect, and a determined person could still change them. They are probably also not worth the work needed.

You could reread your tables each time your function runs, so that changes would need to be made on the disk, not just in R.
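As a rough sketch of rereading on every call: the temporary CSV below stands in for a file shipped with the package (which would more likely be located with system.file()), and translate_id is a hypothetical function name.

```r
library(data.table)

# Stand-in for a data file shipped with the package.
table_path <- tempfile(fileext = ".csv")
fwrite(data.table(id = 1L, val = "foo"), table_path)

# Hypothetical function: reads a fresh table from disk each time it runs,
# so in-memory modifications elsewhere cannot affect the lookup.
translate_id <- function(ids, path = table_path) {
  tt <- fread(path)
  tt[.(id = ids), on = "id", val]
}

translate_id(1L)
```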

There are tools/packages to compute things like MD5 sums, so you could calculate one for your data.table, then have the code check the MD5 sum each time it runs and stop if it has changed.
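One possible way to do the checksum check without extra packages is sketched below. It assumes it is acceptable to write the table to a temporary CSV and hash that file with tools::md5sum; the helper names table_md5 and check_unmodified are hypothetical. Note that this detects a modification rather than preventing it.

```r
library(data.table)

# Hypothetical helper: MD5 of the table's CSV representation.
table_md5 <- function(dt) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  fwrite(dt, tmp)
  unname(tools::md5sum(tmp))
}

translation  <- data.table(id = 1, val = "foo")
expected_md5 <- table_md5(translation)   # recorded once, when the table is known to be good

# Hypothetical guard to call at the start of functions that use the table.
check_unmodified <- function(dt, expected) {
  if (!identical(table_md5(dt), expected)) {
    stop("translation table has been modified")
  }
  invisible(TRUE)
}

check_unmodified(translation, expected_md5)   # passes while the table is untouched
```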

You can save the data.tables in an .RData-style file and attach that file to the search path rather than loading it into the workspace. The tables could still be changed, but that is less likely to happen by accident and would take more effort. Make sure that your code does not pick up local copies in the global environment: use get or ::, or check that no local copy exists.
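A minimal sketch of the attach approach follows. The temporary file and the search-path name "translation_tables" are made up for illustration.

```r
library(data.table)

# Save the table to an .RData file (a temporary file stands in for a real one).
translation <- data.table(id = 1, val = "foo")
rda_path <- tempfile(fileext = ".RData")
save(translation, file = rda_path)
rm(translation)   # no copy left in the global environment

# Attach the file to the search path instead of load()-ing it into the workspace.
attach(rda_path, name = "translation_tables")

# Fetch the table explicitly from the attached entry, so a same-named object
# in the global environment is not picked up by mistake.
tt <- get("translation", pos = "translation_tables")
tt$val
```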
