Is it possible to use R package data in testthat tests or run_examples()?

Question

I'm working on developing an R package, using devtools, testthat, and roxygen2. I have a couple of data sets in the data folder (foo.txt and bar.csv).

My file structure looks like this:

/ mypackage
    / data
        * foo.txt, bar.csv
    / inst
        / tests
            * run-all.R, test_1.R
    / man
    / R

I'm pretty sure 'foo' and 'bar' are documented correctly:

    #' Foo data
    #'
    #' Sample foo data
    #'
    #' @name foo
    #' @docType data
    NULL
    #' Bar data
    #'
    #' Sample bar data
    #'
    #' @name bar
    #' @docType data
    NULL

I would like to use the data in 'foo' and 'bar' in my documentation examples and unit tests.

For example, I would like to use these data sets in my testthat tests by calling:

    data(foo)
    data(bar)
    expect_that(foo$col[1], equals(bar$col[1]))

And, I would like the examples in the documentation to look like this:

    #' @examples
    #' data(foo)
    #' functionThatUsesFoo(foo)

If I try to call data(foo) while developing the package, I get the error "data set 'foo' not found". However, if I build the package, install it, and load it - then I can make the tests and examples work.

My current work-arounds are to not run the example:

    #' @examples
    #' \dontrun{data(foo)}
    #' \dontrun{functionThatUsesFoo(foo)}

And in the tests, pre-load the data using a path specific to my local computer:

    foo <- read.delim(pathToFoo, sep="\t", fill = TRUE, comment.char="#")
    bar <- read.delim(pathToBar, sep=";", fill = TRUE, comment.char="#")
    expect_that(foo$col[1], equals(bar$col[1]))

This does not seem ideal - especially since I'm collaborating with others - requiring all the collaborators to have the same full paths to 'foo' and 'bar'. Plus, the examples in the documentation look like they can't be run, even though once the package is installed, they can.

Any suggestions? Thanks much.

Recommended answer

Importing non-RData files within examples/tests

I found a solution to this problem by peering at the RJSONIO package, which obviously needed to provide some examples of reading files other than those of the .RData variety.

I got this to work in function-level examples, and it satisfies both R CMD check mypackage and testthat::test_package().

(1) Reorganize your package structure so that the example data directory is inside inst. At some point R CMD check mypackage told me to move non-RData data files to inst/extdata, so in this new structure the data directory is also renamed to extdata.

/ mypackage
    / inst
        / tests
            * run-all.R, test_1.R
        / extdata
            * foo.txt, bar.csv
    / man
    / R
    / tests
        * run-testthat-mypackage.R

(2) (Optional) Add a top-level tests directory so that your new testthat tests are now also run during R CMD check mypackage.

The run-testthat-mypackage.R script should have at minimum the following two lines:

library("testthat")
test_package("mypackage")

Note that this is the part that allows testthat to be called during R CMD check mypackage; it is not necessary otherwise. You should also add testthat as a "Suggests:" dependency in your DESCRIPTION file.

(3) Finally, the secret-sauce for specifying your within-package path:

barfile <- system.file("extdata", "bar.csv", package="mypackage")
bar <- read.csv(barfile)
# remainder of example/test code here...

If you look at the output of the system.file() command, it is returning the full system path to your package within the R framework. On Mac OS X this looks something like:

"/Library/Frameworks/R.framework/Versions/2.15/Resources/library/mypackage/extdata/bar.csv"

The reason this seems okay to me is that you don't hard code any path features other than those within your package, so this approach should be robust relative to other R installations on other systems.
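
One thing worth guarding against: system.file() returns an empty string when the file or package cannot be found, so a typo in the path only shows up later as a confusing read error. A minimal sketch of a wrapper that fails early is below. The name pkg_file is hypothetical and not part of the original answer, and the stats package is used only because it ships with every R installation.

```r
# Hypothetical helper (not from the original answer): resolve a file that
# ships with an installed package and fail loudly if it cannot be found.
pkg_file <- function(..., package) {
  path <- system.file(..., package = package)
  # system.file() returns "" when nothing matches, so check for that explicitly
  if (!nzchar(path)) {
    stop("File not found in package '", package, "': ",
         file.path(...), call. = FALSE)
  }
  path
}

# 'stats' ships with R, so its DESCRIPTION file always resolves
desc_path <- pkg_file("DESCRIPTION", package = "stats")
file.exists(desc_path)

# For the package in this question the call would look like this
# (assumes 'mypackage' is installed):
# bar <- read.csv(pkg_file("extdata", "bar.csv", package = "mypackage"))
```

The same call can be used unchanged in both the @examples section and the testthat tests, which removes the need for \dontrun{} and for machine-specific paths.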

As for the data() semantics, as far as I can tell this is specific to R binary (.RData) files in the top-level data directory. So you can circumvent my example above by pre-importing the data files and saving them with the save() command into your data-directory. However, this assumes you only need to show an example in which the data is already loaded into R, as opposed to also reproducibly demonstrating the upstream process of importing the files.
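
A rough sketch of that alternative is below, under some loud assumptions: a temporary directory stands in for the package source tree, and the contents written to foo.txt are made up, since the question does not show the real columns. The idea is to import the text file once and save the resulting object as an .RData file in data/.

```r
# Stand-in for the package source directory (a temp dir so the sketch runs anywhere)
pkg_dir <- file.path(tempdir(), "mypackage")
dir.create(file.path(pkg_dir, "inst", "extdata"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Made-up contents for foo.txt; the real file's columns are not given in the question
foo_txt <- file.path(pkg_dir, "inst", "extdata", "foo.txt")
writeLines(c("# sample foo data", "col\tother", "1\ta", "2\tb"), foo_txt)

# Import once, then store the object as an R binary file in data/
foo <- read.delim(foo_txt, sep = "\t", fill = TRUE, comment.char = "#")
save(foo, file = file.path(pkg_dir, "data", "foo.RData"))

# Check the round trip by loading into a fresh environment
env <- new.env()
load(file.path(pkg_dir, "data", "foo.RData"), envir = env)
identical(env$foo, foo)
```

In a real package the save() call would be run once from the package root, writing to data/foo.RData, and the raw text file could stay in inst/extdata for examples that demonstrate the import step itself.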
