Make R package easy to update with new files from users

Problem description

First let me explain that I come from the Python world, where I can do what I want like this in the shell:

$ export PYTHONPATH=~/myroot
$ mkdir -p ~/myroot/mypkg
$ touch ~/myroot/mypkg/__init__.py # this is the one bit of "magic" for Python
$ echo 'hello = "world"' > ~/myroot/mypkg/mymodule.py

Then in Python:

>>> import mypkg.mymodule
>>> mypkg.mymodule.hello
'world'

What I did there was to create a package which is easily extended by other users. I can check in ~/myroot/mypkg to source control and other users can later add modules to it using just a text editor. Now I want to do the equivalent thing in R. Here's what I have so far:

$ export R_LIBS=~/myR # already this is bad: it makes install.packages() put things here!
$ mkdir -p ~/myR
$ echo 'hello = "world"' > /tmp/mycode.R

Now in R:

> package.skeleton(name="mypkg", code_files="/tmp/mycode.R")

Now back to the shell:

$ R CMD build mypkg
$ R CMD INSTALL mypkg

Now back to R:

> library(mypkg)
> hello
[1] "world"

So that works. But how do my colleagues add new modules to this package? I want them to be able to just go in and add a new R file, but it seems we would then have to redo the entire package build, which is tedious, and we would end up checking in many generated files. Most importantly, where did the code go? Once I ran R CMD INSTALL, R knew what my code was, but it did not put the literal text (from mycode.R) anywhere under $R_LIBS (did it "compile" the code? I'm not sure).
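On the "where did the code go?" point: as far as I know, R CMD INSTALL parses the files under R/ and stores the resulting objects in a lazy-load database instead of keeping the .R text. Here is a minimal sketch that looks inside an installed package. It uses the bundled stats package as a stand-in for mypkg (an assumption, so that the snippet runs without building anything); the variable names are only illustrative.

```r
# Look inside the R/ directory of an installed package.
# "stats" stands in for mypkg here (assumption); other installed
# packages should look similar.
r_dir <- system.file("R", package = "stats")
installed_files <- list.files(r_dir)
print(installed_files)

# The objects live in a lazy-load database (.rdb/.rdx files),
# not in .R source files.
has_lazyload_db <- all(c("stats.rdb", "stats.rdx") %in% installed_files)
has_r_sources <- any(grepl("\\.[rR]$", installed_files))
```

If has_lazyload_db is TRUE and has_r_sources is FALSE, the installed copy holds parsed objects only, which would explain why the text of mycode.R is not found under $R_LIBS.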

Previously we would just source() our "modules" but this is not very good because it reloads the code every time, so indirect (transitive) dependencies end up reloading stuff that is already loaded.
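To make the reloading problem concrete, here is a small sketch. The temporary file and the counter name load_count are made up for illustration; the two source() calls stand for the same file being reached through two different dependency chains.

```r
# Write a throwaway "module" that counts how many times it is evaluated.
module_path <- tempfile(fileext = ".R")
writeLines("load_count <- load_count + 1", module_path)

# Evaluate the module in its own environment so the demo does not
# touch the global workspace.
env <- new.env()
env$load_count <- 0

source(module_path, local = env)  # pulled in directly
source(module_path, local = env)  # pulled in again via another module

# Each source() call re-runs the whole file.
print(env$load_count)
```

The counter ends at 2 because source() keeps no record of what it has already evaluated.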

My question is, how do people manage simple, in-house, collaboratively edited, source-controlled, non-binary, non-compiled, shared code in R?

I'm using R 3.1.1 on Linux. If the solution works on Windows too that would be nice.

Solution

It seems R has nothing like Python's import statement, so I made my own. Just put this in a file like import.r and source it via your $R_PROFILE.

# this is sort of like Python's import statement, and lets us avoid redundant sourcing

.imports <- c("import") # module names imported so far, to avoid redundant imports (never import ourselves)

.importScriptPath <- function() {
  # returns the path of the executing script
  # see http://stackoverflow.com/questions/1815606

  # this will only work if the caller was loaded with source()
  filePath <- sys.frame(2)$ofile

  # if the caller was not loaded with source(), use the main script path
  if (length(filePath) == 0) {
    argv <- commandArgs(trailingOnly = FALSE)
    filePath <- substring(argv[grep("--file=", argv)], 8)
  }

  return (dirname(filePath))
}

import <- function(module) {
  # locates the given module (character or token), calls source() on it, and does nothing on subsequent calls

  module <- as.character(substitute(module)) # support import(foo) not only import("foo")

  if (module %in% .imports) {
    return(invisible())
  }

  moduleFilename <- paste0(gsub("\\.", "/", module), ".r") # allow import(foo.bar) as import("foo/bar")
  importPaths <- c(.importScriptPath()) # add more search paths here as desired

  for (importPath in importPaths) {
    modulePath <- file.path(importPath, moduleFilename)
    if (file.exists(modulePath)) {
      source(modulePath)
      .imports <<- append(.imports, module) # <<- updates the global variable so we skip it next time
      return(invisible())
    }
  }

  # last chance: try to load module as a standard library
  suppressPackageStartupMessages(library(module, character.only = TRUE))
  .imports <<- append(.imports, module)
  return(invisible())
}
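
The following is a simplified, self-contained sketch of the same idea, not the code above. It assumes the search directory is passed in explicitly instead of being detected from the running script, and it marks a module as loaded before sourcing it (the code above marks it afterwards). The names import_once, registry, demo_dir and module_env are hypothetical.

```r
# Registry of module names that have already been sourced.
registry <- new.env()
registry$loaded <- character(0)

import_once <- function(module, search_dir, envir) {
  # Skip modules that were already loaded.
  if (module %in% registry$loaded) {
    return(invisible(FALSE))
  }
  module_path <- file.path(search_dir, paste0(module, ".r"))
  if (!file.exists(module_path)) {
    stop("module not found: ", module)
  }
  # Mark before sourcing so circular imports terminate.
  registry$loaded <- c(registry$loaded, module)
  source(module_path, local = envir)
  invisible(TRUE)
}

# Demo: modules a and b both depend on util.
demo_dir <- tempfile("modules")
dir.create(demo_dir)
writeLines("util_loads <- util_loads + 1", file.path(demo_dir, "util.r"))
writeLines('import_once("util", demo_dir, module_env)', file.path(demo_dir, "a.r"))
writeLines('import_once("util", demo_dir, module_env)', file.path(demo_dir, "b.r"))

# Modules are evaluated here; the parent lets them see import_once and demo_dir.
module_env <- new.env(parent = environment())
module_env$util_loads <- 0

import_once("a", demo_dir, module_env)
import_once("b", demo_dir, module_env)
print(module_env$util_loads)
```

Here util.r is evaluated once even though both a.r and b.r ask for it, which is the behaviour a plain source() call does not give.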
