How do I use setwd in a relative way?

Question

Our team uses R scripts in git repos that are shared between several people, across Mac and Windows (and occasionally Linux) machines. This tends to leave a bunch of really annoying lines at the top of our scripts that look like this:

#path <- 'C:/data-work/project-a/data'
#path <- 'D:/my-stuff/project-a/data'
path = "~/projects/project-a/data"
#path = 'N:/work-projects/project-a/data'
#path <- "/work/project-a/data"
setwd(path)

To run a script, we have to comment or uncomment the correct path variable, or the script won't run. This is annoying and untidy, and it tends to make a mess of the commit history too.

In the past I've got round this by using shell scripts to set directories relative to the script's location and skipping setwd entirely (and then running ./run-scripts.sh instead of Rscript process.R), but as we have Windows users here, that won't work. Is there a better way to simplify this messy setwd() boilerplate in R?

(Side note: in Python, I solve this by using the path library to get the location of the script file itself and then building relative paths from that. But R doesn't seem to have a way to get the location of the running script's file?)

Recommended answer

The answer is to not use setwd() at all, ever. R does things a bit differently from Python, for sure, but this is one thing they have in common.

Instead, any scripts you're executing should assume they're being run from a common, top-level root folder. When you launch a new R process, its working directory (i.e., what getwd() returns) is the folder the process was launched from.

As an example, if you had this layout:

.
├── data
│   └── mydata.csv
└── scripts
    └── analysis.R

You would run analysis.R from the top-level folder (.), and analysis.R would reference data/mydata.csv as "data/mydata.csv" (e.g., read.csv("data/mydata.csv", stringsAsFactors = FALSE)). I would keep your shell scripts or Makefiles that run your R scripts and have the R scripts assume they're being run from the top level of the git repo.
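As a minimal sketch of what such a script could contain, assuming the layout above: the helper name read_project_csv is hypothetical and not part of the original answer. It reads a file by a path relative to the working directory and fails with a readable message if the script was started from the wrong folder.

```r
# Hypothetical helper (not from the original answer): read a CSV by a path
# relative to the working directory, which is assumed to be the repo root.
read_project_csv <- function(rel_path) {
  if (!file.exists(rel_path)) {
    # Fail early with a hint instead of a bare "cannot open file" error.
    stop("Cannot find '", rel_path, "' from '", getwd(),
         "'; run this script from the top level of the repository.")
  }
  read.csv(rel_path, stringsAsFactors = FALSE)
}

# file.path() joins components with "/", which R accepts on Windows,
# macOS, and Linux alike.
data_file <- file.path("data", "mydata.csv")
```

Inside scripts/analysis.R, the data would then be loaded with mydata <- read_project_csv(data_file), with no setwd() call and no machine-specific path anywhere in the script.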

This might look like:

cd . # Wherever `.` above is
Rscript scripts/analysis.R

Further reading:

  • https://www.tidyverse.org/articles/2017/12/workflow-vs-script/
  • https://github.com/jennybc/here_here
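The second link discusses the here package, which builds paths from the project root rather than from the current working directory. The following is only a sketch: it assumes the package may or may not be installed, and the wrapper name project_path is hypothetical. It falls back to the working directory when here is unavailable.

```r
# Hypothetical wrapper: prefer here::here() when the `here` package is
# installed; otherwise fall back to a path under the working directory.
project_path <- function(...) {
  if (requireNamespace("here", quietly = TRUE)) {
    # here::here() resolves paths against the detected project root.
    here::here(...)
  } else {
    file.path(getwd(), ...)
  }
}

data_file <- project_path("data", "mydata.csv")
```

Either way, scripts refer to files by their position within the project, so nothing needs to be commented or uncommented per machine.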
