How to use Makefiles with R CMD build

Question

I am developing an R package. It is based on a project that only used a Makefile. Most of it translated easily to the R CMD build workflow. However, the PDFs I need to create are a bit complex, and I don't get them right unless I tinker. So far I have only figured out how to do it with a Makefile.

In the R package documentation I find references to using Makefiles for sources and even for vignettes.

I don't grasp how these should be applied. From the documentation I had the impression that Makefiles would be called during R CMD build, but when I put a Makefile in the described directories it is just ignored. However, R CMD check recognises them and outputs passing tests.

I have also seen some Makefiles that call R CMD build inside, but I keep wondering how these would execute when I use install.packages. That doesn't seem right: why would R CMD check look at these files if it didn't care about them? And there's also this page in R packages about adding SystemRequirements: GNU make. Why do this for a file you don't use?

So what is the best practice nowadays? And are there examples in the wild that I can look at?

Update

As I was asked for an example:

I want to build a vignette similar to the one described in "Writing package vignettes". There is a master LaTeX file which includes several Rnw files. The concrete dilemmas are:

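  1. How do I build the PDF vignette?
  2. How do I enforce the dependencies? Obviously the Rnw files need to be rendered first.
  3. The Rnw files need slowly computed data that is intended neither for the package nor for the repository (it runs to several gigabytes), but it is reused many times during the build.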

So far I do it with a Makefile. The general pattern is like this:

tmp/test.pdf: tmp/test.tex tmp/rnw1.tex tmp/rnw2.tex
    latexmk -outdir=$(@D) $<

tmp/%.tex: r/%.rnw
    Rscript -e "knitr::knit('$<', output='$@')"

tmp/rnw1.tex tmp/rnw2.tex: tmp/slowdata.Rdata

tmp/slowdata.Rdata: r/ireallytakeforever.R
    Rscript $<

Recommended answer

Bdecaf,

Ok, answer version 2.0 - chuckle.

You mentioned that "The question is how Makefiles and the package build workflow are supposed to go together". In that context, my recommendation is you review a set of example R package makefiles:

  • Makefile for Yihui Xie's knitr package for R.
  • Makefile for my R/qtlcharts package.

The knitr package makefile (in my view) provides a good example of how to build vignettes. You need to review the makefile and the directory structure; that is the template I would recommend you review and use.
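To give a feel for what a package-level Makefile of this kind tends to contain, here is a minimal sketch. It is not a copy of the knitr Makefile: the target names and the sed-based lookup of the package name and version are assumptions chosen for illustration. It is meant to sit in the package root as a developer convenience, and as far as I know R does not run a top-level Makefile when the package is installed.

```make
# Sketch of a developer-convenience Makefile placed in the package root.
# Assumes DESCRIPTION has single-line "Package:" and "Version:" fields.
PKGNAME := $(shell sed -n 's/^Package: *//p' DESCRIPTION)
PKGVERS := $(shell sed -n 's/^Version: *//p' DESCRIPTION)
TARBALL := $(PKGNAME)_$(PKGVERS).tar.gz

.PHONY: build check install

# Build the source tarball in the current directory
build:
	R CMD build .

# Check the tarball that was just built
check: build
	R CMD check $(TARBALL)

# Install the tarball that was just built
install: build
	R CMD INSTALL $(TARBALL)
```

With this in place, typing make check would build the tarball and then check it. The point of the design is that the Makefile drives R CMD build from the outside; it does not change what R CMD build itself does.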

I'd also recommend you look at maker, a Makefile for R package development. On top of this, I would start with Karl Broman's guides (this is what I used myself as a source reference a while back; they are now eclipsed by Hadley's book on packages but still useful, in my view):

  • minimal make: a minimal tutorial on make
  • R package primer

The other recommendation is to read Rob Hyndman's article, which I referenced previously.

Between them, you should be able to do what you request. Above and beyond that, you have the base R package manual you referenced.

I hope the above helps.

T.

I would argue that the most important tool for reproducible research is not Sweave or knitr but GNU make.

Consider, for example, all of the files associated with a manuscript. In the simplest case, I would have an R script for each figure plus a LaTeX file for the main text. And then a BibTeX file for the references.

Compiling the final PDF is a bit of work:

  • Run each R script through R to produce the relevant figure.
  • Run latex, then bibtex, then latex two more times.

And the R scripts need to be run before latex is, and only if they’ve changed.

GNU make makes this easy. In your directory for the manuscript, you create a text file called Makefile that looks something like the following (here using pdflatex).

mypaper.pdf: mypaper.bib mypaper.tex Figs/fig1.pdf Figs/fig2.pdf
    pdflatex mypaper
    bibtex mypaper
    pdflatex mypaper
    pdflatex mypaper

Figs/fig1.pdf: R/fig1.R
    cd R;R CMD BATCH fig1.R

Figs/fig2.pdf: R/fig2.R
    cd R;R CMD BATCH fig2.R

Each batch of lines indicates a file to be created (the target), the files it depends on (the prerequisites), and then a set of commands needed to construct the target from the dependent files. Note that the lines with the commands must start with a tab character (not spaces).

Another great feature: in the example above, you’d only build fig1.pdf when fig1.R changed. And note that the dependencies propagate. If you change fig1.R, then fig1.pdf will change, and so mypaper.pdf will be re-built.

One oddity: if you need to change directories to run a command, do the cd on the same line as the related command. The following would not work:

### this doesn't work ###
Figs/fig1.pdf: R/fig1.R
    cd R
    R CMD BATCH fig1.R

You can, however, use \ for a continuation line, like so:

### this works ###
Figs/fig1.pdf: R/fig1.R
    cd R;\
    R CMD BATCH fig1.R

Note that you still need to use the semicolon (;).
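As a side note on that semicolon, and not something taken from the text above: with ; the shell goes on to run R CMD BATCH even if the cd fails, for instance because the directory was renamed. A common variation is to join the two commands with && so that the recipe stops at the failed cd. A minimal sketch of the same rule written that way:

```make
# Each recipe line runs in its own shell, so cd must share a line
# (or a backslash-continued line) with the command that depends on it.
# Using && instead of ; stops the recipe if the cd fails.
Figs/fig1.pdf: R/fig1.R
	cd R && \
	R CMD BATCH fig1.R
```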

You probably already have GNU make installed on your computer. Type make --version in a terminal/shell to see. (On Windows, go here to download make.)

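To use make:

  • Go into the directory for your project.
  • Create the Makefile file.
  • Every time you want to build the project, type make.
  • In the example above, if you want to build fig1.pdf without building mypaper.pdf, just type make Figs/fig1.pdf (the target name includes the directory).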

You can go a long way with just simple make files as above, specifying the target files, their dependencies, and the commands to create them. But there are a lot of frills you can add, to save some typing.

Here are some of the options that I use. (See the make documentation for further details.)

If you’ll be repeating the same piece of code multiple times, you might want to define a variable.

For example, you might want to run R with the flag --vanilla. You could then define a variable R_OPTS:

R_OPTS=--vanilla

You refer to this variable as $(R_OPTS) (or ${R_OPTS}; either parentheses or curly braces are allowed), so in the R commands you would use something like

cd R;R CMD BATCH $(R_OPTS) fig1.R

An advantage of this is that you just need to type out the options you want once; if you change your mind about the R options you want to use, you just have to change them in the one place.

For example, I actually like to use the following:

R_OPTS=--no-save --no-restore --no-init-file --no-site-file

This is like --vanilla but without --no-environ (which I need because I use the .Renviron file to define R_LIBS, to say that I have R packages defined in an alternative directory).
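One further convenience of keeping the options in a variable, sketched here as an aside: a variable given on the make command line takes precedence over an ordinary assignment inside the Makefile, so the options can be swapped for a single run without editing the file. The rule below simply reuses the fig1 example.

```make
# Default options used by every R call in this Makefile.
# For a one-off run with different options, override on the command line:
#   make R_OPTS=--vanilla
R_OPTS = --no-save --no-restore --no-init-file --no-site-file

Figs/fig1.pdf: R/fig1.R
	cd R;R CMD BATCH $(R_OPTS) fig1.R
```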

There are a bunch of automatic variables that you can use to save yourself a lot of typing. Here are the ones that I use most:

$@    the file name of the target
$<    the name of the first prerequisite (i.e., dependency)
$^    the names of all prerequisites (i.e., dependencies)
$(@D)    the directory part of the target
$(@F)    the file part of the target
$(<D)    the directory part of the first prerequisite (i.e., dependency)
$(<F)    the file part of the first prerequisite (i.e., dependency)
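To see these expansions for yourself, a throwaway rule like the following can help. The file names are made up for illustration, and the recipe only prints the values; it does not create the target.

```make
# Hypothetical rule that only prints the automatic variables.
# Expected expansions for this rule:
#   $@    -> Figs/demo.txt
#   $<    -> R/demo.R
#   $^    -> R/demo.R R/helper.R
#   $(@D) -> Figs
#   $(@F) -> demo.txt
#   $(<D) -> R
#   $(<F) -> demo.R
Figs/demo.txt: R/demo.R R/helper.R
	@echo "target:                  $@"
	@echo "first prerequisite:      $<"
	@echo "all prerequisites:       $^"
	@echo "target dir / file:       $(@D) / $(@F)"
	@echo "first prereq dir / file: $(<D) / $(<F)"
```

Both prerequisite files need to exist for make to reach the recipe; run it with make Figs/demo.txt.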

For example, in our simple example, we could simplify the lines

Figs/fig1.pdf: R/fig1.R
    cd R;R CMD BATCH fig1.R

We could instead write

Figs/fig1.pdf: R/fig1.R
    cd $(<D);R CMD BATCH $(<F)

The automatic variable $(<D) will take the value of the directory of the first prerequisite, R in this case. $(<F) will take the value of the file part of the first prerequisite, fig1.R in this case.

Okay, that’s not really a simplification. There doesn’t seem to be much advantage to this, unless perhaps the directory were an obnoxiously long string and we wanted to avoid having to type it twice. The main advantage comes in the next section.

If a number of files are to be built in the same way, you may want to use a pattern rule. The key idea is that you can use the symbol % as a wildcard, to be expanded to any string of text.

For example, our two figures are being built in basically the same way. We could simplify the example by including one set of lines covering both fig1.pdf and fig2.pdf:

Figs/%.pdf: R/%.R
    cd $(<D);R CMD BATCH $(<F)

This saves typing and makes the file easier to maintain and extend. If you want to add a third figure, you just add it as another dependency (i.e., prerequisite) for mypaper.pdf.
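To make the "add a third figure" step concrete, here is a hedged sketch. The third figure (R/fig3.R producing Figs/fig3.pdf) and the FIGS variable name are hypothetical; collecting the figures in a variable is just one way to keep the list in a single place.

```make
# FIGS is a made-up variable name; list every figure the paper needs here.
FIGS = Figs/fig1.pdf Figs/fig2.pdf Figs/fig3.pdf

mypaper.pdf: mypaper.bib mypaper.tex $(FIGS)
	pdflatex mypaper
	bibtex mypaper
	pdflatex mypaper
	pdflatex mypaper

# One pattern rule covers all three figures
Figs/%.pdf: R/%.R
	cd $(<D);R CMD BATCH $(<F)
```

Adding a fourth figure would then mean writing R/fig4.R and appending Figs/fig4.pdf to FIGS; no new rule is needed.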

Our example, with the frills

Adding all of this together, here’s what our example Makefile will look like.

R_OPTS=--vanilla

mypaper.pdf: mypaper.bib mypaper.tex Figs/fig1.pdf Figs/fig2.pdf
    pdflatex mypaper
    bibtex mypaper
    pdflatex mypaper
    pdflatex mypaper

Figs/%.pdf: R/%.R
    cd $(<D);R CMD BATCH $(R_OPTS) $(<F)

The advantage of the added frills: less typing, and it’s easier to extend to include additional figures. The disadvantage: it’s harder for others who are less familiar with GNU Make to understand what it’s doing.
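One more frill that often appears in Makefiles like this, offered as an optional sketch rather than as part of the example above: phony targets named all and clean. The list of files to delete is an assumption about what pdflatex, bibtex and R CMD BATCH leave behind, and it presumes that everything in Figs/ is generated; adjust it to your own project.

```make
.PHONY: all clean

# Place this before the other rules so that a plain `make` builds the paper
all: mypaper.pdf

# Remove generated files (assumed names; check them against your project)
clean:
	rm -f mypaper.pdf mypaper.aux mypaper.bbl mypaper.blg mypaper.log
	rm -f Figs/*.pdf R/*.Rout
```

Declaring the targets in .PHONY tells make that all and clean are not files, so they still work if a file with one of those names happens to exist.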

More complicated examples

There are complicated Makefiles all over the place. Poke around github and study them.

Here are some of my own examples:

  • Makefile for my AIL probabilities paper
  • Makefile for my phylo QTL paper
  • Makefile for my pre-CC probabilities paper
  • Makefile for a talk on interactive graphs.
  • Makefile for a talk on QTL mapping for function-valued traits.
  • Makefile for my R/qtlcharts package.

And here are some examples from Mike Bostock:

  • Makefile for us-rivers
  • Makefile for protovis
  • Makefile for topotree

Also look at the Makefile for Yihui Xie’s knitr package for R.

Also of interest is maker, a Makefile for R package development.

  • GNU make webpage
  • Official manual
  • O’Reilly Managing projects with GNU make book (part of the Open Books project)
  • Software carpentry’s make tutorial
  • Mike Bostock’s "Why Use Make"
  • GNU Make for reproducible data analysis by Zachary Jones
  • Makefiles for R/LaTeX projects by Rob Hyndman

R packages are the best way to distribute R code and documentation, and, despite the impression that the official manual (Writing R Extensions) might give, they really are quite simple to create.

You should make an R package even for code that you don't plan to distribute. You'll find it is easier to keep track of your own personal R functions if they are in a package. And it's good to write documentation, even if it's just for your future self.

Hadley Wickham wrote a book about R packages (free online; also available in paper form from Amazon). You might just jump straight there.

Hilary Parker wrote a short and clear tutorial on writing R packages. If you want a crash course, you should start there. A lot of people have successfully built R packages from her instructions.

But there is value in having a diversity of resources, so I thought I'd go ahead and write my own minimal tutorial. The following list of topics looks forbidding, but each is short and straightforward (and hopefully clear). If you're put off by the list of topics, and you've not already abandoned me in favor of Hadley's book, then why aren't you reading Hilary's tutorial?

If anyone's still with me, the following pages cover the essentials of making an R package.

  • Why write an R package?
  • The minimal R package
  • Building and installing an R package
  • Making it a proper package
  • Writing documentation with Roxygen2
  • Software licenses
  • Checking an R package

The following are important but not essential.

  • Putting it on GitHub
  • Getting it on CRAN
  • Writing vignettes
  • Writing tests
  • Including datasets
  • Connecting to other packages

The following contains links to other resources:

If anything here is confusing (or wrong!), or if I've missed important details, please submit an issue, or (even better) fork the GitHub repository for this website, make modifications, and submit a pull request.

The source for this tutorial is on github.

Also see my tutorials on git/github, GNU make, knitr, making a web site with GitHub Pages, data organization, and reproducible research.
