How to best generate multiple HTML files from RMarkdown based on one dataset?

Question

I have an RMarkdown report that is very useful and has grown to be several pages long with all the figures and tables in the HTML file.

It uses the same dataset for all the figures and tables.

What I would like to do is to keep generating this large html file and then several new subdirectories, each with their own html files and subdirectories within those, each with their own html files.

In this case, the full report contains data on a department, then each subdirectory would contain an html output related to each group within the department, and each of those would contain a subdirectory with html output for each person in each group. This way if someone is only interested in the metrics of one group, or one person, they look at the most appropriate output.

Parent dir: The same large html file with figures and tables generated with data for entire dept.
|
 __Subdir for each group: Output based on same data but only the group's metrics
    |
     __Subdir for each person: Output based on same data but only individual's metrics
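
One way to pin down that layout before rendering anything is to compute the output path for every level from the data itself. This is only a sketch: the `dept` data frame, the `reports` root and the file names are made up for illustration and are not from the original post.

```r
# Hypothetical data: one row per person, with the group they belong to.
dept <- data.frame(
  group  = c("alpha", "alpha", "beta"),
  person = c("ann", "bob", "cy"),
  stringsAsFactors = FALSE
)

# Build the output path for each level of the hierarchy.
# `root` and the file naming scheme are assumptions.
report_paths <- function(dept, root = "reports") {
  groups <- unique(dept$group)
  list(
    department = file.path(root, "department.html"),
    groups     = setNames(file.path(root, groups, paste0(groups, ".html")), groups),
    people     = setNames(
      file.path(root, dept$group, dept$person, paste0(dept$person, ".html")),
      dept$person
    )
  )
}

# Create every directory that a report will be written into.
make_dirs <- function(paths) {
  dirs <- unique(dirname(unlist(paths, use.names = FALSE)))
  for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)
  invisible(dirs)
}

paths <- report_paths(dept)
```

Calling `make_dirs(paths)` once up front means the rendering step never has to worry about whether a group or person directory exists yet.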

What's the best way to arrange this?
1. Is there a code chunk option in RMarkdown where I can say, chunk a goes in this html output file, chunk b goes in another?
2. Do I need multiple RMarkdown files, one for each html output, with some sort of caching between them so I don't have to reprocess all the data? (this would seem silly because I need a lot of html files)
3. Should I give up RMarkdown for this task?

Recommended answer

I do something like you're proposing with knitr, and it works very well.

Don't tell anyone, but I use a 'for' loop to cycle through a bunch of councils, each of whom get the same report but with their data. I then push the report into a directory structure, zip it and mail it.

I have an Rmd file that expects two datasets, setA (being the subject) and setB (being its peers)

The process is something like:

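library(dplyr)    # filter()
library(stringr)  # str_c()
library(knitr)    # knit()
library(markdown) # markdownToHTML()

set <- assemble_data() # loads whole set
for (report in report_list) {
    setA <- filter(set, subject == report) # the subject of this report
    setB <- filter(set, subject != report) # its peers
    output_html <- str_c('path/', report, '.html')
    knit_interim <- str_c('path/', report, '.md')
    knit_pattern <- 'name of RMd' # I generate more than one report for each place
    knit(knit_pattern, output = knit_interim) # the Rmd picks up setA and setB from the calling environment
    markdownToHTML(file = knit_interim, output = output_html, stylesheet = stylesheet, encoding = 'windows-1252')
}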
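
For the department / group / person layout in the question, the same loop can be reused once per level. The following is a rough sketch, not the code I actually run: `render_level()`, `fake_render()` and the sample data are hypothetical names, and `fake_render()` is a stand-in for the knit-then-convert step so the example runs without knitr installed.

```r
# Render one report per distinct value of `column`, each into its own subdirectory.
# `render_fun` receives the subject rows, the peer rows and the output path.
render_level <- function(set, column, out_dir, render_fun) {
  written <- character(0)
  for (value in unique(set[[column]])) {
    setA <- set[set[[column]] == value, , drop = FALSE] # the subject
    setB <- set[set[[column]] != value, , drop = FALSE] # its peers
    target_dir <- file.path(out_dir, value)
    dir.create(target_dir, recursive = TRUE, showWarnings = FALSE)
    output_html <- file.path(target_dir, paste0(value, ".html"))
    render_fun(setA, setB, output_html)
    written <- c(written, output_html)
  }
  written
}

# Stand-in renderer (assumption): writes a one-line HTML file instead of knitting.
fake_render <- function(setA, setB, output_html) {
  writeLines(
    sprintf("<p>%d subject rows, %d peer rows</p>", nrow(setA), nrow(setB)),
    output_html
  )
}

# Hypothetical sample data.
set <- data.frame(
  group  = c("alpha", "alpha", "beta"),
  person = c("ann", "bob", "cy"),
  score  = c(3, 5, 4),
  stringsAsFactors = FALSE
)

root <- file.path(tempdir(), "reports")

# One report per group, then one per person inside each group's directory.
group_files <- render_level(set, "group", root, fake_render)
person_files <- character(0)
for (g in unique(set$group)) {
  in_group <- set[set$group == g, , drop = FALSE]
  person_files <- c(
    person_files,
    render_level(in_group, "person", file.path(root, g), fake_render)
  )
}
```

In this sketch a person's peers are the other members of the same group; pass the whole `set` to the inner call instead if peers should be everyone in the department.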

In this way I can produce a report set in a few minutes. My case may be simpler than yours, because the report structure is the same - it's the datasets that change.

Note that this is not a paste of the code (it is slightly more complicated than this) so beware typos etc.

The point (as I understand it) is to write an Rmd that expects a dataset of a particular name, and the R code provides local scope for it. I struggled initially with it but it's all quite simple in its execution.
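
If relying on whatever happens to be in the calling environment feels fragile, the scope can be made explicit. `knitr::knit()` has an `envir` argument that controls where chunks are evaluated, so a sketch along these lines should work; `make_report_env()` and `knit_with_data()` are hypothetical helper names, and I have not run this against my own templates.

```r
# Put the two data sets into a fresh environment under the names the template expects.
make_report_env <- function(setA, setB) {
  env <- new.env(parent = globalenv())
  assign("setA", setA, envir = env)
  assign("setB", setB, envir = env)
  env
}

# Knit a template so that its chunks are evaluated in that environment.
# Requires the knitr package when called.
knit_with_data <- function(template, output, setA, setB) {
  knitr::knit(
    template,
    output = output,
    envir = make_report_env(setA, setB),
    quiet = TRUE
  )
}

# A chunk body such as nrow(setA) is resolved by name inside the environment:
env <- make_report_env(data.frame(x = 1:3), data.frame(x = 4:8))
subject_rows <- eval(quote(nrow(setA)), envir = env)
peer_rows <- eval(quote(nrow(setB)), envir = env)
```

The template itself does not change: it still just refers to `setA` and `setB`.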

[Update: 'How do you pass the data to the Rmd files?']

You don't explicitly need to. In my code above the RMd is written expecting data in setA and setB.

It makes the workflow really easy - you write the template using a dataset (manually filter for one) and then when you're ready you can just run the loop. Like I said, I struggled a bit to understand at first but just jumped in and it all worked out quite nicely.
