How to produce a markdown document for each row of a dataframe in R


Problem description

I would like to produce either one markdown document with a subdocument for each row of a dataframe, or one markdown document per row (nrow documents in total). The markdown document is template.Rmd.

I thought a loop over the rows should work, but when I try by(dataFrame, 1:nrow(dataFrame), function(row) knit(file = "/Users/path/template.Rmd")) I get the following error:

Quitting from lines 23-26 (Preview-e0d353674d36.Rmd) 
Error in knit(file = "/Users/path/template.Rmd") : 
  unused argument (file = "/Users/path/template.Rmd")
Calls: <Anonymous> ... eval -> eval -> tapply -> lapply -> FUN -> FUN -> knit

Execution halted
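
A note on the trace above: the unused argument message is consistent with the fact that knitr's knit() names its first parameter input, not file. The following is a minimal sketch, not the asker's actual setup: the temporary directory, the inline stand-in for template.Rmd, and the variable name this_row are all assumptions made for illustration. It knits the same template once per row and writes one markdown file per row.

```r
library(knitr)

# Hypothetical stand-in for template.Rmd, written to a temporary directory.
out_dir <- tempfile("reports-")
dir.create(out_dir)
template <- file.path(out_dir, "template.Rmd")
writeLines(c(
  "## Report for `r rownames(this_row)`",
  "",
  "```{r}",
  "this_row[, c('mpg', 'cyl', 'gear')]",
  "```"
), template)

df <- mtcars[1:3, ]
outputs <- character(nrow(df))

for (i in seq_len(nrow(df))) {
  # Each run gets its own environment holding only the current row.
  env <- new.env()
  env$this_row <- df[i, , drop = FALSE]
  outputs[i] <- knit(
    input  = template,  # the parameter is `input`, not `file`
    output = file.path(out_dir, sprintf("row-%02d.md", i)),
    envir  = env,
    quiet  = TRUE
  )
}
```

Passing a fresh environment per iteration keeps one row's variables from leaking into the next report.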

I tried using the same awesome approach solved by @Yihui to programmatically reference text with knitr-expand detailed here: R knitr: Possible to programmatically modify chunk labels?

From that solution, we have two .Rmd files, My report and Template. My report looks like:

# My report

```{r}
data(mtcars)
cyl.levels <- unique(mtcars$cyl)
```

## Generate report for each level of cylinder variable
```{r, include=FALSE}
src <- lapply(cyl.levels, function(ncyl) knit_expand(file = "template.Rmd"))
```

`r knit(text = unlist(src))`

Template looks like:

```{r, results='asis'}
cat("### {{ncyl}} cylinders")
```

```{r mpg-histogram-{{ncyl}}cyl}
hist(mtcars$mpg[mtcars$cyl == {{ncyl}}], 
  main = paste({{ncyl}}, "cylinders"))
```

```{r weight-histogam-{{ncyl}}cyl}
hist(mtcars$wt[mtcars$cyl == {{ncyl}}], 
  main = paste({{ncyl}}, "cylinders"))
```

This solution produces a single markdown document with a subdocument (at heading level 2) for each level of cylinder. However, I am trying to create a report that fetches a .csv and then creates and modifies a dataframe and produces content for each row of another dataframe.

What I think I am stuck on is how to use the value in {{ncyl}} to programmatically refer to rows of a database. I would like to be able to use the levels of {{ncyl}} to go and do stuff with the related rows in the dataframe mtcars (assuming that it only had rows == levels{{ncyl}} for this example).
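
One way to tie the placeholder to rows rather than to factor levels is to expand the template once per row index. This is only a sketch under stated assumptions: the inline template text, the placeholder name i, and the chunk label pattern are invented for illustration and are not part of the original template.Rmd.

```r
library(knitr)

# Hypothetical per-row template; {{i}} is a row index into mtcars.
row_template <- c(
  "### Row {{i}}: {{rownames(mtcars)[i]}}",
  "",
  "```{r row-{{i}}-summary}",
  "mtcars[{{i}}, c('mpg', 'cyl', 'gear')]",
  "```"
)

# Expand the template once per row. Each element of `src` is the
# R Markdown source of one subdocument, with a unique chunk label.
src <- vapply(
  seq_len(nrow(mtcars)),
  function(i) knit_expand(text = row_template, i = i),
  character(1)
)
```

The resulting src can then be knitted from the parent document in the same way as in My report, that is, with an inline knit(text = unlist(src)).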

While mtcars does have more rows than there are levels of cylinder, R stores the value of {{ncyl}} as an integer. So you can call mtcars$gear[[{{ncyl}}]] and get the value of gear for row {{ncyl}}.

Why, then, does it fail when we add that to template.Rmd?

Forgive me, it doesn't fail, it will give us gear <- mtcars$gear[[{{ncyl}}]] but we cannot then create a chunk of gear, like ```{r this-gear-{{gear}}}.

This works

```{r}
gear <- mtcars$gear[[{{ncyl}}]]
gear
```

```{r, results='asis'}
cat("### {{ncyl}} cylinders")
```

```{r mpg-histogram-{{ncyl}}cyl}
hist(mtcars$mpg[mtcars$cyl == {{ncyl}}], 
  main = paste({{ncyl}}, "cylinders"))
```

```{r weight-histogam-{{ncyl}}cyl}
hist(mtcars$wt[mtcars$cyl == {{ncyl}}], 
  main = paste({{ncyl}}, "cylinders"))
```

This does not work

```{r}
gear <- mtcars$gear[[{{ncyl}}]]
gear
```

```{r, results='asis'}
cat("### {{ncyl}} cylinders")
```

```{r mpg-histogram-{{ncyl}}cyl}
hist(mtcars$mpg[mtcars$cyl == {{ncyl}}], 
  main = paste({{ncyl}}, "cylinders"))
```

```{r weight-histogam-{{ncyl}}cyl}
hist(mtcars$wt[mtcars$cyl == {{ncyl}}], 
  main = paste({{ncyl}}, "cylinders"))
```
```{r {{gear}}}
gear
```

Giving the error

Quitting from lines 10-12 (Preview-e0d32d687661.Rmd) 
Error in eval(expr, envir, enclos) : object 'gear' not found
Calls: <Anonymous> ... knit_expand -> inline_exec -> withVisible -> eval -> eval
Execution halted
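
A likely reading of this trace: the error is raised inside knit_expand, which fills in every {{ }} placeholder before any chunk of the expanded text is run. At that point gear has not been assigned yet, because the assignment lives in a chunk. The sketch below illustrates this with a hypothetical one-line template. The template text and the placeholder name i are assumptions, and it also assumes that no object called gear exists in the calling environment.

```r
library(knitr)

# Hypothetical template line that uses both placeholders.
template <- "chunk label: this-gear-{{gear}} (row {{i}})"

# Fails: `gear` is not known when the placeholders are filled in.
failed <- tryCatch(
  knit_expand(text = template, i = 4),
  error = function(e) e
)

# Works: compute the value first and pass it as a named argument.
i <- 4
expanded <- knit_expand(text = template, i = i, gear = mtcars$gear[[i]])
```

In other words, anything referenced inside {{ }} has to be available at expansion time, either as a named argument to knit_expand or as an object visible from where knit_expand is called.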

I think I am approaching the main problem, "How do I create a markdown document for each row of a dataframe?", the wrong way with the knit_expand feature.

Can someone help me understand: 1. How to solve the main problem 2. Why the {{gear}} does not work within template.Rmd?

So, I still don't understand (2), but I think that @daroczig has gotten me close to understanding one way to solve the main problem. I don't think this is a particularly unusual problem, and I assume that there is a way to solve it without brew, pander, or rapport. In any case, I took the brew approach and did something with a few rows of a dataframe. It throws an error. Note that I am not doing anything sensible with this code: I just limit mtcars to 3 rows so I don't get too much output, and then create another, lame, dataframe within the for loop.

# My report

<%
mtcars1 <- mtcars[1:3,]
mtcars1$type <- c('red','blue','green')
t.levels <- unique(mtcars1$type)
for (ty in t.levels) {
p <- subset(mtcars1,type == ty) 
x <- rep(p, 4)
short <- paste0(p$gear, p$mpg)
%>

### <%= short %> blah

<%=
hist(x$mpg, main = paste(short, "blah"))
%>

<% } %>

This is just a slightly lame modification of the solution proposed below by @daroczig. It works if we name it demo.brew and call it with Pandoc.brew('demo.brew', output = tempfile(), convert = 'html'). It makes one silly example.

(3) Is there an example of how to do this without brew? I'm curious.

Answer to (3): Yes. This works with a for loop that iterates over the values of a variable instead of over row numbers:

varlist <- unique(df$variable)
for (var in varlist) {
    # template.Rmd can refer to var; one HTML file is written per value
    try(knit2html(input = '/Users/path/template.Rmd',
                  output = paste0('/Users/path/template', var, '.html')))
}

Works where the loop from 1:nrow() did not.

Solution

An alternative solution with pander -- based on my above comment:

# My report

<%
cyl.levels <- unique(mtcars$cyl)
for (ncyl in cyl.levels) {
%>

### <%= ncyl %> cylinders

<%=
hist(mtcars$mpg[mtcars$cyl == ncyl], main = paste(ncyl, "cylinders"))
hist(mtcars$wt[mtcars$cyl == ncyl], main = paste(ncyl, "cylinders"))
%>

<% } %>

To brew this file (named as demo.brew), run:

Pandoc.brew('demo.brew')

Or, to get an MS Word document, for example:

Pandoc.brew('demo.brew', output = tempfile(), convert = 'docx')


Update: I've just realized that you need separate documents for the categories. For that, I'd suggest giving my other package, rapport, a try; it focuses on exactly this kind of statistical report template. Quick example:

<!--head
meta:
  title: Demo for @Jessi
  author: daroczig
  description: This is a demo
  packages: ~
inputs:
- name: ncyl
  class: integer
  standalone: TRUE
  required: TRUE
head-->

### <%= ncyl %> cylinders

<%=
hist(mtcars$mpg[mtcars$cyl == ncyl], main = paste(ncyl, "cylinders"))
hist(mtcars$wt[mtcars$cyl == ncyl], main = paste(ncyl, "cylinders"))
%>

The document above (demo.rapport) is a rapport template. It has a YAML header for the metadata and the inputs (which act like parameters/arguments of an R function), and its body can include markdown and R code in brew syntax, rendered with pander. You can now run this report template with a simple call, e.g. for 4 cylinders:

> rapport('demo.rapport', ncyl = 4)

### _4_ cylinders

![](plots/rapport--home-daroczig-projects-demo.rapport-6-1.png)
![](plots/rapport--home-daroczig-projects-demo.rapport-6-2.png)

And to produce an MS Word file for each cylinder count, try this:

for (ncyl in (2:4)*2) {
    rapport.docx('/home/daroczig/projects/demo.rapport', ncyl = ncyl)
}
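
The hard-coded (2:4)*2 in the loop above enumerates 4, 6, and 8, which are the cylinder counts present in mtcars. For other data, the levels could be derived from the column itself. A small base R sketch of that idea follows; the output file name pattern is an assumption made for illustration, not something rapport produces.

```r
# Cylinder counts as hard-coded in the loop above.
levels_hardcoded <- (2:4) * 2

# The same values derived from the data, so new levels are picked up
# automatically.
levels_from_data <- sort(unique(mtcars$cyl))

# Hypothetical per-level report names, one per cylinder count.
report_names <- sprintf("demo-%d-cylinders.docx", as.integer(levels_from_data))
```

Iterating over levels_from_data instead of a literal vector keeps the loop correct when the set of categories changes.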
