Using knitr to produce complex dynamic documents


Question

The minimal reproducible example (RE) below is my attempt to figure out how to use knitr to generate complex dynamic documents, where "complex" refers not to the document's elements and their layout, but to the non-linear logic of the underlying R code chunks. While the RE and its results suggest that a solution based on this approach might work well, I would like to know: 1) is this a correct way to use knitr in such situations; 2) are there any optimizations that could improve the approach; 3) what alternative approaches could decrease the granularity of the code chunks?

EDA source code (file "reEDA.R"):

## @knitr CleanEnv
rm(list = ls(all.names = TRUE))

## @knitr LoadPackages
library(psych)
library(ggplot2)

## @knitr PrepareData

set.seed(100) # for reproducibility
data(diamonds, package='ggplot2')  # use built-in data


## @knitr PerformEDA

generatePlot <- function (df, colName) {

  df <- df
  df$var <- df[[colName]]

  g <- ggplot(data.frame(df)) +
    scale_fill_continuous("Density", low="#56B1F7", high="#132B43") +
    scale_x_log10("Diamond Price [log10]") +
    scale_y_continuous("Density") +
    geom_histogram(aes(x = var, y = ..density..,
                       fill = ..density..),
                   binwidth = 0.01)
  return (g)
}

performEDA <- function (data) {

  d_var <- paste0("d_", deparse(substitute(data)))
  assign(d_var, describe(data), envir = .GlobalEnv)

  for (colName in names(data)) {
    if (is.numeric(data[[colName]]) || is.factor(data[[colName]])) {
      t_var <- paste0("t_", colName)
      assign(t_var, summary(data[[colName]]), envir = .GlobalEnv)

      g_var <- paste0("g_", colName)
      assign(g_var, generatePlot(data, colName), envir = .GlobalEnv)
    }
  }
}

performEDA(diamonds)

EDA report R Markdown document (file "reEDA.Rmd"):

```{r KnitrSetup, echo=FALSE, include=FALSE}
library(knitr)
opts_knit$set(progress = TRUE, verbose = TRUE)
opts_chunk$set(
  echo = FALSE,
  include = FALSE,
  tidy = FALSE,
  warning = FALSE,
  comment=NA
)
```

```{r ReadChunksEDA, cache=FALSE}
read_chunk('reEDA.R')
```

```{r CleanEnv}
```

```{r LoadPackages}
```

```{r PrepareData}
```

Narrative: Data description

```{r PerformEDA}
```

Narrative: Intro to EDA results

Let's look at summary descriptive statistics for our dataset

```{r DescriptiveDataset, include=TRUE}
print(d_diamonds)
```

Now, let's examine each variable of interest individually.

Variable Price is ... Descriptive statistics for 'Price':

```{r DescriptivePrice, include=TRUE}
print(t_price)
```

Finally, let's examine price distribution across the dataset visually:

```{r VisualPrice, include=TRUE, fig.align='center'}
print(g_price)
```

The results can be found here:

http://rpubs.com/abrpubs/eda1

Recommended answer

I don't understand what's non-linear about this code; perhaps because the example (thanks for that by the way) is small enough to demonstrate the code but not large enough to demonstrate the concern.

In particular, I don't understand the reason for the performEDA function. Why not put that functionality into the markdown? It would seem to be simpler and clearer to read. (This is untested...)

Let's look at summary descriptive statistics for our dataset

```{r DescriptiveDataset, include=TRUE}
print(describe(diamonds))
```

Now, let's examine each variable of interest individually.

Variable Price is ... Descriptive statistics for 'Price':

```{r DescriptivePrice, include=TRUE}
print(summary(diamonds[["price"]]))
```

Finally, let's examine price distribution across the dataset visually:

```{r VisualPrice, include=TRUE, fig.align='center'}
print(generatePlot(diamonds, "price"))
```

It looked like you were going to show the plots for all the variables; are you perhaps looking to loop there?
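If a loop is what is wanted, a single chunk can iterate over the columns and print each summary and plot in turn. The following is only a rough sketch of that idea, not the asker's code: it assumes ggplot2 is installed, restricts itself to numeric columns, and uses a plain histogram with an arbitrary `bins = 50` instead of the log-scaled `generatePlot` from the question. Inside a loop, ggplot objects have to be printed explicitly for knitr to render them.

```r
library(ggplot2)
data(diamonds, package = "ggplot2")

# Names of the numeric columns only (the factor columns are skipped here)
numericCols <- names(diamonds)[vapply(diamonds, is.numeric, logical(1))]

for (colName in numericCols) {
  # Summary table for this column
  print(summary(diamonds[[colName]]))

  # Plain histogram; print() is required inside a loop
  p <- ggplot(diamonds, aes(x = .data[[colName]])) +
    geom_histogram(bins = 50) +
    labs(x = colName, y = "Count")
  print(p)
}
```

Placed in one chunk with `include=TRUE`, this would emit every summary followed by its plot, at the cost of not being able to interleave hand-written narrative between variables.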

Also, this wouldn't change the functionality, but it would be much more within the R idiom to have performEDA return a list with the things it had created, rather than assigning into the global environment. It took me a while to figure out what the code did as those new variables didn't seem to be defined anywhere.
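As a rough illustration of the list-returning style, the sketch below rewrites `performEDA` so that nothing is assigned into the global environment. It is an assumption-laden variant rather than a drop-in replacement: `plotColumn` is a hypothetical helper invented here (histogram for numeric columns, bar chart for factors), base `summary()` stands in for the dataset-level description, and `describe()` from psych could be substituted in its place.

```r
library(ggplot2)

# Hypothetical helper: histogram for numeric columns, bar chart for factors
plotColumn <- function(df, colName) {
  p <- ggplot(df, aes(x = .data[[colName]])) + labs(x = colName, y = "Count")
  if (is.numeric(df[[colName]])) {
    p + geom_histogram(bins = 50)
  } else {
    p + geom_bar()
  }
}

# Variant of performEDA that returns its results as a named list
performEDA <- function(data) {
  keep <- vapply(data, function(x) is.numeric(x) || is.factor(x), logical(1))
  cols <- names(data)[keep]
  names(cols) <- cols  # lapply() keeps these names on its result

  list(
    overall = summary(data),  # psych::describe(data) could go here instead
    tables  = lapply(cols, function(cn) summary(data[[cn]])),
    plots   = lapply(cols, function(cn) plotColumn(data, cn))
  )
}

data(diamonds, package = "ggplot2")
eda <- performEDA(diamonds)

print(eda$tables$price)  # per-variable summary
print(eda$plots$price)   # per-variable plot
```

With this shape, a report chunk refers to `eda$tables$price` or `eda$plots$price`, so a reader can see where each object comes from without searching for `assign()` calls.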
