Storing ggplot objects in a list from within a loop in R

Question

My problem is similar to this one: when I generate plot objects (in this case histograms) in a loop, it seems that all of them are overwritten by the most recent plot.

To debug, I print the index and the generated plot inside the loop, and both appear correctly. But when I look at the plots stored in the list, they are all identical except for the label.

(I'm using multiplot to make a composite image, but you get the same outcome if you call print(myplots[[1]]) through print(myplots[[4]]) one at a time.)

Because I already have an attached data frame (unlike the poster of the similar problem), I am not sure how to solve the problem.

(By the way, the columns are factors in the original dataset I am approximating here, but the same problem occurs if they are integers.)

Here is a reproducible example:

library(ggplot2)
source("http://peterhaschke.com/Code/multiplot.R") #load multiplot function

#make sample data
col1 <- c(2, 4, 1, 2, 5, 1, 2, 0, 1, 4, 4, 3, 5, 2, 4, 3, 3, 6, 5, 3, 6, 4, 3, 4, 4, 3, 4, 
          2, 4, 3, 3, 5, 3, 5, 5, 0, 0, 3, 3, 6, 5, 4, 4, 1, 3, 3, 2, 0, 5, 3, 6, 6, 2, 3, 
          3, 1, 5, 3, 4, 6)
col2 <- c(2, 4, 4, 0, 4, 4, 4, 4, 1, 4, 4, 3, 5, 0, 4, 5, 3, 6, 5, 3, 6, 4, 4, 2, 4, 4, 4, 
          1, 1, 2, 2, 3, 3, 5, 0, 3, 4, 2, 4, 5, 5, 4, 4, 2, 3, 5, 2, 6, 5, 2, 4, 6, 3, 3, 
          3, 1, 4, 3, 5, 4)
col3 <- c(2, 5, 4, 1, 4, 2, 3, 0, 1, 3, 4, 2, 5, 1, 4, 3, 4, 6, 3, 4, 6, 4, 1, 3, 5, 4, 3, 
          2, 1, 3, 2, 2, 2, 4, 0, 1, 4, 4, 3, 5, 3, 2, 5, 2, 3, 3, 4, 2, 4, 2, 4, 5, 1, 3, 
          3, 3, 4, 3, 5, 4)
col4 <- c(2, 5, 2, 1, 4, 1, 3, 4, 1, 3, 5, 2, 4, 3, 5, 3, 4, 6, 3, 4, 6, 4, 3, 2, 5, 5, 4,
          2, 3, 2, 2, 3, 3, 4, 0, 1, 4, 3, 3, 5, 4, 4, 4, 3, 3, 5, 4, 3, 5, 3, 6, 6, 4, 2, 
          3, 3, 4, 4, 4, 6)
data2 <- data.frame(col1,col2,col3,col4)
data2[,1:4] <- lapply(data2[,1:4], as.factor)
colnames(data2)<- c("A","B","C", "D")

#generate plots
myplots <- list()  # new empty list
for (i in 1:4) {
  p1 <- ggplot(data=data.frame(data2),aes(x=data2[ ,i]))+ 
    geom_histogram(fill="lightgreen") +
    xlab(colnames(data2)[ i])
  print(i)
  print(p1)
  myplots[[i]] <- p1  # add each plot into plot list
}
multiplot(plotlist = myplots, cols = 4)

When I look at a summary of a plot object in the plot list, this is what I see:

> summary(myplots[[1]])
data: A, B, C, D [60x4]
mapping:  x = data2[, i]
faceting: facet_null() 
-----------------------------------
geom_histogram: fill = lightgreen 
stat_bin:  
position_stack: (width = NULL, height = NULL)

I think that mapping: x = data2[, i] is the problem, but I am stumped! I can't post images, so if my explanation of the problem is confusing, you'll need to run my example and look at the graphs.

Thanks!

Recommended answer

In addition to the other excellent answer, here's a solution that uses "normal"-looking evaluation rather than eval. Since for loops have no separate variable scope (i.e. they are performed in the current environment), we need to wrap the body of the for block in local. In addition, we need to make i a local variable, which we can do by re-assigning it to its own name [1]:

myplots <- vector('list', ncol(data2))

for (i in seq_along(data2)) {
    message(i)
    myplots[[i]] <- local({
        i <- i
        p1 <- ggplot(data2, aes(x = data2[[i]])) +
            geom_histogram(fill = "lightgreen") +
            xlab(colnames(data2)[i])
        print(p1)
    })
}
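
The scoping behaviour described above does not depend on ggplot2. As a minimal base-R sketch (the names shared and isolated are illustrative and not from the original post), plain closures show the same effect: functions created directly in the loop all see the final value of i, while those created inside local() each keep their own copy.

```r
# Closures created directly in a for loop share the loop's environment,
# so each one looks up i only when it is called, after the loop has finished.
shared <- list()
for (i in 1:3) {
  shared[[i]] <- function() i
}
shared_values <- vapply(shared, function(f) f(), integer(1))

# Wrapping the body in local() and re-assigning i gives each closure
# its own environment holding its own copy of i.
isolated <- list()
for (i in 1:3) {
  isolated[[i]] <- local({
    i <- i
    function() i
  })
}
isolated_values <- vapply(isolated, function(f) f(), integer(1))

print(shared_values)    # 3 3 3
print(isolated_values)  # 1 2 3
```

The aes() mapping in the question behaves like the first loop: the expression data2[, i] is stored unevaluated and only resolved when the plot is drawn, by which time i holds its last value.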

However, an altogether cleaner way is to forgo the for loop entirely and use list functions to build the result. This can be done in several ways. The following is the easiest in my opinion:

plot_data_column = function (data, column) {
    ggplot(data, aes_string(x = column)) +
        geom_histogram(fill = "lightgreen") +
        xlab(column)
}

myplots <- lapply(colnames(data2), plot_data_column, data = data2)

This has several advantages: it's simpler, and it won't clutter the environment (with the loop variable i).
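
Note that aes_string() has been soft-deprecated since ggplot2 3.0.0. The sketch below is one possible adaptation, not the original answer's code: it uses the .data pronoun inside aes() instead. It also makes two assumptions worth flagging: df is a small hypothetical stand-in for data2, and geom_bar() replaces geom_histogram(), because recent ggplot2 versions may refuse to bin a factor column.

```r
library(ggplot2)

# Hypothetical stand-in for data2: two small factor columns.
df <- data.frame(
  A = factor(c(1, 1, 2, 3)),
  B = factor(c(1, 2, 2, 2))
)

# Same idea as plot_data_column above, but the column is looked up by name
# through the .data pronoun rather than through aes_string().
plot_data_column <- function(data, column) {
  ggplot(data, aes(x = .data[[column]])) +
    geom_bar(fill = "lightgreen") +  # assumption: bar chart for discrete x
    xlab(column)
}

myplots <- lapply(colnames(df), plot_data_column, data = df)
names(myplots) <- colnames(df)
```

Because each call to plot_data_column() has its own environment, column is fixed per plot, so the stored plots should stay distinct without any local() wrapper.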

[1] This might seem confusing: why does i <- i have any effect at all? Because by performing the assignment we create a new, local variable with the same name as the variable in the outer scope. We could equally have used a different name, e.g. local_i <- i.
