Gantt plot in base r - modifying plot properties

Question

I would like to ask a follow-up question related to the answer given in this post [Gantt style time line plot (in base R)] on Gantt plots in base r. I feel this is worth a new question, as I think these plots have broad appeal, and I hope a new question will attract more attention. I also need more space than the comments on that question allow in order to be specific.

The following code was given by @digEmAll. It takes a dataframe with columns for a start time, an end time, and a grouping variable, and turns that into a Gantt plot. I have modified @digEmAll's function very slightly so that the bars/segments in the Gantt plot are contiguous rather than separated by a gap. Here it is:

plotGantt <- function(data, res.col='resources', 
                      start.col='start', end.col='end', res.colors=rainbow(30))
{
  #slightly enlarge Y axis margin to make space for labels
  op <- par('mar')
  par(mar = op + c(0,1.2,0,0)) 

  minval <- min(data[,start.col])
  maxval <- max(data[,end.col])

  res.colors <- rev(res.colors)
  resources <- sort(unique(data[,res.col]),decreasing=T)


  plot(c(minval,maxval),
       c(0.5,length(resources)+0.5),
       type='n', xlab='Duration',ylab=NA,yaxt='n' )
  axis(side=2,at=1:length(resources),labels=resources,las=1)
  for(i in 1:length(resources))
  {
    yTop <- i+0.5
    yBottom <- i-0.5
    subset <- data[data[,res.col] == resources[i],]
    for(r in 1:nrow(subset))
    {
      color <- res.colors[((i-1)%%length(res.colors))+1]
      start <- subset[r,start.col]
      end <- subset[r,end.col]
      rect(start,yBottom,end,yTop,col=color)
    }
  }
  par(mar=op) # reset the plotting margins (op holds only the 'mar' values, so it must be named)
}

Here are some sample data. You will notice that I have four groups, 1-4. However, not all dataframes contain all four groups. Some have only two, some only three.

mydf1 <- data.frame(startyear=2000:2009, endyear=2001:2010, group=c(1,1,1,1,2,2,2,1,1,1))
mydf2 <- data.frame(startyear=2000:2009, endyear=2001:2010, group=c(1,1,2,2,3,4,3,2,1,1))
mydf3 <- data.frame(startyear=2000:2009, endyear=2001:2010, group=c(4,4,4,4,4,4,3,2,3,3))
mydf4 <- data.frame(startyear=2000:2009, endyear=2001:2010, group=c(1,1,1,2,3,3,3,2,1,1))

Here I run the above function, but specify four colors for plotting:

plotGantt(mydf1, res.col='group', start.col='startyear', end.col='endyear', 
          res.colors=c('red','orange','yellow','gray99'))

plotGantt(mydf2, res.col='group', start.col='startyear', end.col='endyear', 
          res.colors=c('red','orange','yellow','gray99'))

plotGantt(mydf3, res.col='group', start.col='startyear', end.col='endyear', 
          res.colors=c('red','orange','yellow','gray99'))

plotGantt(mydf4, res.col='group', start.col='startyear', end.col='endyear', 
          res.colors=c('red','orange','yellow','gray99'))

These are the plots:

What I would like to do is modify the function so that:

1) it will plot on the y-axis all four groups regardless of whether they actually appear in the data or not.

2) the same color is associated with each group in every plot, regardless of how many groups there are. As you can see, mydf2 has four groups and all four colors are plotted (1-red, 2-orange, 3-yellow, 4-gray). The same colors go with the same groups for mydf3, because it contains only groups 2, 3 and 4 and the colors are picked in reverse order. However, mydf1 and mydf4 get different colors for each group because they have no group 4. Gray is still the first color chosen, but now it is used for the highest-numbered group present, which is drawn lowest on the plot (group 2 in mydf1 and group 3 in mydf4).

It appears to me that the main thing I need to work on is the vector 'resources' inside the function, so that it contains all the groups rather than only the unique groups present in the data. When I try manually overriding it to make sure it contains all the groups, e.g. by doing something as simple as resources <- as.factor(1:4), I get an error:

'Error in rect(start, yBottom, end, yTop, col = color) : cannot mix zero-length and non-zero-length coordinates'

Presumably the for loop does not know how to plot data that do not exist for groups that don't exist.
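
The error can be reproduced outside the function. The following is a minimal sketch using a small made-up data frame (`df`, `empty` and the index names are illustrative, not part of the original post). It suggests the failure comes from `1:nrow(subset)` counting down to zero when a group has no rows, and shows `seq_len()` as one possible guard.

```r
# Illustrative data: only groups 1 and 2 are present
df <- data.frame(startyear = 2000:2001, endyear = 2001:2002, group = c(1, 2))

# Group 4 has no rows, so the subset is an empty data frame
empty <- df[df[, "group"] == 4, ]

# 1:0 counts down, so a loop over it would run for r = 1 and then r = 0
bad_index <- 1:nrow(empty)

# Row 0 selects nothing, which hands rect() a zero-length coordinate
zero_len <- empty[0, "startyear"]

# seq_len(0) is empty, so a loop over it would skip the body entirely
good_index <- seq_len(nrow(empty))
```

Under this reading, replacing `1:nrow(subset)` with `seq_len(nrow(subset))` inside the function would let the loop skip groups that have no rows.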

I hope that this is a replicable/readable question and it's clear what I'm trying to do.

I realize that to solve the color problem, I could just specify the colors for the 3 groups that exist in each of these sample dfs. However, my intention is to use this plot as an output to a function whereby it wouldn't be known ahead of time if all of the groups exist for a particular df.

Recommended answer

I slightly modified your function to account for NA in the start and end dates:

plotGantt <- function(data, res.col='resources', 
                      start.col='start', end.col='end', res.colors=rainbow(30))
{
  #slightly enlarge Y axis margin to make space for labels
  op <- par('mar')
  par(mar = op + c(0,1.2,0,0)) 

  minval <- min(data[,start.col],na.rm=TRUE)
  maxval <- max(data[,end.col],na.rm=TRUE)

  res.colors <- rev(res.colors)
  resources <- sort(unique(data[,res.col]),decreasing=T)


  plot(c(minval,maxval),
       c(0.5,length(resources)+0.5),
       type='n', xlab='Duration',ylab=NA,yaxt='n' )
  axis(side=2,at=1:length(resources),labels=resources,las=1)
  for(i in 1:length(resources))
  {
    yTop <- i+0.5
    yBottom <- i-0.5
    subset <- data[data[,res.col] == resources[i],]
    for(r in 1:nrow(subset))
    {
      color <- res.colors[((i-1)%%length(res.colors))+1]
      start <- subset[r,start.col]
      end <- subset[r,end.col]
      rect(start,yBottom,end,yTop,col=color)
    }
  }
  par(mar=op) # reset the plotting margins
  invisible()
}

This way, if you simply append all the possible group values to your data, they will all be printed on the y axis. For example:

mydf1 <- data.frame(startyear=2000:2009, endyear=2001:2010,
                    group=c(1,1,1,1,2,2,2,1,1,1))
# add all the group values you want to print with NA dates
mydf1 <- rbind(mydf1,data.frame(startyear=NA,endyear=NA,group=1:4))

plotGantt(mydf1, res.col='group', start.col='startyear', end.col='endyear', 
          res.colors=c('red','orange','yellow','gray99'))
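
Since the question mentions that the groups present are not known ahead of time, the padding step could be wrapped in a small helper. The sketch below is only an illustration: `padGroups` and its arguments are hypothetical names, and it assumes the full set of expected groups is supplied by the caller.

```r
# Hypothetical helper: append one NA-dated row per expected group
padGroups <- function(data, groups, res.col = "group",
                      start.col = "startyear", end.col = "endyear") {
  n <- length(groups)
  pad <- data.frame(rep(NA, n), rep(NA, n), groups)
  names(pad) <- c(start.col, end.col, res.col)
  # Keep only the three plotting columns so the names line up for rbind()
  rbind(data[, c(start.col, end.col, res.col)], pad)
}

mydf1 <- data.frame(startyear = 2000:2009, endyear = 2001:2010,
                    group = c(1, 1, 1, 1, 2, 2, 2, 1, 1, 1))
padded <- padGroups(mydf1, groups = 1:4)
```

The padded data frame can then be passed to plotGantt in place of mydf1. The NA rows contribute a label on the y axis but no rectangle.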

About the colors: at the moment the res.colors are applied, in order, to the sorted groups, so the 1st color in res.colors is applied to the 1st (sorted) group, and so on.
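
If every data frame is padded with all four groups as above, this positional matching already gives each group the same color in every plot. An alternative that does not depend on which groups are present is to look colors up by group label. The sketch below assumes the full set of groups is known in advance; `all_groups`, `group_colors` and `colorFor` are illustrative names, not part of the original function.

```r
# Assumption: the complete set of groups is known ahead of time
all_groups <- 1:4
group_colors <- setNames(c("red", "orange", "yellow", "gray99"), all_groups)

# Look the color up by group label instead of by position among sorted groups
colorFor <- function(g) unname(group_colors[as.character(g)])

two <- colorFor(2)            # "orange", whichever groups a data frame contains
several <- colorFor(c(4, 2))  # "gray99" "orange"
```

Inside plotGantt, the positional lookup could then be swapped for something like `color <- colorFor(resources[i])`, with the named vector passed in as an extra argument.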
