ggplot2 y-axis order changes after subsetting

Question

I have a function that works as expected until I subset the data passed to it. The function, plotCalendar(), is my attempt at a calendar heat map using ggplot2 with facets. The y-axis order is important because it represents "WeekOfMonth": when the order is reversed, the visualization no longer looks like a calendar.

The code is below: first the calling code, then the function that generates some data, generateData(), then the plot function, plotCalendar().

The code works as expected when I use df for the data, but when I use df2, the subsetted data, the order of WeekOfMonth is reversed along the y-axis.

library(ggplot2)
library(ProgGUIinR)
library(chron)

df <- generateData()
plotCalendar(df, dateFieldName = "dates", numericFieldName = "counts", yLab = "Month of Year")
df2 <- df[df$filterField == 42, ]
plotCalendar(df2, dateFieldName = "dates", numericFieldName = "counts", yLab = "Month of Year")

The two functions, one to generate the test data and one to plot the calendar:

generateData <- function()
{
      set.seed(42)
      dates <- seq(as.Date("2012/01/01"), as.Date("2012/6/30"), by = "1 day")
      counts <- 1:length(dates)
      filterField <- sample(1:42,length(dates),replace=T)
      df <- data.frame(dates, counts, filterField)

      return(df)
}


plotCalendar <- function(data, dateFieldName, numericFieldName, title = "Title", yLab = "Y Label", fillLab = "Fill Label", lowColor = "moccasin", highColor = "dodgerblue")
{

      agg <- aggregate(as.formula(paste(numericFieldName, "~", dateFieldName)), data, sum)

      names(agg)[names(agg) == dateFieldName] <- "DateField"
      names(agg)[names(agg) == numericFieldName] <- "NumericField"

      minMonth <- as.POSIXlt(min(agg$DateField))$mon + 1
      maxMonth <- as.POSIXlt(max(agg$DateField))$mon + 1

      minYear <- as.POSIXlt(min(agg$DateField))$year + 1900
      maxYear <- as.POSIXlt(max(agg$DateField))$year + 1900 

      minDate <- ISOdate(minYear, minMonth, 1)
      maxDate <- ISOdate(maxYear, maxMonth, 1)
      maxDateEndMonth <- as.POSIXlt(as.Date(seq(maxDate, length = 2, by = "1 month")[2]))
      daySeq <- seq(minDate, maxDateEndMonth, by = "1 day")

      daySeq <- as.data.frame(daySeq)
      names(daySeq) <- c("DateField")
      daySeq$DateField <- as.Date(daySeq$DateField)
      agg$DateField <- as.Date(agg$DateField)

      agg <- merge(daySeq, agg, by = "DateField", all.x = T)

      agg$Day <- as.numeric(days(agg$DateField))

      agg$Weekday <- weekdays(agg$DateField)
      agg$Weekday <- factor(agg$Weekday, levels = rev(c("Saturday", "Friday", "Thursday", "Wednesday", "Tuesday", "Monday", "Sunday")))

      agg$Month <- months(agg$DateField)
      agg$Month <- factor(agg$Month, levels = c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"))

      agg$MonthNumber <- as.POSIXlt(agg$DateField)$mon + 1

      agg$Year <-  as.POSIXlt(agg$DateField)$year + 1900

      agg$WeekOfMonth <- 1 + week.of.month(agg$Year, agg$MonthNumber, agg$Day)
      agg$WeekOfMonth <- factor(agg$WeekOfMonth, levels = 6:1)

      #makeSpreadsheet(gActs, "Group Activities - Member Participation")

      View(agg)
      p <- ggplot(agg)
      p <- p + aes(Year, WeekOfMonth, fill = NumericField)

      noData <- subset(agg, is.na(agg$NumericField))

      p <- p + geom_tile(data = subset(agg, !is.na(agg$NumericField)), aes(fill = NumericField), color = "gray")
      if(nrow(noData) > 0)
      {
        p <- p + geom_tile(data = noData, color = "gray", fill = "white")
      }

      p <- p + geom_text(aes(label = paste(paste(rep(" ", 5), collapse = ""), Day)), vjust = 0, size = 3, colour = "black")
      p <- p + geom_text(data = subset(agg, !is.na(NumericField)), aes(label = NumericField), size = 4, vjust = 0.5, hjust = 1, color = 'black', fontface = "bold")
      p <- p + facet_grid(Month ~ Weekday) + scale_fill_gradient(low = lowColor, high = highColor)
      p <- p + labs(title = paste(title, "\n"), y = paste(yLab, "\n"), fill = fillLab)
      p <- p + theme(plot.title = element_text(size = 20, face="bold"),  
                     axis.title.x = element_blank(), 
                     axis.ticks.x = element_blank(),
                     axis.text.x = element_blank(),
                     axis.title.y = element_text(size = 16, face = "bold"), 
                     legend.title = element_text(size = 14, face = "bold"), 
                     legend.text = element_text(size = 11),
                     panel.grid.major = element_blank(),
                     panel.grid.minor = element_blank(),
                     strip.text = element_text(size = 14, face = "bold"))
      plot(p)
}

Thanks

Paul

Recommended answer

If you reverse the order of the two tile layers, it works.

Current:

p <- ggplot(agg, aes(Year, WeekOfMonth, fill = NumericField))
noData <- subset(agg, is.na(agg$NumericField))
p <- p + geom_tile(data = subset(agg, !is.na(agg$NumericField)), aes(fill = NumericField), color = "gray")
if(nrow(noData) > 0) p <- p + geom_tile(data = noData, color = "gray", fill = "white")

New:

p <- ggplot(agg,aes(Year, WeekOfMonth, fill = NumericField))  
noData <- subset(agg, is.na(agg$NumericField)) 
if(nrow(noData) > 0) p <- p + geom_tile(data = noData, color = "gray", fill = "white")
p <- p + geom_tile(data = subset(agg, !is.na(agg$NumericField)), aes(fill = NumericField), color = "gray")

I think the problem has to do with ggplot's treatment of factors that have missing levels, e.g., agg$WeekOfMonth. One way around this is to avoid making agg$WeekOfMonth a factor.
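A minimal base-R sketch of what "missing levels" means here, using made-up values rather than the question's data. The last step is only an illustration of the hypothesis above: whether ggplot2 combines the levels of two layers this way depends on the installed versions of ggplot2 and scales, so treat it as an assumption.

```r
# Toy week-of-month factor with the same reversed level order as plotCalendar().
wk <- factor(c(1, 2, 3, 4, 5, 6), levels = 6:1)

# Pretend week 6 only occurs on days without data, as can happen after subsetting.
has_data <- wk != "6"
with_data <- wk[has_data]

# Subsetting keeps all six declared levels ...
declared <- levels(with_data)
# ... but only five of them are actually used by the rows that have data.
used <- levels(droplevels(with_data))

# Assumed mechanism: if unused levels are dropped for the first layer and a
# later layer then adds "6", re-sorting the combined labels gives ascending order.
resorted <- sort(union(used, "6"))

print(declared)
print(used)
print(resorted)
```

With the full data, the first tile layer already contains every week, so no level is added later and the declared order is kept.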

agg$WeekOfMonth <- 1 + week.of.month(agg$Year, agg$MonthNumber, agg$Day)
p <- ggplot(agg)
p <- p + aes(Year, -WeekOfMonth, fill = NumericField)  
noData <- subset(agg, is.na(agg$NumericField))
p <- p + geom_tile(data = subset(agg, !is.na(agg$NumericField)), aes(fill = NumericField), color = "gray")
if(nrow(noData) > 0)p <- p + geom_tile(data = noData, color = "gray", fill = "white")

To avoid negative y-axis labels, you have to add:

p <- p + scale_y_continuous(labels = abs)

to the ggplot layer definitions. This produces the same plot as above, and does not require reversing the order of the tile layers.
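The negated-axis idea can be tried in isolation with a small sketch. The data frame toy and its columns are hypothetical and not part of the question's code; the explicit breaks argument is an addition so that every week gets a tick.

```r
library(ggplot2)

# Hypothetical toy data: one tile per week-of-month, weeks stored as plain numbers.
toy <- data.frame(day = "Sunday", week = 1:6, value = c(3, NA, 5, 1, NA, 2))

# Negating week puts week 1 at the top of the axis;
# labels = abs prints 1..6 on the axis instead of -1..-6.
p <- ggplot(toy, aes(day, -week, fill = value)) +
  geom_tile(color = "gray") +
  scale_y_continuous(breaks = -(1:6), labels = abs)

print(p)
```

Because the y values are numeric, their order does not depend on which factor levels happen to be present in a given layer.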

Edit: found a better way.

By using the na.value = ... argument of the fill scale (scale_fill_gradient(...) in the code below), you can avoid multiple datasets completely.

p <- ggplot(agg)
p <- p + aes(Year, WeekOfMonth, fill = NumericField)
p <- p + geom_tile(aes(fill = NumericField), color = "gray")
p <- p + scale_fill_gradient(low = lowColor, high = highColor, na.value="white")

This avoids the need for noData altogether.
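A self-contained sketch of the same single-layer approach on hypothetical toy data (the toy data frame is not from the question). The day with a missing count is drawn by the same geom_tile() call and simply receives the na.value color.

```r
library(ggplot2)

# Hypothetical toy data: three days in one week, the middle one without a count.
toy <- data.frame(
  weekday = factor(c("Sunday", "Monday", "Tuesday"),
                   levels = c("Sunday", "Monday", "Tuesday")),
  week = 1,
  counts = c(4, NA, 9)
)

# One tile layer for all rows; NA counts are filled with na.value.
p <- ggplot(toy, aes(weekday, week, fill = counts)) +
  geom_tile(color = "gray") +
  scale_fill_gradient(low = "moccasin", high = "dodgerblue", na.value = "white")

print(p)
```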

Finally, I suppose you have a reason for displaying the calendars this way, but IMO the following is a more intuitive calendar view.

gg.calendar <- function(df) {
  require(ggplot2)
  require(lubridate)
  wom <- function(date) { # week-of-month
    first <- wday(as.Date(paste(year(date),month(date),1,sep="-")))
    return((mday(date)+(first-2)) %/% 7+1)
  }
  df$month <- month(df$dates)
  df$day   <- mday(df$dates)

  rng   <- range(df$dates)
  rng   <- as.Date(paste(year(rng),month(rng),1,sep="-"))
  start <- rng[1]
  end   <- rng[2]
  month(end) <- month(end)+1
  day(end)   <- day(end)  -1

  cal <- data.frame(dates=seq(start,end,by="day"))
  cal$year  <- year(cal$dates)
  cal$month <- month(cal$dates)
  cal$cmonth<- month(cal$dates,label=T)
  cal$day   <- mday(cal$dates)
  cal$cdow  <- wday(cal$dates,label=T)
  cal$dow   <- wday(cal$dates)
  cal$week  <- wom(cal$dates)

  cal        <- merge(cal,df[,c("dates","counts")],all.x=T)

  ggplot(cal, aes(x=cdow,y=-week))+
    geom_tile(aes(fill=counts),colour="grey50")+
    geom_text(aes(label=day),size=3,colour="grey20")+
    facet_wrap(~cmonth, ncol=3)+
    scale_fill_gradient(low = "moccasin", high = "dodgerblue", na.value="white")+
    scale_x_discrete(labels=c("S","M","T","W","Th","F","S"))+
    theme(axis.text.y=element_blank(),axis.ticks.y=element_blank())+
    theme(panel.grid=element_blank())+
    labs(x="",y="")+
    coord_fixed()
}
gg.calendar(df)
gg.calendar(df2)
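The week-of-month arithmetic in wom() can be checked without lubridate. The sketch below is a base-R re-implementation written for illustration (wom_base is a hypothetical name, not part of the answer); it assumes weeks start on Sunday, as lubridate's wday() does by default.

```r
# Base-R version of the week-of-month helper, assuming Sunday-based weeks.
wom_base <- function(date) {
  lt <- as.POSIXlt(date)
  # Weekday of the first day of the month, 1 = Sunday ... 7 = Saturday.
  first <- as.POSIXlt(as.Date(format(date, "%Y-%m-01")))$wday + 1
  # Shift the day of the month by the offset of the first weekday,
  # then count how many full weeks have passed.
  (lt$mday + first - 2) %/% 7 + 1
}

dates <- as.Date(c("2012-01-01", "2012-01-07", "2012-01-08",
                   "2012-02-01", "2012-06-30"))
weeks <- wom_base(dates)
print(weeks)  # 1 1 2 1 5
```

January 2012 starts on a Sunday, so the 1st through the 7th fall in week 1 and the 8th opens week 2; June 2012 starts on a Friday, which puts the 30th in week 5.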
