用ggplot固定填充密度图的不同部分 [英] fixed fill for different sections of a density plot with ggplot

查看:168
本文介绍了用ggplot固定填充密度图的不同部分的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

鉴于从rnorm绘制,并且截断了c,我希望绘图使用以下颜色:

Given draws from a rnorm, and cutoff c I want my plot to use the following colors:

  1. 红色代表-c
  2. 左侧的区域
  3. -cc
  4. 之间的部分为蓝色
  5. c
  6. 右侧的部分为绿色
  1. Red for the section that is to the left of -c
  2. Blue for the section in between -c and c
  3. and Green for the section that is to the right of c

例如,如果我的数据是:

For example, if my data is:

set.seed(9782)
mydata <- rnorm(1000, 0, 2)
c <- 1

我想绘制这样的东西:

但是,如果我的数据全部位于c的右侧,则整个图应为绿色.同样,如果所有元素都位于-cc之间或-c左侧,则图应全部为红色或蓝色.

But if my data is all to the right of c the whole plot should be green. Similarly, if all is between -c and c or to the left of -c the plot should be all red or blue.

这是我写的代码:

MinD <- min(mydata)
MaxD <- max(mydata)

df.plot <- data.frame(density = mydata)

if(c==0){
  case <- dplyr::case_when((MinD < 0 & MaxD >0) ~ "L_and_R",
                           (MinD > 0) ~ "R",
                           (MaxD < 0) ~ "L")
}else{
  case <- dplyr::case_when((MinD < -c & MaxD >c) ~ "ALL",
                           (MinD > -c & MaxD > c) ~ "Center_and_R",
                           (MinD > -c & MaxD <c) ~ "Center",
                           (MinD < -c & MaxD < c) ~ "Center_and_L",
                           MaxD < -c ~ "L",
                           MaxD > c ~ "R")
}

# Draw the Center

if(case %in% c("ALL", "Center_and_R", "Center", "Center_and_L")){
  ds <- density(df.plot$density, from = -c, to = c)
  ds_data_Center <- data.frame(x = ds$x, y = ds$y, section="Center")
} else{
  ds_data_Center <- data.frame(x = NA, y = NA, section="Center")
}

# Draw L

if(case %in% c("ALL", "Center_and_L", "L", "L_and_R")){
  ds <- density(df.plot$density, from = MinD, to = -c)
  ds_data_L <- data.frame(x = ds$x, y = ds$y, section="L")
} else{
  ds_data_L <- data.frame(x = NA, y = NA, section="L")
}

# Draw R

if(case %in% c("ALL", "Center_and_R", "R", "L_and_R")){
  ds <- density(df.plot$density, from = c, to = MaxD)
  ds_data_R <- data.frame(x = ds$x, y = ds$y, section="R")
} else{
  ds_data_R <- data.frame(x = NA, y = NA, section="R")
}

L_Pr <- round(mean(mydata < -c),2)
Center_Pr <- round(mean((mydata>-c & mydata<c)),2)
R_Pr <- round(mean(mydata > c),2)

filldf <- data.frame(section = c("L", "Center", "R"), 
                     Pr = c(L_Pr, Center_Pr, R_Pr), 
                     fill = c("red", "blue", "green")) %>% 
  dplyr::mutate(section = as.character(section))


if(c==0){
  ds_data <- suppressWarnings(dplyr::bind_rows(ds_data_L, ds_data_R)) %>% 
    dplyr::full_join(filldf, by = "section") %>% filter(Pr!=0) %>% 
    dplyr::full_join(filldf, by = "section") %>% mutate(section = ordered(section, levels=c("L","R"))) 
  ds_data <- ds_data[order(ds_data$section), ] %>%  
    filter(Pr!=0) %>% 
    mutate(Pr=scales::percent(Pr))
}else{
  ds_data <- suppressWarnings(dplyr::bind_rows(ds_data_Center, ds_data_L, ds_data_R)) %>% 
    dplyr::full_join(filldf, by = "section") %>% mutate(section = ordered(section, levels=c("L","Center","R"))) 
  ds_data <- ds_data[order(ds_data$section), ] %>%  
    filter(Pr!=0) %>% 
    mutate(Pr=scales::percent(Pr))
}

fillScale <- scale_fill_manual(name = paste0("c = ", c, ":"),
                               values = as.character(unique(ds_data$fill)))

p <- ggplot(data = ds_data, aes(x=x, y=y, fill=Pr)) + 
  geom_area() + fillScale 

A,我无法弄清楚如何将颜色分配给不同的部分,同时将百分比保留为颜色的标签.

Alas, I cannot figure out how to assign the colors to the different sections while keeping the percentages as labels for the colors.

推荐答案

我们使用density函数创建将实际绘制的数据框.然后,我们使用cut函数使用数据值的范围来创建组.最后,我们计算每个组的概率质量,并将其用作实际的图例标签.

We use the density function to create the data frame we'll actually plot. Then, We use the cut function to create groups using ranges of the data values. Finally, we calculate the probability mass for each group and use those as the actual legend labels.

我们还创建了一个带颜色的标记矢量,以确保相同的颜色始终在给定的x值范围内进行,无论数据是否包含在给定的x值范围内的任何值.

We also create a labeled vector of colors to ensure that the same color always goes with a given range of x-values, regardless of whether the data contains any values within a given range of x-values.

下面的代码将所有这些打包到一个函数中.

The code below packages all this into a function.

library(tidyverse)
library(gridExtra)

fill_density = function(x, cc=1, adj=1, drop_levs=FALSE) {

  # Calculate density values for input data
  dens = data.frame(density(x, n=2^10, adjust=adj)[c("x","y")]) %>% 
    mutate(section = cut(x, breaks=c(-Inf, -1, cc, Inf))) %>% 
    group_by(section) %>% 
    mutate(prob = paste0(round(sum(y)*mean(diff(x))*100),"%"))

  # Get probability mass for each level of section
  # We'll use these as the label values in scale_fill_manual
  sp = dens %>% 
    group_by(section, prob) %>% 
    summarise %>% 
    ungroup

  if(!drop_levs) {
   sp = sp %>% complete(section, fill=list(prob="0%"))
  }

  # Assign colors to each level of section
  col = setNames(c("red","blue","green"), levels(dens$section))

  ggplot(dens, aes(x, y, fill=section)) +
    geom_area() +
    scale_fill_manual(labels=sp$prob, values=col, drop=drop_levs) +
    labs(fill="")
}

现在让我们在几种不同的数据分布上运行该函数:

Now let's run the function on several different data distributions:

set.seed(3)
dat2 = rnorm(1000)
grid.arrange(fill_density(mydata), fill_density(mydata[mydata>0]),
             fill_density(mydata[mydata>2], drop_levs=TRUE), 
              fill_density(mydata[mydata>2], drop_levs=FALSE),
             fill_density(mydata[mydata < -5 | mydata > 5], adj=0.3), fill_density(dat2),
             ncol=2)

这篇关于用ggplot固定填充密度图的不同部分的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆