Fixed fill for different sections of a density plot with ggplot
Question
Given draws from rnorm and a cutoff c, I want my plot to use the following colors:
- Red for the section to the left of -c
- Blue for the section between -c and c
- Green for the section to the right of c
For example, if my data is:
set.seed(9782)
mydata <- rnorm(1000, 0, 2)
c <- 1
I want to plot something like this:
But if my data is all to the right of c, the whole plot should be green. Similarly, if all of it lies between -c and c, or to the left of -c, the plot should be all blue or all red, respectively.
This is the code I wrote:
library(dplyr)    # %>%, case_when, filter, mutate, bind_rows, full_join
library(ggplot2)  # ggplot, geom_area, scale_fill_manual

MinD <- min(mydata)
MaxD <- max(mydata)
df.plot <- data.frame(density = mydata)

# Work out which sections the data actually covers
if (c == 0) {
  case <- dplyr::case_when((MinD < 0 & MaxD > 0) ~ "L_and_R",
                           (MinD > 0) ~ "R",
                           (MaxD < 0) ~ "L")
} else {
  case <- dplyr::case_when((MinD < -c & MaxD > c) ~ "ALL",
                           (MinD > -c & MaxD > c) ~ "Center_and_R",
                           (MinD > -c & MaxD < c) ~ "Center",
                           (MinD < -c & MaxD < c) ~ "Center_and_L",
                           MaxD < -c ~ "L",
                           MaxD > c ~ "R")
}

# Draw the Center
if (case %in% c("ALL", "Center_and_R", "Center", "Center_and_L")) {
  ds <- density(df.plot$density, from = -c, to = c)
  ds_data_Center <- data.frame(x = ds$x, y = ds$y, section = "Center")
} else {
  ds_data_Center <- data.frame(x = NA, y = NA, section = "Center")
}

# Draw L
if (case %in% c("ALL", "Center_and_L", "L", "L_and_R")) {
  ds <- density(df.plot$density, from = MinD, to = -c)
  ds_data_L <- data.frame(x = ds$x, y = ds$y, section = "L")
} else {
  ds_data_L <- data.frame(x = NA, y = NA, section = "L")
}

# Draw R
if (case %in% c("ALL", "Center_and_R", "R", "L_and_R")) {
  ds <- density(df.plot$density, from = c, to = MaxD)
  ds_data_R <- data.frame(x = ds$x, y = ds$y, section = "R")
} else {
  ds_data_R <- data.frame(x = NA, y = NA, section = "R")
}

# Share of the observations in each section
L_Pr <- round(mean(mydata < -c), 2)
Center_Pr <- round(mean((mydata > -c & mydata < c)), 2)
R_Pr <- round(mean(mydata > c), 2)

filldf <- data.frame(section = c("L", "Center", "R"),
                     Pr = c(L_Pr, Center_Pr, R_Pr),
                     fill = c("red", "blue", "green")) %>%
  dplyr::mutate(section = as.character(section))

if (c == 0) {
  ds_data <- suppressWarnings(dplyr::bind_rows(ds_data_L, ds_data_R)) %>%
    dplyr::full_join(filldf, by = "section") %>% filter(Pr != 0) %>%
    dplyr::full_join(filldf, by = "section") %>% mutate(section = ordered(section, levels = c("L", "R")))
  ds_data <- ds_data[order(ds_data$section), ] %>%
    filter(Pr != 0) %>%
    mutate(Pr = scales::percent(Pr))
} else {
  ds_data <- suppressWarnings(dplyr::bind_rows(ds_data_Center, ds_data_L, ds_data_R)) %>%
    dplyr::full_join(filldf, by = "section") %>% mutate(section = ordered(section, levels = c("L", "Center", "R")))
  ds_data <- ds_data[order(ds_data$section), ] %>%
    filter(Pr != 0) %>%
    mutate(Pr = scales::percent(Pr))
}

fillScale <- scale_fill_manual(name = paste0("c = ", c, ":"),
                               values = as.character(unique(ds_data$fill)))

p <- ggplot(data = ds_data, aes(x = x, y = y, fill = Pr)) +
  geom_area() + fillScale
Alas, I cannot figure out how to assign the colors to the different sections while keeping the percentages as labels for the colors.
Recommended answer
We use the density function to create the data frame we'll actually plot. Then we use the cut function to create groups using ranges of the data values. Finally, we calculate the probability mass for each group and use those as the actual legend labels.
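The three steps above (density, cut, probability mass) can be sketched in base R without any plotting. This is only an illustration under assumptions: it reuses the question's mydata, and the name cutoff is introduced here in place of c so that base R's c() is not shadowed. The mass of each section is approximated by a simple Riemann sum over the density grid.

```r
# Sketch only: base R, no plotting. `cutoff` is an illustrative name.
set.seed(9782)
mydata <- rnorm(1000, 0, 2)
cutoff <- 1

# Kernel density estimate on an evenly spaced grid of 1024 points
d <- density(mydata, n = 2^10)
dens <- data.frame(x = d$x, y = d$y)

# Assign each grid point to one of three sections:
# left of -cutoff, between -cutoff and cutoff, right of cutoff
dens$section <- cut(dens$x, breaks = c(-Inf, -cutoff, cutoff, Inf))

# Approximate the probability mass of each section (sum of heights * grid step)
dx <- mean(diff(dens$x))
mass <- tapply(dens$y, dens$section, sum) * dx

# Percent labels of the kind used in the legend
labels <- paste0(round(mass * 100), "%")
print(data.frame(section = names(mass), label = labels))
```

Because the grid covers essentially the whole estimated density, the three masses should add up to roughly 1.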
We also create a named vector of colors to ensure that the same color always goes with a given range of x-values, regardless of whether the data contains any values within that range.
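As a small illustration of why the color vector is keyed by name, the sketch below builds sections for data that lies entirely to the right of the cutoff. The sample values and the variable names (sections, cols) are made up for this example. Because cut keeps all three levels even when only one is observed, looking a color up by level name still returns green for the right-hand section; scale_fill_manual(values = ...) matches a named vector to the factor levels in the same way.

```r
# Sketch only: illustrative values, all to the right of the cutoff of 1
x_right <- c(1.5, 2, 3)
sections <- cut(x_right, breaks = c(-Inf, -1, 1, Inf))

# cut() keeps all three levels, including the two with no observations
print(table(sections))

# Key the colors by level name rather than by position
cols <- setNames(c("red", "blue", "green"), levels(sections))

# The only observed section maps to green, not to the first color in the vector
observed <- as.character(unique(sections))
print(unname(cols[observed]))
```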
The code below packages all this into a function.
library(tidyverse)
library(gridExtra)

# x: numeric data; cc: cutoff (assumed > 0); adj: bandwidth adjustment;
# drop_levs: drop sections with no density values from the legend
fill_density = function(x, cc=1, adj=1, drop_levs=FALSE) {

  # Calculate density values for input data and assign each grid point
  # to a section: left of -cc, between -cc and cc, right of cc
  dens = data.frame(density(x, n=2^10, adjust=adj)[c("x","y")]) %>%
    mutate(section = cut(x, breaks=c(-Inf, -cc, cc, Inf))) %>%
    group_by(section) %>%
    mutate(prob = paste0(round(sum(y)*mean(diff(x))*100),"%"))

  # Get probability mass for each level of section
  # We'll use these as the label values in scale_fill_manual
  sp = dens %>%
    group_by(section, prob) %>%
    summarise() %>%
    ungroup()

  # Unless levels are dropped, add a "0%" label for each empty section
  if(!drop_levs) {
    sp = sp %>% complete(section, fill=list(prob="0%"))
  }

  # Assign colors to each level of section
  col = setNames(c("red","blue","green"), levels(dens$section))

  ggplot(dens, aes(x, y, fill=section)) +
    geom_area() +
    scale_fill_manual(labels=sp$prob, values=col, drop=drop_levs) +
    labs(fill="")
}
Now let's run the function on several different data distributions:
set.seed(3)
dat2 = rnorm(1000)

grid.arrange(fill_density(mydata), fill_density(mydata[mydata>0]),
             fill_density(mydata[mydata>2], drop_levs=TRUE),
             fill_density(mydata[mydata>2], drop_levs=FALSE),
             fill_density(mydata[mydata < -5 | mydata > 5], adj=0.3), fill_density(dat2),
             ncol=2)
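As a rough cross-check on the legend labels, the share of raw observations in each section can be computed directly from the data. This is a sketch, not part of the answer's function: section_share and cutoff are hypothetical names introduced here, and the data is the question's mydata.

```r
# Sketch only: `section_share` and `cutoff` are illustrative names
set.seed(9782)
mydata <- rnorm(1000, 0, 2)
cutoff <- 1

# Share of raw observations in each section; table() keeps empty levels
section_share <- function(x, cutoff) {
  sec <- cut(x, breaks = c(-Inf, -cutoff, cutoff, Inf))
  prop.table(table(sec))
}

# Full data: all three sections are populated
print(round(section_share(mydata, cutoff), 2))

# Data restricted to values above 2: the left and center sections are empty
print(round(section_share(mydata[mydata > 2], cutoff), 2))
```

The kernel density estimate smooths the data, so the percentages shown in the plot legends may differ slightly from these empirical shares.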