Shade an area under density curve, to mark the Highest Density Interval (HDI)

Question

I thought this should be straightforward, but I'm lost, despite tons of information online.

My problem: I have a vector of data points, for which I want to plot a density curve and then color the area under the curve to signify the Highest Density Interval (HDI). Naturally, I'm trying to achieve this with the ggplot2 package, and specifically with qplot(), since my data comes as a vector and not a data frame.

Reproducible example

library(ggplot2)
library(HDInterval)

## create data vector
set.seed(789)
dat <- rnorm(1000)

## plot density curve with qplot and mark 95% hdi
qplot(dat, geom = "density")+ 
  geom_vline(aes(xintercept = c(hdi(dat))))

So I get this:

But what I really want is something like this:

Is there a simple way to achieve this with ggplot2::qplot?

Recommended answer

You can do this with the ggridges package. The trick is that we can provide HDInterval::hdi as the quantile function to geom_density_ridges_gradient(), and then fill by the "quantiles" it generates. Here the "quantiles" are simply group numbers: one for the lower tail, one for the middle (the HDI), and one for the upper tail.
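To see what those three groups correspond to, it can help to look at what hdi() returns on its own. The following is only a rough sketch: the variable names bounds and region and the use of cut() are illustrative assumptions, not a description of how ggridges assigns the groups internally.

```r
library(HDInterval)

## same data as in the question
set.seed(789)
dat <- rnorm(1000)

## hdi() returns two numbers: the lower and upper limit of the 95% HDI (the default)
bounds <- hdi(dat)
bounds

## the two limits split the x axis into three regions, one per fill group
region <- cut(dat,
              breaks = c(-Inf, bounds, Inf),
              labels = c("lower tail", "HDI", "upper tail"))
table(region)
```

Roughly 95% of the points should fall into the middle region, which is the part of the curve that gets shaded.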

As a general point of advice, I would recommend against using qplot(). It is more likely to cause confusion, and putting a vector into a tibble is not a lot of effort.

library(tidyverse)
library(HDInterval)
library(ggridges)
#> 
#> Attaching package: 'ggridges'
#> The following object is masked from 'package:ggplot2':
#> 
#>     scale_discrete_manual

## create data vector
set.seed(789)
dat <- rnorm(1000)

df <- tibble(dat)

## plot density curve with ggplot() and shade the 95% hdi
## (after_stat() needs ggplot2 >= 3.3.0; older versions use stat(quantile))
ggplot(df, aes(x = dat, y = 0, fill = after_stat(quantile))) + 
  geom_density_ridges_gradient(quantile_lines = TRUE, quantile_fun = hdi, vline_linetype = 2) +
  scale_fill_manual(values = c("transparent", "lightblue", "transparent"), guide = "none")
#> Picking joint bandwidth of 0.227

Created on 2019-12-24 by the reprex package (v0.3.0)

The colors in scale_fill_manual() are in the order of the three groups, so if you, for example, only wanted to shade the left tail, you would write values = c("lightblue", "transparent", "transparent").
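Put together, the left-tail variant might look like the sketch below. It reuses the data from the answer, stores the plot in an object named p purely for illustration, and assumes ggplot2 3.3.0 or later for after_stat(); with older versions, stat(quantile) plays the same role.

```r
library(ggplot2)
library(ggridges)
library(HDInterval)

## same data as above, stored in a data frame
set.seed(789)
df <- data.frame(dat = rnorm(1000))

## only the first fill group (the lower tail) gets a visible color
p <- ggplot(df, aes(x = dat, y = 0, fill = after_stat(quantile))) +
  geom_density_ridges_gradient(quantile_lines = TRUE, quantile_fun = hdi, vline_linetype = 2) +
  scale_fill_manual(values = c("lightblue", "transparent", "transparent"), guide = "none")

p
```

Swapping the position of "lightblue" within values moves the shading to the middle or to the upper tail in the same way.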
