Normal density curves on multiple histograms on the same plot

Question

I have a dataframe, for example, as this:

sample1 <- seq(120, 197, length.out = 60)
sample2 <- seq(113, 167, length.out = 60)
sample3 <- seq(90, 180, length.out = 60)
sample4 <- seq(100, 160, length.out = 60)

df <- as.data.frame(cbind(sample1, sample2, sample3, sample4))

I now need to create histograms for these four variables such that all of them share the same y-axis, and I also need to overlay a normal density curve on each of these histograms. facet_wrap() will be fine as long as the y-axis is the same.

Earlier today, I thought I had this issue resolved with the guidance of an expert in the forum, but I realised later that the solution just overlaid a kernel density curve, not a curve from a normal distribution. I have tried a number of options with ggplot as well as base plotting functions, but what seems to be a simple task for a single variable isn't quite achievable with multiple variables.

Any thoughts about how to tackle this?

Thanks

Recommended answer

Here is a solution using tidyverse:

library(tidyverse)

# example data
sample1 <- seq(120, 197, length.out =  60)
sample2 <- seq(113, 167, length.out = 60)
sample3 <- seq(90, 180, length.out = 60)
sample4 <- seq(100, 160, length.out = 60)

df <- data.frame(sample1, sample2, sample3, sample4)

set.seed(1)  # make the simulated draws reproducible

# reshape to long format, nest by sample, and add a list-column of values
# simulated from a normal distribution with each sample's mean and sd
df2 <- df %>%
  pivot_longer(everything(), names_to = "key", values_to = "value") %>%  # reshape data
  group_nest(key) %>%                                                    # one row per key (i.e. sample)
  mutate(norm = map(data, ~ rnorm(10000, mean(.x$value), sd(.x$value)))) # 10K draws from the matching normal

ggplot() +
  # histogram of the original observations (nested column `data`), on the density scale
  geom_histogram(data = df2 %>% unnest(data),
                 aes(value, after_stat(density), fill = key), bins = 30, alpha = 0.3) +
  # density of the simulated normal observations (nested column `norm`)
  geom_density(data = df2 %>% unnest(norm), aes(norm, col = key)) +
  # one panel per sample; facet_wrap() uses a shared y-axis by default
  facet_wrap(~key)
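
The curves above are kernel density estimates of simulated draws, so they only approximate the normal curve and change slightly from run to run. As a possible alternative (a minimal sketch, not part of the original answer), the normal density can be evaluated directly with dnorm() on a grid for each sample and drawn with geom_line(). The names `long` and `normal_curves`, the grid of 200 points, the range of mean ± 3 sd, and `bins = 30` are all illustrative assumptions.

```r
library(ggplot2)

# example data from the question
df <- data.frame(
  sample1 = seq(120, 197, length.out = 60),
  sample2 = seq(113, 167, length.out = 60),
  sample3 = seq(90, 180, length.out = 60),
  sample4 = seq(100, 160, length.out = 60)
)

# long format: one row per observation, `key` names the sample
long <- stack(df)
names(long) <- c("value", "key")

# for each sample, evaluate the fitted normal density on a grid
# spanning mean +/- 3 sd (an illustrative choice of range)
normal_curves <- do.call(rbind, lapply(split(long, long$key), function(d) {
  m <- mean(d$value)
  s <- sd(d$value)
  x <- seq(m - 3 * s, m + 3 * s, length.out = 200)
  data.frame(key = d$key[1], x = x, density = dnorm(x, m, s))
}))

p <- ggplot(long, aes(value)) +
  # histogram on the density scale so it is comparable with dnorm()
  geom_histogram(aes(y = after_stat(density), fill = key), bins = 30, alpha = 0.3) +
  # exact normal curve for each sample
  geom_line(data = normal_curves, aes(x, density, colour = key)) +
  # default scales = "fixed" keeps a common y-axis across panels
  facet_wrap(~key)

p
```

Because both layers carry a `key` column, facet_wrap() sends each curve to the panel of its own sample, and no random simulation is involved.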
