Displaying stat_summary within each group, by aesthetic mapping, in ggplot


Problem description

I am close to plotting what I want, but haven't quite figured out whether stat_summary is the right way to produce the desired plot.

The desired output is a scatter plot with a median line for each year, within each category. For example, in the plot below, I want a median line for the values in 1999, 2000, and 2001 in Category A (i.e., 3 lines, distinguished by color) and then the same in Category B (so 6 median lines in total).

I looked here, but this didn't seem to get at what I wanted since it was using facets.

My plot looks like it is drawing a line between the medians of each category. Can stat_summary just draw a median line within each category, or do I need a different approach (such as calculating the medians and adding each line to the plot by category)?

Simple reproducible example

library(tidyverse)
library(lubridate)

# Sample data (seed set so the random draws are reproducible)
set.seed(2017)
Date     <- sort(sample(seq(as.Date("1999-01-01"), as.Date("2002-01-01"), by = "day"), 500))
Category <- rep(c("A", "B"), 250)
Value    <- sample(100:500, 500, replace = TRUE)

# Create data frame
mydata   <- data.frame(Date, Category, Value)

# Plot by category and color by year
p <- ggplot(mydata, aes(x = Category, y = Value,
                        color = factor(year(Date))
                        )
            ) + 
  geom_jitter() 
p


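# Now add median values of each year for each group
# Note: group = 1 puts every point into a single group, overriding the
# default grouping by Category and year.
# fun.y is deprecated in favour of fun as of ggplot2 3.3.0.
p <- p +
  stat_summary(fun.y = median,
               geom  = "line",
               aes(color = factor(year(Date))),
               group = 1,
               size = 2
               )
p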

Recommended answer

Here is another possibility using geom_errorbar (instead of stat_summary):

library(tidyverse)
library(lubridate)

# Sample data
set.seed(2017)
Date     <- sort(sample(seq(as.Date("1999-01-01"), as.Date("2002-01-01"), by = "day"), 500))
Category <- rep(c("A", "B"), 250)
Value    <- sample(100:500, 500, replace = TRUE)
mydata   <- data.frame(Date, Category, Value)

mydata %>%
    mutate(colour = factor(year(Date))) %>%
    group_by(Category, year(Date)) %>%
    # Attach each Category/year group's median to every row of that group
    mutate(Median = median(Value)) %>%
    ggplot(aes(Category, Value, colour = colour)) +
    geom_jitter() +
    # ymin == ymax collapses each error bar into a horizontal line at the median
    geom_errorbar(
        aes(ymin = Median, ymax = Median))

Explanation: We pre-compute the median of Value per Category per year(Date) and draw the median lines using geom_errorbar, with ymin and ymax both set to the median.

In response to your comment: if you want to use summarise to pre-compute the medians, you can store them in a separate data.frame:

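# One row per Category/Year combination, holding that group's median
df <- mydata %>%
    mutate(Year = as.factor(year(Date))) %>%
    group_by(Category, Year) %>%
    summarise(Median = median(Value))

ggplot(mydata, aes(Category, Value, colour = factor(year(Date)))) +
    geom_jitter() +
    # df has no Value or Date column, so y and colour inherited from the
    # main plot must be overridden here
    geom_errorbar(
        data = df,
        aes(x = Category, y = Median, colour = Year, ymin = Median, ymax = Median))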

It's not quite as clean as the first solution (since you need to specify all aesthetics in geom_errorbar), but the resulting plot is the same.
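
For comparison, the question asked whether stat_summary can do this on its own. The following is a minimal sketch of one way it might, and it is not part of the original answer. It assumes ggplot2 3.3.0 or later, where `fun`, `fun.min`, and `fun.max` replace the deprecated `fun.y` family. Leaving out `group = 1` lets ggplot2 fall back to its default grouping (the combination of the discrete variables, here Category and year), and setting all three summary functions to `median` yields a zero-height error bar at each group's median.

```r
library(ggplot2)
library(lubridate)

# Same sample data as in the answer above
set.seed(2017)
Date     <- sort(sample(seq(as.Date("1999-01-01"), as.Date("2002-01-01"), by = "day"), 500))
Category <- rep(c("A", "B"), 250)
Value    <- sample(100:500, 500, replace = TRUE)
mydata   <- data.frame(Date, Category, Value)

p <- ggplot(mydata, aes(x = Category, y = Value, colour = factor(year(Date)))) +
  geom_jitter() +
  # No group = 1 here: each Category/year combination is summarised separately.
  # y, ymin and ymax all equal the median, so each bar is a flat line.
  stat_summary(fun = median, fun.min = median, fun.max = median,
               geom = "errorbar")
p
```

Note that the sampled date range ends on 2002-01-01, so a 2002 group can appear alongside 1999 to 2001 if that day happens to be drawn.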
