Subset/filter in dplyr chain with ggplot2
Question
I'd like to make a slopegraph, along the lines (no pun intended) of this. Ideally, I'd like to do it all in a dplyr-style chain, but I hit a snag when I try to subset the data to add specific geom_text labels. Here's a toy example:
library(tidyverse)

# make tbl:
df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)

# plot:
df %>% filter(area == "Health") %>%
  ggplot() +
  geom_line(aes(x = as.factor(year), y = value,
                group = sub_area, color = sub_area), size = 2) +
  geom_point(aes(x = as.factor(year), y = value,
                 group = sub_area, color = sub_area), size = 2) +
  theme_minimal(base_size = 18) +
  geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"),
            aes(x = as.factor(year), y = value,
                color = sub_area, label = area), size = 6, hjust = 1)
But this gives me Error in filter_(.data, .dots = lazyeval::lazy_dots(...)) : object '.' not found. Using subset instead of dplyr::filter gives me a similar error. What I've found on SO/Google is this question, which addresses a slightly different problem.
What is the correct way to subset the data in a chain like this?
Edit: My reprex is a simplified example; in the real work I have one long chain. Mike's comment below works for the first case, but not the second.
Recommended answer
If you wrap the plotting code in {...}, you can use . to specify exactly where the previously calculated results are inserted:
library(tidyverse)

df <- tibble(
  area = rep(c("Health", "Education"), 6),
  sub_area = rep(c("Staff", "Projects", "Activities"), 4),
  year = c(rep(2016, 6), rep(2017, 6)),
  value = rep(c(15000, 12000, 18000), 4)
) %>% arrange(area)

df %>% filter(area == "Health") %>% {
  ggplot(.) +    # add . to specify to insert results here
    geom_line(aes(x = as.factor(year), y = value,
                  group = sub_area, color = sub_area), size = 2) +
    geom_point(aes(x = as.factor(year), y = value,
                   group = sub_area, color = sub_area), size = 2) +
    theme_minimal(base_size = 18) +
    geom_text(data = dplyr::filter(., year == 2016 & sub_area == "Activities"),    # and here
              aes(x = as.factor(year), y = value,
                  color = sub_area, label = area), size = 6, hjust = 1)
}
While that plot is probably not what you really want, at least it runs so you can edit it.
What's happening: Normally %>% passes the result of the left-hand side (LHS) as the first argument of the right-hand side (RHS). However, if you wrap the RHS in braces, %>% will only pass the result in wherever you explicitly put the dot placeholder (.). This formulation is useful for nested sub-pipelines or otherwise complicated calls (like a ggplot chain) that can't be sorted out just by redirecting with a single dot. See help('%>%', 'magrittr') for more details and options.
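As a minimal sketch of the difference, the snippet below uses a plain numeric vector instead of the plot, so it runs with only magrittr loaded. The names x, no_braces, with_braces, outside_pipe and inside_braces are made up for illustration. It also shows why the chain in the question fails: %>% binds more tightly than +, so anything added after the piped call with + is evaluated outside the pipe, where . does not exist.

```r
library(magrittr)

x <- c(1, 2, 3)

# Without braces the LHS is inserted as the first argument of the RHS call,
# even though `.` also appears in a nested call.
# This is equivalent to rep(x, times = length(x)).
no_braces <- x %>% rep(times = length(.))

# With braces nothing is inserted automatically; `.` is used only where written.
# This is equivalent to rep(sum(x), times = length(x)).
with_braces <- x %>% { rep(sum(.), times = length(.)) }

# `%>%` has higher precedence than `+`, so this parses as
# (x %>% sum()) + length(.), and `.` is looked up outside the pipe.
# The error is caught here so the script keeps running.
outside_pipe <- tryCatch(
  x %>% sum() + length(.),
  error = function(e) conditionMessage(e)
)

# Wrapping the whole `+` expression in braces keeps every `.` inside the pipe.
inside_braces <- x %>% { sum(.) + length(.) }

print(no_braces)      # 1 2 3 1 2 3 1 2 3
print(with_braces)    # 6 6 6
print(outside_pipe)   # the caught error message
print(inside_braces)  # 9
```

The ggplot chain in the answer follows the same pattern as inside_braces: the braces turn the whole ggplot(.) + ... expression into a single RHS, so the . inside geom_text() refers to the filtered data.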