How can I overlay by-group plot elements to ggplot2 facets?

Problem description

My question has to do with faceting. In my example code below, I look at some faceted scatterplots, then try to overlay information (in this case, mean lines) on a per-facet basis.

The tl;dr version is that my attempts fail. Either my added mean lines compute across all data (disrespecting the facet variable), or I try to write a formula and R throws an error, followed by incisive and particularly disparaging comments about my mother.

library(ggplot2)

# Let's pretend we're exploring the relationship between a car's weight and its
# horsepower, using some sample data
p <- ggplot()
p <- p + geom_point(aes(x = wt, y = hp), data = mtcars)
print(p)

# Hmm. A quick check of the data reveals that car weights can differ wildly, by almost
# a thousand pounds.
head(mtcars)

# Does the difference matter? It might, especially if most 8-cylinder cars are heavy,
# and most 4-cylinder cars are light. ColorBrewer to the rescue!
p <- p + aes(color = factor(cyl))
p <- p + scale_color_brewer(palette = "Set1")
print(p)

# At this point, what would be great is if we could more strongly visually separate
# the cars out by their engine blocks.
p <- p + facet_grid(~ cyl)
print(p)

# Ah! Now we can see (given the fixed scales) that the 4-cylinder cars flock to the
# left on weight measures, while the 8-cylinder cars flock right. But you know what
# would be REALLY awesome? If we could visually compare the means of the car groups.
p.with.means <- p + geom_hline(
                      aes(yintercept = mean(hp)),
                      data = mtcars
         )
print(p.with.means)

# Wait, that's not right. That's not right at all. The green (8-cylinder) cars are all above the
# average for their group. Are they somehow made in an auto plant in Lake Wobegon, MN? Obviously,
# I meant to draw mean lines factored by GROUP. Except also obviously, since the code below will
# print an error, I don't know how.
p.with.non.lake.wobegon.means <- p + geom_hline(
                                       aes(yintercept = mean(hp) ~ cyl),
                                       data = mtcars
                                     )
print(p.with.non.lake.wobegon.means)

There must be some simple solution I'm missing.

Solution

You mean something like this:

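library(plyr)  # provides ddply() and summarise()
rs <- ddply(mtcars, .(cyl), summarise, mn = mean(hp))  # mean hp per cyl

# rs includes cyl, so each facet gets its own line
p + geom_hline(data = rs, aes(yintercept = mn))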
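If you would rather not load plyr, the same per-group summary can probably be built with base R's aggregate(). The sketch below is an illustration under that assumption, not the approach used above; the object names p and p_means are chosen here for the example, and the plot is rebuilt so the block runs on its own.

```r
library(ggplot2)

# Mean horsepower per cylinder count, computed with base R (no plyr needed)
rs <- aggregate(hp ~ cyl, data = mtcars, FUN = mean)
names(rs)[names(rs) == "hp"] <- "mn"

# Rebuild the faceted scatterplot so this block is self-contained
p <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  geom_point() +
  facet_grid(~ cyl)

# rs has a cyl column, so each row is drawn only in the matching facet
p_means <- p + geom_hline(data = rs, aes(yintercept = mn))
print(p_means)
```

The key point is that the summary data frame must contain the faceting variable (cyl here); without it, every line would be drawn in every facet.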

It might be possible to do this within the ggplot call using stat_*, but I'd have to go back and tinker a bit. But generally if I'm adding summaries to a faceted plot I calculate the summaries separately and then add them with their own geom.
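For what it's worth, one way the stat_* route might look is sketched below with stat_summary(). It assumes a reasonably recent ggplot2 (the fun argument and after_stat() arrived in 3.3.0), and the constant x = 3 is an arbitrary value picked for this example, not something from the question.

```r
library(ggplot2)

p_stat <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  geom_point() +
  facet_grid(~ cyl) +
  # x = 3 is an arbitrary constant inside the range of wt: it makes
  # stat_summary() treat each facet's points as a single x position, so
  # `fun` returns one mean per facet; after_stat(y) passes it to the hline
  stat_summary(
    aes(x = 3, yintercept = after_stat(y)),
    fun = mean,
    geom = "hline",
    colour = "black"
  )
print(p_stat)
```

Because the statistic is computed per panel, no separate summary data frame is needed, at the cost of a less obvious layer definition.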

EDIT

Just a few expanded notes on your original attempt. Generally it's a good idea to put the aesthetics that persist throughout the plot in the aes() call inside ggplot(), and then specify different data sets or aesthetics only in those geoms that differ from the 'base' plot. Then you don't need to keep specifying data = ... in each geom.
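As a rough sketch of that layout, the question's plot could be restructured as follows. The names p_base, p_full and group_means are hypothetical, introduced only for this illustration.

```r
library(ggplot2)

# Data and shared aesthetics are declared once, in ggplot()
p_base <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  geom_point() +                          # inherits data and aes
  scale_colour_brewer(palette = "Set1") +
  facet_grid(~ cyl)

# Only the layer that differs from the base plot gets its own data and aes
# (group_means is a hypothetical per-cylinder summary table)
group_means <- data.frame(
  cyl = sort(unique(mtcars$cyl)),
  mn  = as.numeric(tapply(mtcars$hp, mtcars$cyl, mean))
)
p_full <- p_base + geom_hline(data = group_means, aes(yintercept = mn))
print(p_full)
```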

Finally, I came up with a kind of clever use of geom_smooth to do something similar to what you're asking:

p <- ggplot(data = mtcars,aes(x = wt, y = hp, colour = factor(cyl))) + 
    facet_grid(~cyl) + 
    geom_point() + 
    geom_smooth(se=FALSE,method="lm",formula=y~1,colour="black")

The horizontal line (i.e. constant regression eqn) will only extend to the limits of the data in each facet, but it skips the separate data summary step.
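If the shortened lines are a problem, geom_smooth() has a fullrange argument that should extend the fit across the whole x scale rather than just the data in each facet. A minimal sketch, assuming fixed x scales across facets as in the question (p_wide is a name made up for this example):

```r
library(ggplot2)

p_wide <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  facet_grid(~ cyl) +
  geom_point() +
  # formula = y ~ 1 fits an intercept-only model, i.e. the group mean;
  # fullrange = TRUE asks for predictions over the whole x scale
  geom_smooth(se = FALSE, method = "lm", formula = y ~ 1,
              colour = "black", fullrange = TRUE)
print(p_wide)
```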
