Kernel density plot bandwidth in ggplot2 with `facet_wrap`


Problem description


I would like to use stat_density() and facet_wrap() in the ggplot2 package to create kernel density plots for different groupings, but I want to make sure that I use the same bandwidth for every plot. Can I be sure that stat_density() uses the same bandwidth for every plot?

For example, using diamonds:

library(ggplot2)    
ggplot(diamonds, aes(x = carat)) + 
  stat_density() + 
  facet_wrap(~ cut) + 
  scale_x_log()

The documentation shows that I can use adjust to adjust the automatic bandwidth, but this just applies a multiplier and returns me to the original question. stat_density() also has a ... option, but I haven't been able to pass through the density() option bw, like this:

ggplot(diamonds, aes(x = carat)) + 
  stat_density(bw = 1) + 
  facet_wrap(~ cut) + 
  scale_x_log()

So, if stat_density() isn't using the same bandwidth across all facets, is there a way that I can force this? I tried a ddply() solution with transform() and density(), but this fails because density() doesn't necessarily return the same number of x and y values as the input. Any ideas? Thanks!

Edit: It looks like ggplot2 assigns an optimal bandwidth to each facet (and it looks like @Ramnath and DiNardo, Fortin, and Lemieux, Econometrica 1996, agree with this approach), not the constant bandwidth I was seeking. But if I did want a constant bandwidth across all facets, my attempt below fails.

library(plyr)  # provides ddply() and .()

# Estimate the density of carat for one group with a fixed bandwidth
my.density <- function(x) {
  temp <- density(x$carat, bw = 0.5)
  return(data.frame(carat = temp$x, density = temp$y))
}
temp <- ddply(diamonds, .(cut), my.density)
ggplot(temp, aes(x = carat, y = density)) +
  geom_point() +
  facet_wrap(~ cut) +
  scale_x_log()
Warning messages:
1: In match.fun(get(".transform", .))(values) : NaNs produced
2: In match.fun(get(".transform", .))(values) : NaNs produced
3: In match.fun(get(".transform", .))(values) : NaNs produced
4: In match.fun(get(".transform", .))(values) : NaNs produced
5: In match.fun(get(".transform", .))(values) : NaNs produced
6: Removed 84 rows containing missing values (geom_point). 
7: Removed 113 rows containing missing values (geom_point). 
8: Removed 98 rows containing missing values (geom_point). 
9: Removed 98 rows containing missing values (geom_point). 
10: Removed 106 rows containing missing values (geom_point). 

Solution

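The warnings arise because my.density returns negative values for carat: by default density() evaluates the estimate on a grid that extends 3 * bw beyond the range of the data, so with bw = 0.5 the grid reaches below zero, and the log scale turns those values into NaN. A slight modification of your code would do the trick: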

ggplot(temp, aes(x = carat, y = density)) +
  geom_line(subset = .(carat > 0)) +
  facet_wrap(~ cut) +
  scale_x_log()
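The negative carat values can be reproduced with base R alone. The sketch below is only an illustration: it uses a synthetic, strictly positive sample as a stand-in for carat (an assumption, not the diamonds data) and compares the default evaluation grid of density() with one restricted through its from and to arguments.

```r
# Synthetic right-skewed, strictly positive sample (a stand-in for carat;
# this is an assumption for illustration, not the diamonds data).
set.seed(1)
x <- rlnorm(500, meanlog = -0.5, sdlog = 0.5)

bw <- 0.5

# Default grid: starts at min(x) - cut * bw, with cut = 3 by default.
d_default <- density(x, bw = bw)

# Restricted grid: evaluate the same estimate only over the observed range.
d_clipped <- density(x, bw = bw, from = min(x), to = max(x))

min(d_default$x)  # below zero, so a log transform would give NaN
min(d_clipped$x)  # equals min(x), which is positive
```

Restricting the grid only changes where the estimate is evaluated, not the estimate itself, so the bandwidth stays the same for every group. Filtering with carat > 0 afterwards, as in the answer, has the same effect on the plot.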

Hope this is useful
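For readers on a current ggplot2 release, where scale_x_log() and the subset argument of geom_line() are, as far as I know, no longer available, the same idea can be sketched without plyr. This is a minimal sketch under that assumption, not the original answer's code: bw_fixed and densities are names introduced here for illustration, and the per-group split uses base R instead of ddply().

```r
library(ggplot2)  # provides the diamonds data and the plotting functions

# One bandwidth shared by every facet (0.5 is the value used in the question).
bw_fixed <- 0.5

# Estimate one density per cut with the same bandwidth.
densities <- do.call(rbind, lapply(split(diamonds, diamonds$cut), function(d) {
  est <- density(d$carat, bw = bw_fixed)
  data.frame(cut = d$cut[1], carat = est$x, density = est$y)
}))

# Drop grid points at or below zero, which a log scale cannot display.
densities <- densities[densities$carat > 0, ]

p <- ggplot(densities, aes(x = carat, y = density)) +
  geom_line() +
  facet_wrap(~ cut) +
  scale_x_log10()
```

Printing p draws one panel per cut. Recent ggplot2 releases also document a bw argument on stat_density(), but with a log scale the stat works on the transformed values, so that bandwidth would be in log units. Precomputing the densities, as above, keeps the bandwidth on the original carat scale.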
