How to find the intersection of two densities with ggplot2 in R


Problem description

How do I find the intersection of two density plots created with ggplot2?

A sample from the data frame named combined:

           futureChange direction
2009-10-26    0.9980446      long
2008-04-28    1.0277389  not long
2012-07-09    1.0302413  not long
2010-11-15    1.0017247  not long

I create the density plot using this code.

ggplot(combined, aes(futureChange, fill = direction)) +
  geom_density(alpha = 0.2) +
  ggtitle(paste(symbol, "Long SB Frequency", sep = " "))

I want to find where the pink density line intersects with the blue density line.

I saw other posts that mentioned the intersect function, but I can't figure out how to use it with a density ggplot2 since I don't have the density vectors.

Solution

The stat_density function in ggplot2 uses R's density function. Using the density function will give us explicit values for the density estimation which we can use to find the intersection point (I generate data here because the given data isn't enough to perform density calculation):

library(ggplot2)

# Simulate two groups, since the sample above is too small for a density estimate
set.seed(10)
N <- 100
combined <- data.frame(futureChange = c(rnorm(N, mean = -1), rnorm(N, mean = 1)),
                       direction = rep(c("long", "not long"), each = N))

# Evaluate both densities on the same grid so their y values are comparable
lower.limit <- min(combined$futureChange)
upper.limit <- max(combined$futureChange)
long.density <- density(subset(combined, direction == "long")$futureChange,
                        from = lower.limit, to = upper.limit, n = 2^10)
not.long.density <- density(subset(combined, direction == "not long")$futureChange,
                            from = lower.limit, to = upper.limit, n = 2^10)

# A sign change in the difference marks a crossing of the two curves
density.difference <- long.density$y - not.long.density$y
intersection.point <- long.density$x[which(diff(density.difference > 0) != 0) + 1]

# Mark each crossing on the plot
ggplot(combined, aes(futureChange, fill = direction)) + geom_density(alpha = 0.2) +
  geom_vline(xintercept = intersection.point, color = "red")

Taking this step by step, we first compute the limits over which the density for each group should be calculated (lower.limit and upper.limit). We do this because we need these ranges to be the same for both density calculations so that we can compare them later. Additionally, we specify the number of points over which the density is calculated with the n argument in the density function (if you want more accurate results, increase this).
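To illustrate why the shared limits matter, here is a minimal base R sketch. The variable names (long.values, not.long.values, d1, d2, grid.step) are made up for this illustration and are not part of the answer's code. It checks that passing the same from, to and n to density() yields the same x grid for both groups, and it computes the grid spacing, which is roughly the precision with which a crossing can be located.

```r
set.seed(10)
long.values <- rnorm(100, mean = -1)      # hypothetical stand-in for the "long" group
not.long.values <- rnorm(100, mean = 1)   # hypothetical stand-in for the "not long" group

# Shared evaluation range and number of grid points
lower.limit <- min(long.values, not.long.values)
upper.limit <- max(long.values, not.long.values)
n.points <- 2^10

d1 <- density(long.values, from = lower.limit, to = upper.limit, n = n.points)
d2 <- density(not.long.values, from = lower.limit, to = upper.limit, n = n.points)

# Both estimates are evaluated at the same x values, so d1$y - d2$y is meaningful
same.grid <- identical(d1$x, d2$x)

# Distance between neighbouring grid points; a larger n makes this smaller
grid.step <- (upper.limit - lower.limit) / (n.points - 1)
```

If from and to are left out, each call to density() chooses its own range from its own data and bandwidth, so the two y vectors would generally not line up point for point.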

Next, we calculate the densities for each group in the data. To find the intersection, we take the difference of the two densities and look for the places where it switches from positive to negative or vice versa. The command which(diff(density.difference > 0) != 0) + 1 gives us the indices at which these switches occur. We add one because diff returns a vector one element shorter than its input: element i of the result describes the change between grid points i and i + 1, so adding one selects the first grid point after each switch. We then get the location of each intersection by taking the corresponding value in long.density$x (or not.long.density$x, since those are the same by construction). If the curves cross more than once, intersection.point contains one value per crossing.
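The sign-change detection can be followed by hand on a toy vector. The sketch below uses made-up values (x, d) in place of the real density difference. It also adds a linear interpolation step between the two grid points around each switch. That refinement is not part of the answer above; it is one optional way to estimate a crossing between grid points instead of snapping to a grid point.

```r
# Toy grid and toy "density difference" (hypothetical values, for illustration only)
x <- seq(0, 1, length.out = 8)
d <- c(0.3, 0.2, 0.1, -0.1, -0.2, -0.1, 0.2, 0.4)

# d > 0 is TRUE TRUE TRUE FALSE FALSE FALSE TRUE TRUE; diff() is non-zero where it flips
idx <- which(diff(d > 0) != 0) + 1   # first grid point after each sign change: 4 and 7

# Optional refinement: assume d is linear between the two neighbouring grid points
x0 <- x[idx - 1]; x1 <- x[idx]
d0 <- d[idx - 1]; d1 <- d[idx]
crossings <- x0 - d0 * (x1 - x0) / (d1 - d0)
```

With the real data, x would be long.density$x and d would be density.difference. With n = 2^10 grid points the interpolated value and the grid value differ by less than one grid step, so the refinement mainly matters for coarse grids.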
