Showing major peaks in densities across facets using R
Question
I am trying to plot distributions/densities of data using ggplot with facets. What I have right now draws a red line at the mean of each facet, with the mean value printed as text. The mean is not meaningful here; I would like a similar plot in which the peak of each density is marked with an xintercept line and a text label.
The code I used for the means is this:
data <- read.table("sample.csv", header=F, sep=',')
colnames(data) <- c("frame", "val")
attach(data)
library(ggplot2)
library(grid)
library(plyr)
xdat <- ddply(data,"frame", transform, val_mean = signif(mean(val),3), med.x = signif(mean(val),3), med.y=signif(mean(density(val)$y),3))
ppi <- 500
png("sample.png", width=4*ppi, height=4*ppi, res=ppi)
hp <-ggplot(data=data, aes(x=val))+
geom_density() +
geom_vline(aes(xintercept=val_mean),xdat, color="red",linetype="dashed",size=1) +
theme_bw()
hp<-hp + facet_wrap (~ frame, ncol=2, scales="free_y") +
geom_text(data = xdat, aes(x=med.x,y=med.y,label=val_mean))
print(hp)
dev.off()
and the data used to plot this graph are:
data <- data.frame(
"frame"=c(rep("A",9), rep("B", 13), rep("C", 7)),
"val"=c(1, rep(2,4), 4, 5, 6, rep(1,6), 2, rep(3,7), 1, rep(4,6))
)
I know that there have been some posts where R has been used to find peaks in the values. But I wish to plot peaks in the densities and I am not able to find any solution for it (or maybe I missed it). Is it possible to calculate peaks on-the-fly in R and plot within different facets? Thank you very much in advance for your time and help!!
Recommended answer
I'm assuming you want to identify the single largest peak in each facet; this would be the mode of the distribution. If your distribution is multimodal, my answer will only identify the largest peak. An answer to another question explains that geom_density() uses the density() function with default arguments.
That being said, the following code should work for you:
library(ggplot2)
library(grid)
library(plyr)

data <- data.frame("frame"=c(rep("A",9), rep("B", 13), rep("C", 7)), "val"=c(1,rep(2,4),4,5,6,rep(1,6),2,rep(3,7),1,rep(4,6)))

# Return the location (x) and height (y) of the highest point of the
# kernel density estimate, i.e. the mode of the estimated distribution.
densMode <- function(x){
  td <- density(x)
  maxDens <- which.max(td$y)
  list(x=td$x[maxDens], y=td$y[maxDens])
}

# Add the per-frame mode to every row. val_mean keeps its original name
# but now holds the mode rather than the mean.
xdat <- ddply(data,"frame", transform, val_mean = signif(densMode(val)$x,3), med.x = signif(densMode(val)$x,3), med.y=signif(densMode(val)$y,3))

hp <- ggplot(data=data, aes(x=val)) +
  geom_density() +
  geom_vline(aes(xintercept=val_mean),xdat, color="red",linetype="dashed",size=1) +
  theme_bw()

hp <- hp +
  facet_wrap(~ frame, ncol=2, scales="free_y") +
  geom_text(data = xdat, aes(x=med.x,y=med.y,label=val_mean))
hp
The only lines I changed were those that determined how the graph was created (I didn't use png()), inserting the densMode() function, and using densMode() in the definition of xdat. I also created a data.frame based on your example data (which I've submitted as an edit to your question, for the convenience of others who may want to answer).
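The code produces a faceted density plot with a dashed red line and a text label at the largest peak of each facet.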
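As noted above, densMode() only reports the largest peak. If a facet is multimodal and the secondary peaks matter too, one possible extension is to collect every local maximum of the density estimate. The sketch below is not part of the original answer: densPeaks() and its min_rel_height threshold are hypothetical names introduced for illustration, and the 10% cut-off for ignoring minor bumps is an arbitrary assumption. It uses density() with default arguments, as the answer does, and only base R for the peak search; the plotting step runs only if ggplot2 is installed.

```r
# Sample data from the question.
data <- data.frame(
  frame = c(rep("A", 9), rep("B", 13), rep("C", 7)),
  val   = c(1, rep(2, 4), 4, 5, 6, rep(1, 6), 2, rep(3, 7), 1, rep(4, 6))
)

# Hypothetical helper (not from the original answer): return all local
# maxima of the kernel density estimate, tallest first.
# min_rel_height is an assumed threshold: peaks lower than this fraction
# of the overall maximum density are dropped as noise.
densPeaks <- function(x, min_rel_height = 0.1) {
  td <- density(x)                 # default arguments, as in densMode()
  y  <- td$y
  i  <- 2:(length(y) - 1)          # interior grid points only
  # A grid point counts as a peak if it is strictly higher than its left
  # neighbour and at least as high as its right neighbour.
  idx <- i[y[i] > y[i - 1] & y[i] >= y[i + 1]]
  peaks <- data.frame(x = td$x[idx], y = y[idx])
  peaks <- peaks[peaks$y >= min_rel_height * max(y), , drop = FALSE]
  peaks <- peaks[order(-peaks$y), , drop = FALSE]
  rownames(peaks) <- NULL
  peaks
}

# One row per peak per frame, so each facet gets its own lines and labels.
peaks <- do.call(rbind, lapply(split(data, data$frame), function(d) {
  p <- densPeaks(d$val)
  p$frame <- rep(d$frame[1], nrow(p))
  p
}))

# Plot only when ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  hp <- ggplot(data, aes(x = val)) +
    geom_density() +
    geom_vline(data = peaks, aes(xintercept = x),
               color = "red", linetype = "dashed") +
    geom_text(data = peaks, aes(x = x, y = y, label = signif(x, 3))) +
    facet_wrap(~ frame, ncol = 2, scales = "free_y") +
    theme_bw()
  print(hp)
}
```

Because peaks holds one row per peak rather than one row per observation, each label is drawn once. Raising min_rel_height suppresses more of the small bumps; setting it to 0 keeps every local maximum.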