Showing major peaks in densities across facets using R

Question

I am trying to plot the distributions/densities of my data using ggplot with facets. In what I have right now, a red line marks the mean in each facet and the mean value is printed next to it. The mean is not meaningful for these data, though. I would like a similar plot in which the peak of each density is marked with an xintercept line and a text label instead.
The code I used for the means is this:

data <- read.table("sample.csv", header=F, sep=',')
colnames(data) <- c("frame", "val")
attach(data)
library(ggplot2)
library(grid)

library(plyr)
xdat <- ddply(data,"frame", transform, val_mean = signif(mean(val),3), med.x = signif(mean(val),3), med.y=signif(mean(density(val)$y),3))

ppi <- 500
png("sample.png", width=4*ppi, height=4*ppi, res=ppi)

hp <-ggplot(data=data, aes(x=val))+
geom_density() +
geom_vline(aes(xintercept=val_mean),xdat, color="red",linetype="dashed",size=1) +
theme_bw()

hp<-hp + facet_wrap (~ frame, ncol=2, scales="free_y") +
geom_text(data = xdat, aes(x=med.x,y=med.y,label=val_mean))

print(hp)
dev.off()

The data used to plot this graph are:

data <- data.frame(
    "frame"=c(rep("A",9), rep("B", 13), rep("C", 7)), 
    "val"=c(1, rep(2,4), 4, 5, 6, rep(1,6), 2, rep(3,7), 1, rep(4,6))
    )

I know there are posts about using R to find peaks in raw values, but I want to mark peaks in the densities, and I have not been able to find a solution for that (or maybe I missed it). Is it possible to calculate the peaks on the fly in R and plot them within the different facets? Thank you very much in advance for your time and help!

Recommended answer

I'm assuming you want to identify the single largest peak in each facet, which is the mode of the distribution. If your distribution is multimodal, my answer will only identify the largest peak. An answer to another question explains that geom_density() uses the density() function with default arguments.
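If the secondary peaks of a multimodal density matter as well, one possible approach is to look for every point where the estimated density stops rising and starts falling. The following is only a rough sketch in base R: densPeaks() is a hypothetical helper name, not something from the original answer or from any package, and the number of peaks it reports depends on the bandwidth that density() chooses, so small bumps may be noise rather than real modes.

```r
# Hypothetical helper (an assumption, not part of the original answer):
# return every local maximum of the kernel density estimate, highest first.
densPeaks <- function(x, ...) {
    td <- density(x, ...)
    # At a local maximum the slope sign flips from +1 to -1, so the
    # difference of the signs is -2; adding 1 re-centres the index on the peak.
    idx <- which(diff(sign(diff(td$y))) == -2) + 1
    peaks <- data.frame(x = td$x[idx], y = td$y[idx])
    peaks[order(-peaks$y), ]
}

# Deterministic bimodal example: a larger group around 0, a smaller one around 6
bimodal <- c(qnorm(ppoints(200)), 6 + qnorm(ppoints(100)))
peaks <- densPeaks(bimodal)
print(peaks)
```

The first row of the result is the same point that a which.max() search would find; the remaining rows are the lower peaks. Extra arguments such as bw or adjust are passed through to density() if the default smoothing is too coarse or too fine.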

That being said, the following code should work for you:

library(ggplot2)
library(grid)
library(plyr)

data <- data.frame("frame"=c(rep("A",9), rep("B", 13), rep("C", 7)), "val"=c(1,rep(2,4),4,5,6,rep(1,6),2,rep(3,7),1,rep(4,6)))
attach(data)

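# Return the location (x) and height (y) of the highest point of the
# kernel density estimate of x, using density() with its default arguments
densMode <- function(x){
    td <- density(x)
    maxDens <- which.max(td$y)
    list(x=td$x[maxDens], y=td$y[maxDens])
}
# Attach each frame's peak location and height to every row of that frame
xdat <- ddply(data,"frame", transform, val_mean = signif(densMode(val)$x,3), med.x = signif(densMode(val)$x,3), med.y=signif(densMode(val)$y,3))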

hp <- ggplot(data=data, aes(x=val)) + 
    geom_density() + 
    geom_vline(aes(xintercept=val_mean),xdat, color="red",linetype="dashed",size=1) + 
    theme_bw()

hp<- hp + 
    facet_wrap (~ frame, ncol=2, scales="free_y") + 
    geom_text(data = xdat, aes(x=med.x,y=med.y,label=val_mean))

hp

The only changes I made were to how the graph is output (I didn't use png()), the addition of the densMode() function, and the use of densMode() in the definition of xdat. I also created a data.frame based on your example data (which I've submitted as an edit to your question, for the convenience of others who may want to answer).
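Because ddply(..., transform, ...) keeps one row per observation, the label data carries the same peak on every row of a frame, so geom_text() draws each label several times on top of itself. A possible variant, sketched here under my own assumptions, builds a table with exactly one row per frame using base R only. The name peakdat and the vector-returning version of densMode() are illustrative choices, not part of the original answer, and the plotting step runs only if ggplot2 is installed.

```r
# Sample data from the question
data <- data.frame(
    frame = c(rep("A", 9), rep("B", 13), rep("C", 7)),
    val = c(1, rep(2, 4), 4, 5, 6, rep(1, 6), 2, rep(3, 7), 1, rep(4, 6))
)

# Same idea as densMode() in the answer, but returning a named numeric vector
densMode <- function(x) {
    td <- density(x)
    i <- which.max(td$y)
    c(x = td$x[i], y = td$y[i])
}

# One row per frame: split val by frame, find each peak, stack the results.
# peakdat is a hypothetical name chosen for this sketch.
modes <- do.call(rbind, lapply(split(data$val, data$frame), densMode))
peakdat <- data.frame(frame = rownames(modes), modes,
                      label = signif(modes[, "x"], 3))
print(peakdat)

# Plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
    library(ggplot2)
    hp <- ggplot(data, aes(x = val)) +
        geom_density() +
        geom_vline(data = peakdat, aes(xintercept = x),
                   colour = "red", linetype = "dashed") +
        geom_text(data = peakdat, aes(x = x, y = y, label = label)) +
        facet_wrap(~ frame, ncol = 2, scales = "free_y") +
        theme_bw()
    print(hp)
}
```

This also calls density() once per frame rather than three times, which matters little for data of this size but keeps the peak location and height guaranteed to come from the same estimate.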

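[Figure, not preserved: output of the xdat/densMode() code, showing one density curve per frame (A, B, C) with a dashed red vertical line and a text label at each facet's highest peak.]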
