Multi-group histogram with group-specific frequencies
First off, I've already read the following thread: ggplot2 - Multi-group histogram with in-group proportions rather than frequency
I followed the ddply suggestion and it didn't seem to work for my data. Logically the code should work perfectly on my dataset and I have no idea what I'm doing wrong.
Overall: I'd like to make a histogram (I'm learning ggplot) that displays the genotype frequency in each of my study groups.
Something like this:
Here's a mock data set that mirrors my own:
df<-data.frame(ID=1:60,
Genotypes=sample(c("CG", "CC", "GG"), 60, replace=T),
Study_Group=sample(c("Control", "Pathology1", "pathology2"), 60, replace=T))
I've tried variants of p + geom_bar(aes(aes(y = ..count../sum(..count..))
but R returns "cannot find 'count' object" or something to that effect.
I also tried:
df.new<-ddply(df,.(Study_Group),summarise,
prop=prop.table(table(df$Genotype)),
Genotype=names(table(df$Genotype)))
And I believe there was an error with the summarise function, but to be honest, I have no idea what I'm doing.
Is the problem simply my comprehension of the solution or is it something inherently different in my data set?
Thanks for the help.
Give this a try. Here I am using dplyr, a package that contains updated versions of the ddply-type functions from plyr. One thing: I am not sure whether you want your x-axis to be the Study_Group values or the Genotypes. Your question states that you want the frequency of each genotype within each group, but your graph has Genotypes on the x-axis. The solution follows the stated desire, not the plot. However, switching to Genotypes on the x-axis is simple; I'll note in the code comments where and what to change.
library(dplyr)
library(ggplot2)
df2 <- df %>%
count(Study_Group, Genotypes) %>%
group_by(Study_Group) %>% #change to `group_by(Genotypes) %>%` for alternative approach
mutate(prop = n / sum(n))
ggplot(data = df2, aes(Study_Group, prop, fill = Genotypes)) +
geom_bar(stat = "identity", position = "dodge")
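As a quick sanity check, here is a minimal self-contained sketch that rebuilds the mock data and confirms the proportions add up to 1 within whichever variable you group by. The fixed seed and the names df_alt, group_totals, base_props and p_alt are assumptions added for illustration; they are not part of the original question or answer. It also spells out the alternative mentioned in the code comment, with Genotypes on the x-axis.

```r
library(dplyr)
library(ggplot2)

set.seed(42)  # assumption: fixed seed, only so the check is reproducible
df <- data.frame(
  ID = 1:60,
  Genotypes = sample(c("CG", "CC", "GG"), 60, replace = TRUE),
  Study_Group = sample(c("Control", "Pathology1", "pathology2"), 60, replace = TRUE)
)

# Proportion of each genotype within each study group (as in the answer)
df2 <- df %>%
  count(Study_Group, Genotypes) %>%
  group_by(Study_Group) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# Each study group's proportions should sum to 1
group_totals <- df2 %>%
  group_by(Study_Group) %>%
  summarise(total = sum(prop))
print(group_totals)

# Base R cross-check: row-wise proportions of the contingency table
base_props <- prop.table(table(df$Study_Group, df$Genotypes), margin = 1)
print(round(base_props, 3))

# Alternative: proportion of each study group within each genotype,
# plotted with Genotypes on the x-axis
df_alt <- df %>%
  count(Study_Group, Genotypes) %>%
  group_by(Genotypes) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

p_alt <- ggplot(data = df_alt, aes(Genotypes, prop, fill = Study_Group)) +
  geom_bar(stat = "identity", position = "dodge")
```

Printing p_alt draws the alternative chart. Note that the two groupings answer different questions: the first gives genotype frequency within each study group, the second gives the study-group share within each genotype.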