Order data to plot a barplot in ggplot2

Problem description

I need to build a barplot of my data, showing bacterial relative abundance in different samples (each column should sum to 1 in the complete dataset).

A subset of my data:

> mydata

Taxon                                  CD6         CD1         CD12
Actinomycetaceae;g__Actinomyces        0.031960309 0.066683743 0.045638509
Coriobacteriaceae;g__Atopobium         0.018691589 0.003244536 0.00447774
Corynebacteriaceae;g__Corynebacterium  0.001846083 0.006403689 0.000516662
Micrococcaceae;g__Rothia               0.001730703 0.000426913 0.001894429
Porphyromonadaceae;g__Porphyromonas    0.073497173 0.065915301 0.175406872

What I'd like to have is a bar for each sample (CD6, CD1, CD12), where the y values are the relative abundance of bacterial species (the Taxon column).

I think (but I'm not sure) my data format is not right to do the plot, since I don't have a variable to group by like in the examples I found...

ggplot(data) + geom_bar(aes(x=revision, y=added), stat="identity", fill="white", colour="black")

Is there a way to arrange my data so that it is valid input for this code? Or how can I modify the code? Thanks!

Solution

Do you want something like this?

# sample data
df <- read.table(header=TRUE, sep=" ", text="
Taxon CD6 CD1 CD12
Actinomycetaceae;g__Actinomyces 0.031960309 0.066683743 0.045638509
Coriobacteriaceae;g__Atopobium 0.018691589 0.003244536 0.00447774
Corynebacteriaceae;g__Corynebacterium 0.001846083 0.006403689 0.000516662
Micrococcaceae;g__Rothia 0.001730703 0.000426913 0.001894429
Porphyromonadaceae;g__Porphyromonas 0.073497173 0.065915301 0.175406872")

# convert wide data format to long format
library(reshape2)
df.long <- melt(df, id.vars="Taxon",
                measure.vars=grep("CD\\d+", names(df), value=TRUE),
                variable.name="sample",
                value.name="value")

# calculate proportions
library(plyr)
df.long <- ddply(df.long, .(sample), transform, value=value/sum(value))

# order samples by id     
df.long$sample <- reorder(df.long$sample, as.numeric(sub("CD", "", df.long$sample)))

# plot using ggplot
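library(ggplot2)
# number of distinct taxa; unique() works whether Taxon is a factor or a
# character vector (read.table no longer creates factors by default in R >= 4.0)
n.taxa <- length(unique(df$Taxon))
ggplot(df.long, aes(x=sample, y=value, fill=Taxon)) + 
  geom_bar(stat="identity") +
  # one manually generated hue per taxon
  scale_fill_manual(values=scales::hue_pal(h = c(0, 360) + 15,
                                           c = 100, 
                                           l = 65, 
                                           h.start = 0, 
                                           direction = 1)(n.taxa))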
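
Since reshape2 and plyr are both retired packages, the same three preparation steps (wide to long, per-sample proportions, ordering samples by numeric id) can probably be done with base R alone. The following is only a sketch of that idea, not the answer's original method: the names `sample_cols`, `stacked`, `df_long`, `ids` and `p` are made up for illustration, and `geom_col()` is used as the shorthand for `geom_bar(stat="identity")`.

```r
library(ggplot2)

# same sample data as above
df <- read.table(header = TRUE, text = "
Taxon CD6 CD1 CD12
Actinomycetaceae;g__Actinomyces 0.031960309 0.066683743 0.045638509
Coriobacteriaceae;g__Atopobium 0.018691589 0.003244536 0.00447774
Corynebacteriaceae;g__Corynebacterium 0.001846083 0.006403689 0.000516662
Micrococcaceae;g__Rothia 0.001730703 0.000426913 0.001894429
Porphyromonadaceae;g__Porphyromonas 0.073497173 0.065915301 0.175406872
")

# sample columns: "CD" followed by digits (assumed naming scheme)
sample_cols <- grep("^CD\\d+$", names(df), value = TRUE)

# wide to long: stack() concatenates the sample columns one after another
# into `values` and records the source column name in `ind`
stacked <- stack(df[sample_cols])
df_long <- data.frame(
  Taxon  = rep(df$Taxon, times = length(sample_cols)),
  sample = as.character(stacked$ind),
  value  = stacked$values
)

# rescale so that the values within each sample sum to 1
df_long$value <- ave(df_long$value, df_long$sample,
                     FUN = function(v) v / sum(v))

# order samples by the numeric part of their id (CD1, CD6, CD12)
ids <- unique(df_long$sample)
df_long$sample <- factor(df_long$sample,
                         levels = ids[order(as.numeric(sub("CD", "", ids)))])

# one stacked bar per sample, filled by taxon
p <- ggplot(df_long, aes(x = sample, y = value, fill = Taxon)) +
  geom_col() +
  labs(y = "Relative abundance")
print(p)
```

The key point for the original question is the long format: once every row holds one (Taxon, sample, value) triple, `sample` can be mapped to the x axis and `Taxon` to the fill, which is the grouping variable the wide table was missing.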
