Formatting data for faceting in ggplot2 (R)

Question

I'm trying to accomplish something in ggplot2 that shouldn't be a big deal, but somehow it has me stuck.

I need to import this xls file: https://dl.dropboxusercontent.com/u/73950/mydata.xls and format it so I can display 4 faceted line graphs (titled "m", "mNC", "d", "aSPL") like in this model that I drew myself (it shows only the "mNC" facet, but the others should be built on the same model):

Now, the trick, I think, is that the columns are named as such: "PE-GED-nMC", "GA-GED-nMC", "N1-GED-nMC", "N2-GED-nMC", etc. and I need to somehow tell R to arrange data according to parts of these column names. I think...

Does anyone have a clue about how to get from mydata.xls to 4 faceted figures?

Cheers!

Solution

I think this should deliver what you are looking for - applying melt as suggested by Matt. The grepl function should help you parse the text labels. I did not know what the horizontal and the vertical categories really are, so I just gave them generic names.

Clearly the nested ifelse constructs are somewhat cumbersome and may warrant a more elegant solution in a more complex setting.
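One possibly less cumbersome option is to split each column name on its separator instead of testing for every label in turn. The following is only a sketch on made-up names: the "_" separator and the horizontal_vertical_category order of the parts are assumptions, not something confirmed from mydata.xls, and the names `variable`, `parts` and `labels` are hypothetical.

```r
# Hypothetical column names; the "_" separator and the order of the parts
# (horizontal_vertical_category) are assumptions for illustration only
variable <- c("PE_GED_mNC", "GA_RAN_aSPL", "N1_EIG_d", "N2_BET_m")

# Split each name into its three parts and bind them into a 3-column matrix
parts <- do.call(rbind, strsplit(variable, "_", fixed = TRUE))

# One row per column name, one column per parsed part
labels <- data.frame(
  variable   = variable,
  horizontal = parts[, 1],
  vertical   = parts[, 2],
  category   = parts[, 3],
  stringsAsFactors = FALSE
)
labels
```

With melted data, the variable column is typically a factor, so it would need as.character() before strsplit(). This approach also only works if every column name has the same number of parts.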

library(reshape2)  # melt()
library(ggplot2)   # ggplot(), geom_line(), facet_grid()

# assumes `data` is the data frame already imported from mydata.xls,
# with the x-axis values in a column named "nbr"
# use melt to go from wide to long data
dataM = melt(data, id.vars = "nbr")

# parse labels to identify vertical category and fill the value correspondingly
dataM$vertical = ifelse(grepl("GED", dataM$variable), "GED",
                 ifelse(grepl("RAN", dataM$variable), "RAN",
                 ifelse(grepl("EIG", dataM$variable), "EIG", "BET")))
# parse labels to identify horizontal category and fill the value correspondingly
dataM$horizontal = ifelse(grepl("PE", dataM$variable), "PE",
                   ifelse(grepl("GA", dataM$variable), "GA",
                   ifelse(grepl("N1", dataM$variable), "N1", "N2")))
# parse label to identify category ("m" is the fallback when nothing else matches)
dataM$category = ifelse(grepl("mNC", dataM$variable), "mNC",
                 ifelse(grepl("aSPL", dataM$variable), "aSPL",
                 ifelse(grepl("_d", dataM$variable), "d", "m")))

# create ggplot objects from subsets of the data, one per category
p1 = ggplot(dataM[dataM$category=="mNC",],aes(x=nbr,y=value))
p1 = p1 + geom_line()
# facet_grid creates the panels that you are looking for (usage is vertical_categories ~ horizontal_categories)
p1 = p1 + facet_grid(vertical~horizontal)
p1

p2 = ggplot(dataM[dataM$category=="aSPL",],aes(x=nbr,y=value))
p2 = p2 + geom_line()
p2 = p2 + facet_grid(vertical~horizontal)
p2

p3 = ggplot(dataM[dataM$category=="d",],aes(x=nbr,y=value))
p3 = p3 + geom_line()
p3 = p3 + facet_grid(vertical~horizontal)
p3

p4 = ggplot(dataM[dataM$category=="m",],aes(x=nbr,y=value))
p4 = p4 + geom_line()
p4 = p4 + facet_grid(vertical~horizontal)
p4
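
The four plot blocks above differ only in the category they subset, so they could be folded into one helper. This is a minimal sketch, not part of the original answer: `plot_category` is a hypothetical name, and the toy `dataM` built here (with made-up values) only stands in for the melted data so that the example runs on its own.

```r
library(ggplot2)

# Toy long-format data standing in for dataM (all values are made up)
dataM <- expand.grid(
  nbr        = 1:5,
  vertical   = c("GED", "RAN"),
  horizontal = c("PE", "GA"),
  category   = c("m", "mNC", "d", "aSPL"),
  stringsAsFactors = FALSE
)
dataM$value <- seq_len(nrow(dataM))

# Hypothetical helper: one faceted line plot for a single category,
# titled with the category name
plot_category <- function(df, cat) {
  ggplot(df[df$category == cat, ], aes(x = nbr, y = value)) +
    geom_line() +
    facet_grid(vertical ~ horizontal) +
    ggtitle(cat)
}

categories <- c("m", "mNC", "d", "aSPL")
plots <- lapply(categories, function(cat) plot_category(dataM, cat))
names(plots) <- categories

plots[["mNC"]]  # printing a plot object draws it
```

Keeping the plots in a named list makes it easy to draw or save each one by its title, for example plots[["aSPL"]].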
