Plotting each column of a dataframe as one line using ggplot


Problem description

The whole dataset describes a module (or cluster if you prefer).

In order to reproduce the example, the dataset is available at: https://www.dropbox.com/s/y1905suwnlib510/example_dataset.txt?dl=0

(54kb file)

You can read it in with:

test_example <- read.table(file='example_dataset.txt')

What I would like to have in my plot is this

On the plot, the x-axis is my Timepoints column, and the y-axis shows every column of the dataset except the last 3. I then used facet_wrap() to group by the ConditionID column.

This is exactly what I want, but the way I achieved this was with the following code:

plot <- ggplot(dataset, aes(x=Timepoints))
plot <- plot + geom_line(aes(y=dataset[,1],colour = dataset$InModule))
plot <- plot + geom_line(aes(y=dataset[,2],colour = dataset$InModule))
plot <- plot + geom_line(aes(y=dataset[,3],colour = dataset$InModule))
plot <- plot + geom_line(aes(y=dataset[,4],colour = dataset$InModule))
plot <- plot + geom_line(aes(y=dataset[,5],colour = dataset$InModule))
plot <- plot + geom_line(aes(y=dataset[,6],colour = dataset$InModule))
plot <- plot + geom_line(aes(y=dataset[,7],colour = dataset$InModule))
plot <- plot + geom_line(aes(y=dataset[,8],colour = dataset$InModule))
...

As you can see, this is not very automated. I thought about using a loop, like this:

columns <- dim(dataset)[2] - 3
for (i in seq(1:columns))
{
  plot <- plot + geom_line(aes(y=dataset[,i],colour = dataset$InModule))
}
(plot <- plot + facet_wrap(  ~ ConditionID, ncol=6) )

That doesn't work. I found the topic "Use for loop to plot multiple lines in single plot with ggplot2", which matches my problem, and I tried the solution given there with the melt() function.
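A likely explanation for the failure, sketched on a made-up toy data frame (the names toy, V1 and V2 are illustrative and not from the dataset above): aes() does not evaluate its arguments when a layer is created, so every layer added in the loop looks up i only when the plot is built. By then i holds its final value, and all layers draw the same column. The sketch compares that with a loop that forces the column name into the mapping using !! and rlang::sym(), which is one possible fix and not necessarily the one the linked topic proposes.

```r
library(ggplot2)

# Toy data (made up): two measurement columns and a time column.
toy <- data.frame(V1 = c(1, 2, 3), V2 = c(10, 20, 30), Timepoints = 1:3)

# Lazy version: each layer stores the expression toy[, i], not its value.
p_lazy <- ggplot(toy, aes(x = Timepoints))
for (i in 1:2) {
  p_lazy <- p_lazy + geom_line(aes(y = toy[, i]))
}
# i is 2 after the loop, so both layers resolve to column 2 at build time.
lazy_y1 <- layer_data(p_lazy, 1)$y
lazy_y2 <- layer_data(p_lazy, 2)$y

# Forced version: !! inserts the column symbol into the mapping immediately.
p_forced <- ggplot(toy, aes(x = Timepoints))
for (nm in c("V1", "V2")) {
  p_forced <- p_forced + geom_line(aes(y = !!rlang::sym(nm)))
}
forced_y1 <- layer_data(p_forced, 1)$y
forced_y2 <- layer_data(p_forced, 2)$y
```

Comparing lazy_y1 with lazy_y2 and forced_y1 with forced_y2 shows the difference. Reshaping the data to long format usually avoids the loop altogether.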

The problem is that when I use melt on my dataset, I lose the Timepoints column that I need for my x-axis. This is what I did:

data_melted <- dataset
as.character(data_melted$Timepoints)
dataset_melted <- melt(data_melted)
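The behaviour described here can be reproduced on a made-up toy data frame (toy, V1 and V2 are illustrative names, and Timepoints is assumed to be numeric, which the question does not state). When id.vars is omitted, reshape2's melt() treats only factor and character columns as id variables and melts every other column, so a numeric Timepoints column ends up stacked into value. Note also that calling as.character() without assigning the result leaves the column unchanged.

```r
library(reshape2)

# Toy data (made up): numeric measurements, a numeric time column, two labels.
toy <- data.frame(
  V1 = c(1, 2, 3),
  V2 = c(10, 20, 30),
  Timepoints = c(0, 1, 2),
  InModule = c("yes", "yes", "yes"),
  ConditionID = c("A", "A", "A")
)

# The result of as.character() is discarded, so the column stays numeric.
as.character(toy$Timepoints)
still_numeric <- is.numeric(toy$Timepoints)

# Without id.vars, only the character columns are kept as identifiers.
melted_default <- melt(toy)
melted_names <- names(melted_default)
melted_variables <- as.character(unique(melted_default$variable))
```

Here melted_names no longer contains Timepoints, while melted_variables lists it next to V1 and V2.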

I also tried using aggregate:

aggdata <- aggregate(dataset, by = list(dataset$ConditionID), FUN = length)

Now with aggdata I at least know how many Timepoints each ConditionID has, but I don't know how to proceed from here and combine this with ggplot.

Can anyone suggest an approach? I know I could use the ugly solution of creating new datasets in a loop with rbind (also given in that link), but I don't want to do that, as it sounds really inefficient. I want to learn the right way.

Thanks

Solution

You have to specify id.vars in your call to melt.data.frame to keep all the information you need. In the call to ggplot you then need to specify the correct grouping variable to get the same result as before. Here's a possible solution:

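library(reshape2)  # provides melt()
library(ggplot2)

# Keep the three descriptive columns as identifiers; all other columns are stacked into variable/value.
melted <- melt(dataset, id.vars=c("Timepoints", "InModule", "ConditionID"))
p <- ggplot(melted, aes(Timepoints, value, color = InModule)) +
  geom_line(aes(group=paste0(variable, InModule)))
p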
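To see the reshaping step in isolation, here is a minimal sketch on a made-up toy data frame (toy, V1 and V2 are illustrative names, and the values are invented). It melts with the same id.vars as above and then adds the facet_wrap() call from the question, which the snippet above leaves out. Treat it as one way to combine the pieces under these assumptions.

```r
library(reshape2)
library(ggplot2)

# Toy data (made up): two conditions, three time points each.
toy <- data.frame(
  V1 = c(1, 2, 3, 2, 3, 4),
  V2 = c(3, 2, 1, 4, 3, 2),
  Timepoints = rep(1:3, times = 2),
  InModule = rep(c("yes", "no"), each = 3),
  ConditionID = rep(c("A", "B"), each = 3)
)

# Long format: one row per (original row, measurement column) pair.
melted_toy <- melt(toy, id.vars = c("Timepoints", "InModule", "ConditionID"))

# One line per original column within each InModule value, one panel per condition.
p_toy <- ggplot(melted_toy, aes(Timepoints, value, color = InModule)) +
  geom_line(aes(group = paste0(variable, InModule))) +
  facet_wrap(~ ConditionID)
```

The melted data frame keeps Timepoints, InModule and ConditionID as ordinary columns, so they stay available for the x-axis, the colour and the facets.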
