ggplot to create multi line plot from csv file


Problem description

I am completely new to ggplot (and to some extent R). I have been blown away by the quality of graphs that can be created using ggplot, and I am trying to learn how to create a simple multi-line plot with it.

Unfortunately, I haven't found any tutorials that help me get close to what I am trying to do:

I have a CSV file that contains the following data:

id,f1,f2,f3,f4,f5,f6
30,0.841933670833,0.842101814883,0.842759547545,1.88961562347,1.99808377527,0.841933670833
40,1.47207692205,1.48713866811,1.48717177671,1.48729643008,1.48743226992,1.48713866811
50,0.823895293045,0.900091982861,0.900710334491,0.901274168324,0.901413662472,0.901413662472

I would like to plot:

  1. the first column (id) on the X axis
  2. each subsequent column as a line plot, with smoothing between the points to create a nice smooth line
  3. a legend for f1, f2, ...
  4. a specified line colour, plus markers (e.g. '+' crosses) on the line plot for column f2 (for example)

I am really new to ggplot, so I have not really got beyond reading the file into R.

Any help in creating the plot as described above would be very educational and would help reduce the ggplot learning curve.

Solution

dat <- structure(list(id = c(30L, 40L, 50L), f1 = c(0.841933670833, 
1.47207692205, 0.823895293045), f2 = c(0.842101814883, 1.48713866811, 
0.900091982861), f3 = c(0.842759547545, 1.48717177671, 0.900710334491
), f4 = c(1.88961562347, 1.48729643008, 0.901274168324), f5 = c(1.99808377527, 
1.48743226992, 0.901413662472), f6 = c(0.841933670833, 1.48713866811, 
0.901413662472)), .Names = c("id", "f1", "f2", "f3", "f4", "f5", 
"f6"), class = "data.frame", row.names = c(NA, -3L))

From here I would use melt. Read ?melt.data.frame for more info. In one sentence: it takes data from a "wide" format to a "long" format.

library(reshape2)
dat.m <- melt(dat, id.vars='id')

> dat.m
   id variable     value
1  30       f1 0.8419337
2  40       f1 1.4720769
3  50       f1 0.8238953
4  30       f2 0.8421018
5  40       f2 1.4871387
6  50       f2 0.9000920
7  30       f3 0.8427595
8  40       f3 1.4871718
9  50       f3 0.9007103
10 30       f4 1.8896156
11 40       f4 1.4872964
12 50       f4 0.9012742
13 30       f5 1.9980838
14 40       f5 1.4874323
15 50       f5 0.9014137
16 30       f6 0.8419337
17 40       f6 1.4871387
18 50       f6 0.9014137
> 
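
To make the wide-to-long step concrete, here is a rough base-R sketch that builds the same three columns as melt(dat, id.vars='id') for this data. The name dat.long is made up for illustration, and this only approximates the reshape for this particular layout; it is not how reshape2 is implemented.

```r
# Rebuild the example data so the block is self-contained.
dat <- read.csv(text = "id,f1,f2,f3,f4,f5,f6
30,0.841933670833,0.842101814883,0.842759547545,1.88961562347,1.99808377527,0.841933670833
40,1.47207692205,1.48713866811,1.48717177671,1.48729643008,1.48743226992,1.48713866811
50,0.823895293045,0.900091982861,0.900710334491,0.901274168324,0.901413662472,0.901413662472")

measure <- names(dat)[-1]   # the measured columns: f1 ... f6

# dat.long is a hypothetical name; it mirrors the columns of dat.m above.
dat.long <- data.frame(
  id       = rep(dat$id, times = length(measure)),    # ids repeat once per column
  variable = factor(rep(measure, each = nrow(dat)), levels = measure),
  value    = unlist(dat[measure], use.names = FALSE)  # columns stacked end to end
)

head(dat.long)
```

Each original column becomes a block of rows labelled by its column name, which is what lets ggplot map variable to colour.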

Then plot however you'd like:

library(ggplot2)
ggplot(dat.m, aes(x=id, y=value, colour=variable)) + 
  geom_line() +
  # add points for the f2 series only
  geom_point(data=dat.m[dat.m$variable=='f2',], size=2)

Here aes defines the aesthetics, such as the x value, y value, color/colour, etc. Then you add "layers". In the previous example I drew a line for what I defined in the ggplot() call with geom_line(), and added points with geom_point(), restricted to the f2 variable.

Below, I added a smoothed line with geom_smooth(). See the documentation for a bit more info on what this is doing: ?geom_smooth.

ggplot(dat.m, aes(x=id, y=value, colour=variable)) + 
  geom_smooth() + 
  geom_point(data=dat.m[dat.m$variable=='f2',], shape=3)
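
With only three points per series, the default smoother has very little to work with and may emit warnings. If the goal is simply a smooth curve that passes through every point, one possible alternative (not what the answer above does) is to interpolate each series with base R's spline() and draw the result with geom_line(). The helper smooth_one and the name dat.s are hypothetical, and the choice of a natural spline evaluated at 50 points is an assumption.

```r
library(ggplot2)

# Rebuild the example data so the block is self-contained.
dat <- read.csv(text = "id,f1,f2,f3,f4,f5,f6
30,0.841933670833,0.842101814883,0.842759547545,1.88961562347,1.99808377527,0.841933670833
40,1.47207692205,1.48713866811,1.48717177671,1.48729643008,1.48743226992,1.48713866811
50,0.823895293045,0.900091982861,0.900710334491,0.901274168324,0.901413662472,0.901413662472")

# Long format built with base R (same columns as dat.m above).
measure <- names(dat)[-1]
dat.m <- data.frame(
  id       = rep(dat$id, times = length(measure)),
  variable = factor(rep(measure, each = nrow(dat)), levels = measure),
  value    = unlist(dat[measure], use.names = FALSE)
)

# Hypothetical helper: interpolate one series on a grid of 50 x values.
smooth_one <- function(d) {
  s <- spline(d$id, d$value, n = 50, method = "natural")
  data.frame(id = s$x, variable = d$variable[1], value = s$y)
}

# Apply the helper to each series and stack the results.
dat.s <- do.call(rbind, lapply(split(dat.m, dat.m$variable), smooth_one))

p <- ggplot(dat.s, aes(x = id, y = value, colour = variable)) +
  geom_line() +                                                  # interpolated curves
  geom_point(data = dat.m[dat.m$variable == "f2", ], shape = 3)  # '+' at the original f2 points
p
```

This is interpolation rather than statistical smoothing: the curves go exactly through the data, which suits a handful of exact values better than noisy measurements.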

Or use shapes for all series. Here I put shape in the aesthetics of ggplot(). Aesthetics placed there apply to all successive layers, so they don't have to be specified each time. However, I can override the values supplied in ggplot() in any later layer:

ggplot(dat.m, aes(x=id, y=value, colour=variable, shape=variable)) + 
  geom_smooth() + 
  geom_point() +
  # constants set outside aes() override the inherited colour and shape mappings
  geom_point(data=dat, aes(x=id, y=f2), colour='red', size=10, shape=2)

However, a bit of ggplot understanding just takes time. Work through some of the examples given in the documentation and on the ggplot2 website. If your experience is anything like mine, after fighting with it for a few days or weeks it will eventually click. Regarding the data: if you assign your data to dat, e.g. dat <- read.csv(...), the code above will not change. I don't use data as a variable name because it is a built-in function.
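
As a sketch of that last step, the CSV from the question could be read and checked like this. The file path is an assumption: the data is first written to a temporary file only so the example runs on its own, and csv_path stands in for wherever the real file lives.

```r
# Write the question's data to a temporary file so the example is runnable;
# with a real file you would pass its path to read.csv() directly.
csv_path <- tempfile(fileext = ".csv")   # hypothetical stand-in for your file
writeLines(c(
  "id,f1,f2,f3,f4,f5,f6",
  "30,0.841933670833,0.842101814883,0.842759547545,1.88961562347,1.99808377527,0.841933670833",
  "40,1.47207692205,1.48713866811,1.48717177671,1.48729643008,1.48743226992,1.48713866811",
  "50,0.823895293045,0.900091982861,0.900710334491,0.901274168324,0.901413662472,0.901413662472"
), csv_path)

dat <- read.csv(csv_path)   # header = TRUE is the default for read.csv
str(dat)                    # inspect column names and types before reshaping
```

Checking str(dat) before melting is a cheap way to catch columns that were read as text because of stray characters in the file.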
