Cut data and access groups to draw percentile lines


Question

I'm very new to R, so please be gentle.

I have a dataset containing timestamps and some data. Now I'd like to draw a graph where:


  • the data is binned by an interval of, e.g., 60 minutes, and
  • some percentile lines are plotted.

I'd like to have a graph with the time as the x-axis and the gap as the y-axis. I imagine something like a boxplot, but for a better overview (since I have a long measurement) I'd like lines instead of boxes, connecting the


  • mean values,
  • 3rd percentiles,
  • 97th percentiles, and
  • 100th percentiles.

Here is some sample data:

> head(B, 10)
                        times     gaps
1  2013-06-10 15:40:02.654168 1.426180
2  2013-06-10 15:40:18.936882 2.246462
3  2013-06-10 15:40:35.215668 3.227132
4  2013-06-10 15:40:48.328785 1.331284
5  2013-06-10 15:40:53.809485 1.294128
6  2013-06-10 15:41:04.027745 2.292671
7  2013-06-10 15:41:25.876519 1.293501
8  2013-06-10 15:41:42.929280 1.342166
9  2013-06-10 15:42:11.700626 3.203901
10 2013-06-10 15:42:23.059550 1.304467

I can use cut to divide the data:

C <- table(cut(B$times, breaks="hour"))

C <- data.frame(cut(B$times, breaks="hour"))

But how can I draw the graph from this? I don't know how to access the gap values of the groups. If I could, I would call something like:

quantile(C$gaps, c(.03, .5, .97, 1))

Thanks in advance for any help,
Ramon

Recommended answer

Solid question. I was pulling my hair out until I found this, which described an interesting "feature" of plyr. So this solution uses ggplot2, plyr, and reshape2, hopefully a good intro to R. If you need cuts that span several days, you can also do that by adding a variable in the ddply() call.

library(plyr)
library(reshape2)
library(ggplot2)

# Sample data. The header names three columns but each row has four fields,
# so read.table() uses the first field as row names.
Hs <- read.table(
  header=TRUE, text='
dates times     gaps
1  2013-06-10 15:40:02.654168 1.426180
2  2013-06-10 15:40:18.936882 2.246462
3  2013-06-10 15:40:35.215668 3.227132
4  2013-06-10 15:40:48.328785 1.331284
5  2013-06-10 15:40:53.809485 1.294128
6  2013-06-10 15:41:04.027745 2.292671
7  2013-06-10 16:41:25.876519 1.293501
8  2013-06-10 16:41:42.929280 1.342166
9  2013-06-10 16:42:11.700626 3.203901
10 2013-06-10 16:42:23.059550 1.304467')

# Join date and time into one string, then parse it.
# With "%S", strptime() ignores the trailing fractional seconds.
Hs$dates <- paste(Hs$dates, Hs$times, sep = " ")
Hs$dates <- strptime(Hs$dates, "%Y-%m-%d %H:%M:%S")
class(Hs$dates) # "POSIXlt" "POSIXt"
Hs$h1 <- Hs$dates$hour            # hour of day, used as the grouping variable
Hs$dates <- as.POSIXct(Hs$dates)  # POSIXct behaves better as a data frame column
class(Hs$dates) # "POSIXct" "POSIXt"

# Easy way: a traditional boxplot per hour.
ggplot(Hs, aes(factor(h1), gaps)) + 
  geom_boxplot(fill="white", colour="darkgreen")

# Same plot with a second boxplot layer whose whiskers reach 1.7 * IQR
# instead of the default 1.5 * IQR.
ggplot(Hs, aes(factor(h1), gaps)) + geom_boxplot() +
      stat_boxplot(coef = 1.7, fill="white", colour="darkgreen") 

I don't know if adding "coef = 1.7" works for you. If not, continue and create the values via a summary table:

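cuts <- c(.03, .5, .97, 1)
# One row per hour and quantile; the quantile values land in column y.
x <- ddply(Hs, .(h1), function (d)
{plyr::summarise(d, y = quantile(d$gaps, cuts))})
x$cuts <- cuts  # recycled per hour; each hour returns the quantiles in this order
x <- dcast(x, h1 ~ cuts, value.var = "y")  # wide: one column per quantile
x.melt <- melt(x, id.vars = "h1")          # long again, ready for ggplot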
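For comparison, the same kind of per-hour summary can be sketched in base R, which also answers the original "how do I access the gap values of the groups" part directly. This is only a sketch under assumptions: the data frame B below is a shortened stand-in for the question's data (with `times` already parsed as POSIXct and a made-up UTC time zone), and the `hour` column and the `stats` name are introduced here for illustration. It reports the mean, as the question asked, rather than the median.

```r
# Shortened stand-in for the question's data frame B (time zone is an assumption)
B <- data.frame(
  times = as.POSIXct(c("2013-06-10 15:40:02", "2013-06-10 15:40:18",
                       "2013-06-10 15:40:35", "2013-06-10 16:41:25",
                       "2013-06-10 16:41:42", "2013-06-10 16:42:11"),
                     tz = "UTC"),
  gaps = c(1.426180, 2.246462, 3.227132, 1.293501, 1.342166, 3.203901)
)

# cut() on a date-time vector returns a factor with one level per hour
B$hour <- cut(B$times, breaks = "hour")

# split() hands each hour's gap values to the function;
# the result has one row per hour and one column per statistic
stats <- do.call(rbind, lapply(split(B$gaps, B$hour), function(g) {
  c(mean = mean(g), quantile(g, c(0.03, 0.97, 1)))
}))
print(stats)
```

The key step is split(B$gaps, B$hour): it returns a list with one vector of gap values per hour, which is exactly the "group access" that table() and data.frame() around cut() do not give you.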

Here are the lines you requested, plus another box plot just for fun.

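# One line per quantile, hours on the x-axis.
ggplot(x.melt, aes(x = h1, y = value, color = variable)) + geom_point(size = 5) + 
  geom_line() + scale_colour_brewer(palette="RdYlBu") + xlab("hours")

# Box plot built from the precomputed quantiles. The lower whisker is fixed
# at 0 because no minimum was computed in the summary table.
ggplot(x, aes(factor(h1),  ymin = 0, lower = `0.03`, middle = `0.5`,
                     upper = `0.97`, ymax = `1`)) + 
         geom_boxplot(stat = "identity", fill="white", colour="darkgreen")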
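If you would rather avoid extra packages, a line plot of this kind can also be sketched with base graphics. The example below is an illustration only: the `hours` and `gaps` vectors are a hypothetical, shortened version of the sample data, and it plots the mean together with the 3rd, 97th and 100th percentiles, as listed in the question.

```r
# Hypothetical hourly data: two hours with three gap measurements each
hours <- rep(c(15, 16), each = 3)
gaps  <- c(1.426180, 2.246462, 3.227132, 1.293501, 1.342166, 3.203901)

# Rows = hours, columns = mean, 3rd, 97th and 100th percentile
stats <- t(sapply(split(gaps, hours), function(g) {
  c(mean = mean(g), quantile(g, c(0.03, 0.97, 1)))
}))

# matplot() draws one line per column of `stats`
matplot(as.numeric(rownames(stats)), stats, type = "b", pch = 19, lty = 1,
        col = 1:4, xlab = "hours", ylab = "gaps")
legend("topright", legend = colnames(stats), col = 1:4, lty = 1, pch = 19)
```

With a long measurement the matrix simply gets more rows, one per hour, and each statistic stays a single connected line.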

Hope this helps.
