ggplot multiple time series of unequal time


Problem description

I know there are a few answered questions relating to time series and multiple data frames, but I can't seem to figure this out.

I would like to plot time-stamped readings (column pa) from 4 different pressure sensors against time. I have 4 dfs of time-stamped pressure readings from the same experiment. However, the sensors collected data at unequal times, and the columns are of unequal length due to sensor failures and other blips in the data.

These two aspects have prevented me from successfully creating a graph containing all 4 sensors' data. The dfs have different numbers of observations. They cover the same time range, but their timestamps differ at the seconds level. Would the time resolution need to be changed to hours, for example?

This is what the dfs look like:

PA_1: n = 1097361
  time                  pa       wifi
1 2014-09-01 16:21:00   100.620  1
2 2014-09-01 17:20:33   100.572  1
3 2014-09-01 18:20:05   100.561  0
4 2014-09-01 19:19:38   100.523  0
5 2014-09-01 20:19:11   100.511  1
6 2014-09-01 21:18:43   100.534  1

PA_2: n = 914364
  time                  pa       wifi
1 2014-09-01 15:25:05   NA       1
2 2014-09-01 15:25:09   100.798  1
3 2014-09-01 15:25:11   100.792  0
4 2014-09-01 15:25:15   100.791  0
5 2014-09-01 15:25:18   100.790  1
6 2014-09-01 15:25:20   100.791  1

PA_3: n = 963527
  time                  pa       wifi
1 2014-09-01 15:25:02   100.832  1
2 2014-09-01 15:25:05   100.832  1
3 2014-09-01 15:25:08   100.825  0
4 2014-09-01 15:25:11   100.831  0
5 2014-09-01 15:25:14   100.830  1
6 2014-09-01 15:25:17   100.836  1

PA_4: n = 1061117
  time                  pa       wifi
1 2014-09-01 15:25:00   100.690  1
2 2014-09-01 15:25:04   100.683  1
3 2014-09-01 15:25:07   100.685  0
4 2014-09-01 15:25:11   100.687  0
5 2014-09-01 15:25:14   100.682  1
6 2014-09-01 15:25:18   100.684  1

Also, a dichotomous variable "wifi" was added to each df to denote whether wifi was on or off during the experiment. Two of the sensors were exposed to wifi while two were outside of the wifi signal. I would like to display this in the graph as well, perhaps by shading the region or increasing the size of the lines when wifi was on, but I am not sure how to do this. To illustrate, I edited the middle 2 wifi entries in the examples; in the real data wifi was on for periods of 10 days at a time, not a few seconds.

Thanks

edit: added examples of each df and a few explanations

Solution

It's not totally clear to me what you are asking, but (if this is what you are trying to do) you can combine the data.frames and then plot them all on one chart, using color to differentiate sensors and alpha/shape settings to differentiate wifi status. Then it's no problem that the series start and end at different times and have different measurement resolutions.
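A minimal sketch of the combining step, assuming the four data frames are named PA_1 to PA_4 as in the question. The tiny stand-in frames and the helper `make_sensor` are made up for illustration, and the `sensor` column is an assumed name chosen to match what the plotting code expects:

```r
# Tiny stand-ins for the question's data frames (hypothetical values;
# the real frames have roughly a million rows each).
make_sensor <- function(start, n, step) {
  data.frame(
    time = as.POSIXct(start, tz = "UTC") + seq(0, by = step, length.out = n),
    pa   = 100 + rnorm(n, sd = 0.05),
    wifi = rep(c(1, 0), length.out = n)
  )
}

set.seed(1)
PA_1 <- make_sensor("2014-09-01 16:21:00", 6, 3573)  # sparse, about hourly
PA_2 <- make_sensor("2014-09-01 15:25:05", 5, 3)     # dense, every 3 s
PA_3 <- make_sensor("2014-09-01 15:25:02", 4, 3)
PA_4 <- make_sensor("2014-09-01 15:25:00", 6, 4)

# Tag every row with the sensor it came from, then stack the frames.
# Unequal lengths and unaligned timestamps do not matter in this long format.
frames <- list(`1` = PA_1, `2` = PA_2, `3` = PA_3, `4` = PA_4)
dat <- do.call(rbind, lapply(names(frames), function(id) {
  df <- frames[[id]]
  df$sensor <- as.integer(id)
  df
}))

str(dat)
```

Because each row carries its own timestamp, nothing has to be aligned or padded before plotting.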

Something like this:

library(ggplot2)
ggplot(dat, 
       aes(x=time, y=pa, group=sensor,  
           color=factor(sensor),  alpha=factor(wifi))) +
  geom_point(aes(shape=factor(wifi)), size=3) +
  geom_line() +
  scale_alpha_manual(values=c(.3, 1))

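Which (using totally random data) looks like this:

[Plot: pa against time, one colored line per sensor, with point shape and transparency indicating wifi status]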

To generate random data, I did this:

library(lubridate)

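# fake data
set.seed(123)
n <- 40

dat <-
  data.frame(sensor=sample(1:4, n, replace=TRUE),
             hr=sample(0:23, n, replace=TRUE),   # valid hours are 0-23
             min=sample(0:59, n, replace=TRUE),  # valid minutes are 0-59
             sec=sample(0:59, n, replace=TRUE),  # valid seconds are 0-59
             wifi=rbinom(n, 1, .5),
             pa=100+rnorm(n))

# build a timestamp on a single day from the sampled hour/minute/second
dat$time <- with(dat, ymd_hms(paste('2014-09-01', 
                                    paste(hr, min, sec, sep=':'))))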
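
The question also asks about changing the resolution to hours and about shading the periods when wifi was on. Neither is required for the plot to work, but with around a million rows per sensor, hourly means may be easier to draw and read. The following is only a sketch under assumptions: the example data, the helper `make_sensor`, and the names `hourly` and `wifi_on` are hypothetical, and the wifi periods are derived from a single sensor's rows.

```r
library(ggplot2)

# Hypothetical data: two sensors sampled at irregular times over 6 hours,
# with wifi on for hours 0-2 and 4-6 and off in between.
set.seed(42)
make_sensor <- function(id, n) {
  t0 <- as.POSIXct("2014-09-01 15:25:00", tz = "UTC")
  time <- t0 + sort(runif(n, 0, 6 * 3600))
  elapsed_h <- as.numeric(difftime(time, t0, units = "hours"))
  data.frame(sensor = id, time = time,
             pa = 100 + rnorm(n, sd = 0.05),
             wifi = as.integer((elapsed_h %/% 2) %% 2 == 0))
}
dat <- rbind(make_sensor(1, 500), make_sensor(2, 300))

# 1. Reduce each series to hourly means (rows with NA in pa are dropped
#    by aggregate's default na.action).
dat$hour <- as.POSIXct(trunc(dat$time, units = "hours"))
hourly <- aggregate(pa ~ sensor + hour, data = dat, FUN = mean)

# 2. Find the start and end time of every run of consecutive wifi == 1 rows.
one <- dat[dat$sensor == 1, ]
one <- one[order(one$time), ]
runs <- rle(one$wifi)
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
on <- runs$values == 1
wifi_on <- data.frame(xmin = one$time[starts[on]],
                      xmax = one$time[ends[on]])

# 3. Draw the shaded wifi periods first so the lines sit on top of them.
p <- ggplot() +
  geom_rect(data = wifi_on,
            aes(xmin = xmin, xmax = xmax, ymin = -Inf, ymax = Inf),
            fill = "grey80", alpha = 0.5) +
  geom_line(data = hourly,
            aes(x = hour, y = pa, color = factor(sensor))) +
  labs(x = "time", y = "pa", color = "sensor")
p
```

Using `-Inf` and `Inf` for `ymin` and `ymax` makes each rectangle span the full height of the panel, so the shading does not depend on the range of pa.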
