R: finding consecutive dates


Problem description

My job is to find consecutive values in a data frame that fall below a certain threshold. First, I extracted a subset of the data frame containing only the values lower than the threshold. Now my data looks like this:

Value       dates
5105.47     1970-03-25
5398.53     1970-04-08
5520.65     1970-04-09
5052.68     1970-04-10
5406.77     1970-04-11
5501.05     1970-04-12

The result is basically an irregular time series. Now I would like to identify the consecutive dates. Any suggestions on how to do it?

Recommended answer

You can try:

df1$consecutive <- c(NA,diff(as.Date(df1$dates))==1)
# > df1
#     Value      dates consecutive
# 1 5105.47 1970-03-25          NA
# 2 5398.53 1970-04-08       FALSE
# 3 5520.65 1970-04-09        TRUE
# 4 5052.68 1970-04-10        TRUE
# 5 5406.77 1970-04-11        TRUE
# 6 5501.05 1970-04-12        TRUE

Converting the dates column (character strings, or a factor as in the data below) to Date class with as.Date() makes it possible to perform simple operations such as taking the difference between two dates. The function diff() takes a vector as input and computes the difference between each entry v[i] and its previous entry v[i-1]. The difference vector therefore has one entry fewer than the original vector. Since the first date in the data.frame has no predecessor, it is impossible to say whether it is consecutive or not, so its identifier can reasonably be set to NA.
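To make the behaviour of diff() on dates concrete, here is a minimal sketch. The three dates and the variable names d, gaps and flags are made up for illustration and are not part of the original data.

```r
# Hypothetical three-date example (not taken from the question's data)
d <- as.Date(c("1970-04-08", "1970-04-09", "1970-04-11"))

# diff() returns the gaps between neighbouring dates: one entry fewer than d
gaps <- diff(d)            # gaps of 1 day and 2 days

# Pad with NA at the front so the flag vector lines up with d again
flags <- c(NA, gaps == 1)  # NA, TRUE, FALSE
```

The leading NA plays the same role as in the answer above: the first date has nothing to be compared with.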

In the case of dates, a difference equal to 1 means that the two days are consecutive, so the comparison diff(as.Date(df1$dates)) == 1 evaluates to TRUE for those rows.
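If the goal is to label whole runs of consecutive days rather than flag single rows, the same diff() comparison can be combined with cumsum(). This is only a sketch of one possible extension, not part of the original answer. It assumes the dates are sorted in ascending order, and the column names run_id and run_length are hypothetical.

```r
# Same six rows as the example data, rebuilt with data.frame() for brevity
df1 <- data.frame(
  Value = c(5105.47, 5398.53, 5520.65, 5052.68, 5406.77, 5501.05),
  dates = c("1970-03-25", "1970-04-08", "1970-04-09",
            "1970-04-10", "1970-04-11", "1970-04-12")
)

d <- as.Date(df1$dates)

# A new run starts at the first row and wherever the gap to the
# previous date is not exactly one day (assumes ascending dates)
new_run <- c(TRUE, diff(d) != 1)

# Running count of run starts gives each run its own id
df1$run_id <- cumsum(new_run)                                   # 1 2 2 2 2 2

# Number of rows in the run that each row belongs to
df1$run_length <- ave(seq_along(d), df1$run_id, FUN = length)   # 1 5 5 5 5 5
```

Rows with run_length greater than 1 belong to a stretch of consecutive days, so a filter such as df1[df1$run_length > 1, ] would keep only those stretches.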

Data

df1 <- structure(list(Value = c(5105.47, 5398.53, 5520.65, 5052.68, 
            5406.77, 5501.05), dates = structure(1:6, .Label = c("1970-03-25", 
            "1970-04-08", "1970-04-09", "1970-04-10", "1970-04-11", "1970-04-12"),
            class = "factor")), .Names = c("Value", "dates"), 
            class = "data.frame", row.names = c(NA, -6L))
