RStudio: Dynamic Time Warping for time series with more than 1 variable of interest


Question

This question is related to this post: How to apply dtw algorithm on multiple time series in R?

The data frame in the original post contains only 1 variable of interest: speed.kph.ED.

#data: 8 observations, 3 cars 
file.ID2 <- c("Cars_03", "Cars_03", "Cars_03", 
              "Cars_03", "Cars_03", "Cars_03", "Cars_03", "Cars_03", "Cars_04", 
              "Cars_04", "Cars_04", "Cars_04", "Cars_04", "Cars_04", "Cars_04", 
              "Cars_04", "Cars_05", "Cars_05", "Cars_05", "Cars_05", "Cars_05", 
              "Cars_05", "Cars_05", "Cars_05")
speed.kph.ED <- c(129.3802848, 
                  129.4022304, 129.424176, 129.4461216, 129.4680672, 129.47904, 
                  129.5009856, 129.5229312, 127.8770112, 127.8221472, 127.7672832, 
                  127.7124192, 127.6575552, 127.6026912, 127.5478272, 127.4929632, 
                  134.1095616, 134.1205344, 134.1315072, 134.1534528, 134.1644256, 
                  134.1753984, 134.1863712, 134.197344)

df <- data.frame(file.ID2, speed.kph.ED)
df

As suggested by the accepted answer, here is the procedure to calculate the distance between the 3 cars (3 time series) using dtw:

library(dtw)
library(purrr)
library(dplyr)

# Split your data frame into a list by file.ID2
ds <- split(df, df$file.ID2)
ds

# Use expand.grid to make all combinations of your names, file.ID2 and your values
Names <- expand.grid(unique(df$file.ID2), unique(df$file.ID2))
Values <- expand.grid(ds, ds)

# purrr::map_dbl iterates through all row-combinations of Values and returns a vector of doubles
Dist <- map_dbl(1:nrow(Values), ~dtw(x = Values[.x,]$Var1[[1]]$speed.kph.ED, y = Values[.x,]$Var2[[1]]$speed.kph.ED)$distance)

# Bind answer to Names (dplyr is already loaded above)
ans <- Names %>% 
  mutate(distance = Dist)

ans
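The long table in ans has one row per pair of cars. As a minimal sketch, it can be folded into a square matrix, relying on the fact that expand.grid varies its first column fastest, which matches R's column-wise matrix filling. The distance values below are placeholders invented for illustration, not real dtw output.

```r
# Same three IDs as in the question
ids <- c("Cars_03", "Cars_04", "Cars_05")

# All ordered pairs, in the same order expand.grid produces for Names
pairs <- expand.grid(Var1 = ids, Var2 = ids, stringsAsFactors = FALSE)

# Placeholder distances for illustration only (NOT real DTW results)
pairs$distance <- abs(match(pairs$Var1, ids) - match(pairs$Var2, ids))

# Var1 varies fastest, so a column-wise fill puts Var1 on rows and Var2 on columns
dist_mat <- matrix(pairs$distance, nrow = length(ids),
                   dimnames = list(ids, ids))
dist_mat
```

With the real data, the same call with ans$distance in place of pairs$distance should give the matrix for the three cars.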

What if I have two more variables that I also want to take into account when calculating the distance between the 3 cars (3 time series)?

For example, let's say I have another 2 variables, score.kph.ED and rating.kph.ED:

score.kph.ED <- c(1:24)
rating.kph.ED <- c(25:48)


df <- data.frame(file.ID2, speed.kph.ED, score.kph.ED, rating.kph.ED)
df

Now the distance between the 3 cars should be calculated based not only on speed.kph.ED, but also on score.kph.ED and rating.kph.ED.

How can I modify the existing code to achieve this?

Thank you very much for your help!

Recommended answer

What you're trying to do is called multivariate DTW, and you can simplify things by using the proxy package. Check this other answer, but you can essentially do what you want like this (using the variables from your example):

# Needs library(dtw) loaded and ds re-split from the new df; x[, -1L] drops the file.ID2 column
proxy::dist(lapply(ds, function(x) { x[, -1L] }), method = "dtw")
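To make clear what "multivariate" means here, below is a minimal base-R sketch of the textbook DTW recursion in which the local cost between two time points is the Euclidean distance across all variables. The function name mv_dtw and the toy data are hypothetical, and this is not the dtw package's implementation: dtw defaults to the symmetric2 step pattern, so its numbers can differ from this unit-weight version.

```r
# Textbook DTW between two multivariate series
# (rows = time points, columns = variables). Hypothetical helper.
mv_dtw <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  stopifnot(ncol(a) == ncol(b))
  n <- nrow(a)
  m <- nrow(b)

  # Local cost: Euclidean distance between point i of a and point j of b
  local <- matrix(0, n, m)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      local[i, j] <- sqrt(sum((a[i, ] - b[j, ])^2))
    }
  }

  # Accumulated cost with unit-weight steps (diagonal, vertical, horizontal)
  acc <- matrix(Inf, n + 1, m + 1)
  acc[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      acc[i + 1, j + 1] <- local[i, j] +
        min(acc[i, j], acc[i, j + 1], acc[i + 1, j])
    }
  }
  acc[n + 1, m + 1]
}

# Toy data (invented for illustration): two series, two variables each
toy <- data.frame(id = rep(c("A", "B"), each = 3),
                  u  = c(1, 2, 3, 2, 3, 4),
                  v  = c(0, 0, 0, 1, 1, 1))
series <- split(toy[, -1], toy$id)

# Pairwise distance matrix over all series
D <- sapply(series, function(s1) sapply(series, function(s2) mv_dtw(s1, s2)))
D
```

Because the local cost is Euclidean, variables measured on larger scales dominate the result; standardizing the columns beforehand (for example with scale()) is one option if the variables should contribute comparably.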
