How to apply dtw algorithm on multiple time series in R?

Views: 83 | Posted: 2021/5/2 20:51:11 | Tags: r dplyr dtw

Problem description

I have time series of the speed of different vehicles. My ultimate objective is to cluster the vehicles based on how similar their speeds are over time. So I basically need to produce a distance matrix in which each cell contains the distance between a pair of vehicle speed time series. I want to use Dynamic Time Warping (DTW) as the distance metric, so I want to apply dtw to each pair of speed time series.

Here are some sample data containing only 3 cars, with only 8 observations per car:

> dput(c)
structure(list(file.ID2 = c("Cars_03", "Cars_03", "Cars_03", 
"Cars_03", "Cars_03", "Cars_03", "Cars_03", "Cars_03", "Cars_04", 
"Cars_04", "Cars_04", "Cars_04", "Cars_04", "Cars_04", "Cars_04", 
"Cars_04", "Cars_05", "Cars_05", "Cars_05", "Cars_05", "Cars_05", 
"Cars_05", "Cars_05", "Cars_05"), speed.kph.ED = c(129.3802848, 
129.4022304, 129.424176, 129.4461216, 129.4680672, 129.47904, 
129.5009856, 129.5229312, 127.8770112, 127.8221472, 127.7672832, 
127.7124192, 127.6575552, 127.6026912, 127.5478272, 127.4929632, 
134.1095616, 134.1205344, 134.1315072, 134.1534528, 134.1644256, 
134.1753984, 134.1863712, 134.197344)), row.names = c(NA, -24L
), class = c("tbl_df", "tbl", "data.frame"), .Names = c("file.ID2", 
"speed.kph.ED"))

What I have tried

I can find the dtw::dtw() distance for a single pair as follows:

    library(dplyr)
    library(dtw)
    # Extract the rows for two of the cars
    c3 <- c %>% filter(file.ID2 == "Cars_03")
    c4 <- c %>% filter(file.ID2 == "Cars_04")
    query <- c4$speed.kph.ED
    reference <- c3$speed.kph.ED
    # Align the two speed series and read off the cumulative DTW distance
    dtw_results <- dtw(x = query, y = reference)
    dtw_results$distance
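For intuition about what that single number is, here is a minimal base-R sketch of the accumulated-cost recursion. It is not the dtw package's implementation: `dtw_distance_sketch` is a hypothetical helper, and it assumes the absolute difference as local cost and the package's default step pattern (symmetric2), in which a diagonal step counts the local cost twice.

```r
# Hypothetical helper for illustration only, not part of the dtw package.
# Assumes absolute-difference local cost and the symmetric2 recursion.
dtw_distance_sketch <- function(query, reference) {
  n <- length(query)
  m <- length(reference)
  local_cost <- abs(outer(query, reference, "-"))  # n x m local cost matrix
  acc <- matrix(Inf, nrow = n, ncol = m)           # accumulated cost matrix
  acc[1, 1] <- local_cost[1, 1]
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      if (i == 1 && j == 1) next
      # Cost of arriving from above, from the left, or diagonally
      up   <- if (i > 1) acc[i - 1, j] + local_cost[i, j] else Inf
      left <- if (j > 1) acc[i, j - 1] + local_cost[i, j] else Inf
      diag <- if (i > 1 && j > 1) acc[i - 1, j - 1] + 2 * local_cost[i, j] else Inf
      acc[i, j] <- min(up, left, diag)
    }
  }
  acc[n, m]  # cumulative cost of the cheapest warping path
}

# Speeds of Cars_03 and Cars_04 from the sample data
cars_03 <- c(129.3802848, 129.4022304, 129.424176, 129.4461216,
             129.4680672, 129.47904, 129.5009856, 129.5229312)
cars_04 <- c(127.8770112, 127.8221472, 127.7672832, 127.7124192,
             127.6575552, 127.6026912, 127.5478272, 127.4929632)

dtw_distance_sketch(cars_04, cars_03)  # about 25.665 for this pair
```

The double loop is only meant to make the recursion visible; for real data the compiled dtw() call is the better choice.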

But my question is: is there a way to automatically find the dtw()$distance between each pair and generate a distance matrix? In this example, that means these pairs:

Cars_03 - Cars_03
Cars_03 - Cars_04
Cars_03 - Cars_05
Cars_04 - Cars_03
Cars_04 - Cars_04
Cars_04 - Cars_05
and so on

I know a for loop is one way to do this. But since dtw itself requires a lot of RAM, a for loop could slow the process down further. Are there any alternatives? I'm sorry if this is a silly question, but I'm quite new to using dtw.

Recommended answer

The following works.

Split by file.ID2 (df here is the data frame from the question, where it is named c)

ds <- split(df, df$file.ID2)

Use expand.grid to make all combinations of your names (file.ID2) and of your values

Names <- expand.grid(unique(df$file.ID2), unique(df$file.ID2))
Values <- expand.grid(ds, ds)
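To see what expand.grid produces here, a small sketch on the three car names alone may help. The names `ids`, `pairs` and `unique_pairs` are only illustrative. The first argument varies fastest, which is why the rows come out as Cars_03/Cars_03, Cars_04/Cars_03, Cars_05/Cars_03 and so on.

```r
ids <- c("Cars_03", "Cars_04", "Cars_05")

# Every ordered pair of names, including each name paired with itself
pairs <- expand.grid(ids, ids, stringsAsFactors = FALSE)
nrow(pairs)   # 9 rows for 3 cars
pairs[1:3, ]  # Var1 cycles through the cars while Var2 stays at Cars_03

# When the distance is symmetric, the unordered pairs are enough
unique_pairs <- pairs[pairs$Var1 < pairs$Var2, ]
nrow(unique_pairs)  # 3 rows
```

Keeping only the unordered pairs is an optional shortcut, not something the answer does; it roughly halves the number of dtw() calls when d(a, b) equals d(b, a).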

purrr::map_dbl iterates through all row combinations of Values and returns a vector of doubles

library(dtw)
library(purrr)
Dist <- map_dbl(seq_len(nrow(Values)), ~ dtw(
  x = Values[.x, ]$Var1[[1]]$speed.kph.ED,
  y = Values[.x, ]$Var2[[1]]$speed.kph.ED
)$distance)
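As a possible alternative to the expand.grid and map_dbl steps, the dtw package also ships dtwDist(), which computes the DTW distance between every pair of rows of a matrix. The sketch below is a different route from the one in this answer, and it assumes that all series have the same length so they can be stacked as rows; `speeds`, `mx` and `dist_mat` are illustrative names.

```r
library(dtw)

# Sample speeds from the question, one vector per car
speeds <- list(
  Cars_03 = c(129.3802848, 129.4022304, 129.424176, 129.4461216,
              129.4680672, 129.47904, 129.5009856, 129.5229312),
  Cars_04 = c(127.8770112, 127.8221472, 127.7672832, 127.7124192,
              127.6575552, 127.6026912, 127.5478272, 127.4929632),
  Cars_05 = c(134.1095616, 134.1205344, 134.1315072, 134.1534528,
              134.1644256, 134.1753984, 134.1863712, 134.197344)
)

# Stack the equal-length series as rows of a numeric matrix
mx <- do.call(rbind, speeds)

# Element [i, j] is the DTW distance between row i and row j
dist_mat <- dtwDist(mx)
dist_mat
```

This still runs one alignment per pair, so it is mainly a convenience rather than a change in the amount of work done.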

Bind the distances to the names to form the answer

library(dplyr)
ans <- Names %>% 
          mutate(distance = Dist)

Output

     Var1    Var2 distance
1 Cars_03 Cars_03  0.00000
2 Cars_04 Cars_03 25.66538
3 Cars_05 Cars_03 69.72117
4 Cars_03 Cars_04 25.66538
5 Cars_04 Cars_04  0.00000
6 Cars_05 Cars_04 96.00103
7 Cars_03 Cars_05 69.72117
8 Cars_04 Cars_05 96.00103
9 Cars_05 Cars_05  0.00000
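Since the question asks for a distance matrix to cluster on, the long table above can be pivoted into a square matrix and handed to a clustering function. The sketch below is one way to do that in base R. The distances are copied from the output above; the choice of hierarchical clustering with average linkage and a cut at two clusters is an assumption for illustration, not part of the original answer.

```r
# Long-format result, copied from the output above
ans <- data.frame(
  Var1 = rep(c("Cars_03", "Cars_04", "Cars_05"), times = 3),
  Var2 = rep(c("Cars_03", "Cars_04", "Cars_05"), each = 3),
  distance = c(0, 25.66538, 69.72117,
               25.66538, 0, 96.00103,
               69.72117, 96.00103, 0)
)

# Pivot to a square matrix: rows are Var1, columns are Var2
dist_mat <- tapply(ans$distance, ans[c("Var1", "Var2")], sum)

# Convert to a "dist" object and cluster (average linkage is an assumption)
d <- as.dist(dist_mat)
hc <- hclust(d, method = "average")

# Cut the tree into two groups (k = 2 chosen only for illustration)
clusters <- cutree(hc, k = 2)
clusters
```

With these three cars, Cars_03 and Cars_04 end up in the same group because their DTW distance is the smallest of the three pairs.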
