Calculating Dynamic Time Warping Distance in a Pandas Data Frame
Question
I want to calculate Dynamic Time Warping (DTW) distances in a dataframe. The result must be a new dataframe (a distance matrix) which includes the pairwise dtw distances among each row.
For Euclidean Distance I use the following code:
from scipy.spatial.distance import pdist, squareform
euclidean_dist = squareform(pdist(sample_dataframe,'euclidean'))
I need similar code for DTW.
Thanks in advance.
There are various ways one might do that. I'll leave two options below.
In case one wants to know the difference between the euclidean distance and DTW, this is a good resource.
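To make the difference concrete, here is a minimal sketch comparing the two measures on a pair of series that contain the same bump shifted by one time step. The helper name `dtw_distance` and the sample arrays are assumptions made for illustration; the helper uses a plain absolute-difference cost and is not taken from any library.

```python
import numpy as np


def dtw_distance(s, t):
    """DTW distance between two 1-D sequences, using |s[i] - t[j]| as the local cost."""
    n, m = len(s), len(t)
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost, infinite border
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]


# Hypothetical data: the same bump, shifted by one time step.
a = np.array([0, 0, 1, 2, 1, 0, 0])
b = np.array([0, 1, 2, 1, 0, 0, 0])

euclidean_result = np.linalg.norm(a - b)  # compares a[i] with b[i] only
dtw_result = dtw_distance(a, b)           # may align a[i] with a nearby b[j]

print(euclidean_result)  # 2.0
print(dtw_result)        # 0.0
```

The Euclidean distance penalises the shift because it only compares values at the same index, while DTW can stretch the time axis until the two bumps line up.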
Option 1
Using fastdtw.
Install it with
pip install fastdtw
Then use it as follows:
import numpy as np
from scipy.spatial.distance import euclidean
from fastdtw import fastdtw

x = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]])
y = np.array([[2, 2], [3, 3], [4, 4]])
distance, path = fastdtw(x, y, dist=euclidean)
print(distance)
Option 2 (Source)
import numpy as np

def dtw(s, t):
    n, m = len(s), len(t)
    # Accumulated-cost matrix with an extra border row and column set to infinity
    dtw_matrix = np.full((n+1, m+1), np.inf)
    dtw_matrix[0, 0] = 0

    for i in range(1, n+1):
        for j in range(1, m+1):
            cost = abs(s[i-1] - t[j-1])
            # take last min from a square box
            last_min = np.min([dtw_matrix[i-1, j], dtw_matrix[i, j-1], dtw_matrix[i-1, j-1]])
            dtw_matrix[i, j] = cost + last_min
    # dtw_matrix[n, m] is the DTW distance between s and t
    return dtw_matrix
It works like the following
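As a minimal sketch, the block below wraps the same recurrence in a helper that returns only the bottom-right cell of the matrix, which is the DTW distance. It then passes that helper to `pdist` as a callable metric to build the pairwise matrix asked for in the question. The helper name `dtw_distance`, the sample data, and the row labels are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform


def dtw_distance(s, t):
    """DTW distance between two 1-D sequences, using |s[i] - t[j]| as the local cost."""
    n, m = len(s), len(t)
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost, infinite border
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    # The bottom-right cell holds the total cost of the best alignment.
    return acc[n, m]


# Two sequences of different lengths.
print(dtw_distance([1, 2, 3], [2, 2, 2, 3, 4]))  # 2.0

# Hypothetical sample data: each row is one series.
sample_dataframe = pd.DataFrame(
    [[1, 2, 3, 4], [2, 3, 4, 5], [1, 1, 2, 3]],
    index=["a", "b", "c"],
)

# pdist accepts a callable metric and calls it on every pair of rows;
# squareform expands the condensed result into a symmetric matrix.
dtw_dist = pd.DataFrame(
    squareform(pdist(sample_dataframe.to_numpy(), metric=dtw_distance)),
    index=sample_dataframe.index,
    columns=sample_dataframe.index,
)
print(dtw_dist)
```

This pure-Python loop costs O(n·m) per pair of rows, so for a large data frame a faster distance function (for example one built on fastdtw from Option 1) may be a better choice for the callable.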