Time series distance metric


Problem description

To cluster a set of time series, I'm looking for a suitable distance metric. I've tried some well-known metrics, but none of them fits my case.

Example: let's assume that my clustering algorithm extracts these three centroids [s1, s2, s3]:

I want to put this new example [sx] in the most similar cluster:

The most similar centroid is the second one, so I need to find a distance function d that gives me d(sx, s2) < d(sx, s1) and d(sx, s2) < d(sx, s3).

Edit

Here are the results with the metrics [cosine, euclidean, minkowski, dynamic time warping]:

Edit 2

User Pietro P suggested applying the distances to the cumulated version of the time series. The solution works; here are the plots and the metrics:

Recommended answer

Nice question! Using any standard distance on R^n (Euclidean, Manhattan, or more generally Minkowski) over those time series cannot achieve the result you want, because those metrics are invariant under permutations of the coordinates of R^n: apply the same reordering of time steps to both series and the distance does not change. Time, however, is strictly ordered, and that ordering is the phenomenon you want to capture.
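The point can be checked with a small sketch. The series below are made-up toy data (single spikes at different positions), not the series from the question, and `minkowski` is a hypothetical helper written only for this illustration. Under these assumptions, a spike shifted by one step and a spike shifted by three steps come out equally far from `sx`, and shuffling the time steps of both series the same way leaves the distance unchanged.

```python
import random


def minkowski(a, b, p=2):
    """Minkowski distance of order p between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Toy series (assumed for illustration): one spike at different times.
sx = [0, 0, 1, 0, 0, 0]    # spike at t = 2
near = [0, 1, 0, 0, 0, 0]  # spike at t = 1, one step away
far = [0, 0, 0, 0, 0, 1]   # spike at t = 5, three steps away

d_near = minkowski(sx, near)
d_far = minkowski(sx, far)

# Apply the same random reordering of time steps to both series.
perm = list(range(len(sx)))
random.Random(0).shuffle(perm)
sx_shuffled = [sx[i] for i in perm]
far_shuffled = [far[i] for i in perm]
d_far_shuffled = minkowski(sx_shuffled, far_shuffled)

print(d_near, d_far, d_far_shuffled)
```

All three printed values are equal, which is why a pointwise metric cannot tell "almost aligned" from "far apart in time".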

A simple trick that does what you ask is to use the cumulated version of each time series (the running sum of its values over time) and then apply a standard metric. With the Manhattan metric, the distance between two time series becomes the area between their cumulated versions.
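A minimal sketch of this trick, assuming equally spaced samples and equal-length series. The name `cumulative_manhattan` and the toy spike series are hypothetical, chosen only for illustration; with a unit time step, the sum of absolute differences between the running sums approximates the area between the two cumulated curves.

```python
from itertools import accumulate


def cumulative_manhattan(a, b):
    """Manhattan distance between the running sums of two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    cum_a = accumulate(a)  # running sum of a over time
    cum_b = accumulate(b)  # running sum of b over time
    return sum(abs(x - y) for x, y in zip(cum_a, cum_b))


# Toy series (assumed for illustration): one spike at different times.
sx = [0, 0, 1, 0, 0, 0]    # spike at t = 2
near = [0, 1, 0, 0, 0, 0]  # spike at t = 1, one step away
far = [0, 0, 0, 0, 0, 1]   # spike at t = 5, three steps away

d_near = cumulative_manhattan(sx, near)
d_far = cumulative_manhattan(sx, far)

print(d_near, d_far)
```

Here the distance grows with the time shift between the spikes, so `near` ranks closer to `sx` than `far`, which plain Euclidean or Manhattan distance on the raw values would not show. To assign a new series to a cluster, one could compute this distance to every centroid and pick the smallest.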
