Periodic Data with Machine Learning (Like Degree Angles -> 179 is 2 different from -179)


Question

I'm using Python for kernel density estimation and Gaussian mixture models to rank the likelihood of samples of multidimensional data. Every piece of data is an angle, and I'm not sure how to handle the periodicity of angular data for machine learning.

First I removed all negative angles by adding 360 to them, so every negative angle became positive, with -179 becoming 181. I believe this elegantly handles the case of -179 and similar values being not significantly different from 179 and similar values, but it does not handle instances like 359 being close to 1.

One way I've thought of approaching the issue is to keep both the negative value and the negative+360 value and use whichever gives the smaller difference, but this would require modifying the machine learning algorithms.

Is there a good preprocessing-only solution to this problem? Is anything built into scipy or scikit?

Thanks!

Recommended answer

As Tal Darom wrote in the comments, you can replace every periodic feature x with two features, cos(x) and sin(x), after converting x to radians. That solves the 359 ≈ 1 problem:

>>> import numpy as np
>>> def fromdeg(d):
...     # Convert degrees to radians, then to a point on the unit circle.
...     r = d * np.pi / 180.
...     return np.array([np.cos(r), np.sin(r)])
... 
>>> np.linalg.norm(fromdeg(1) - fromdeg(359))
0.03490481287456796
>>> np.linalg.norm(fromdeg(1) - fromdeg(180))
1.9999238461283426
>>> np.linalg.norm(fromdeg(90) - fromdeg(270))
2.0

norm(a - b) is the good old Euclidean distance between vectors a and b. You can verify with a simple plot, or by noting that these (cos, sin) pairs are coordinates on the unit circle, that this distance is maximal (and the dot product minimal) when the original angles differ by 180°.
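For multidimensional data, the same encoding can be applied to every angular column at once. The following is only a minimal sketch of that preprocessing step, not part of the original answer: the helper name `encode_angles` and the sample array are made up for illustration, and the input is assumed to be a 2-D array of angles in degrees.

```python
import numpy as np

def encode_angles(deg):
    """Map angles in degrees, shape (n_samples, n_features), to
    (cos, sin) features, shape (n_samples, 2 * n_features).

    Hypothetical helper for illustration; assumes a 2-D input.
    """
    rad = np.deg2rad(np.asarray(deg, dtype=float))
    # All cosine columns first, then all sine columns.
    return np.concatenate([np.cos(rad), np.sin(rad)], axis=1)

# Made-up sample: three observations of two angles each.
angles = np.array([[179.0, 1.0],
                   [-179.0, 359.0],
                   [0.0, 180.0]])
features = encode_angles(angles)

# Rows 0 and 1 differ by only 2 degrees per angle once wrap-around is
# taken into account, so they end up close together; row 2 is far away.
print(np.linalg.norm(features[0] - features[1]))
print(np.linalg.norm(features[0] - features[2]))
```

The resulting feature matrix can then be passed to an estimator in place of the raw angles. For a single angle the distance between two encoded points is 2·sin(Δ/2), where Δ is the wrapped angular difference, so nearby angles stay nearby regardless of sign or wrap-around. Note that the model then sees points in the plane rather than on the circle itself, which may or may not matter for a given application.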
