Methods to do outlier detection in sound recognition?


Question

I have models that recognize two classes of sound, class A and class B.

How can sounds from a third class, class C, be recognized as abnormal?

I tried setting a threshold on the frame-by-frame recognition results:

at least 70% of frames agree -> class A or B
otherwise                    -> abnormal

For example, if a sound has 10 frames and the per-frame results are:

frame 1 2 3 4 5 6 7 8 9 10
      A B A B A A A B A  A     A=7 B=3
-> class A

frame 1 2 3 4 5 6 7 8 9 10
      B B A B A A A B A  A     A=6 B=4
-> abnormal
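
The voting rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the question, not the asker's actual code; the function name `classify_by_frame_vote` and the parameter `min_share` are hypothetical, and a share of exactly 70% is treated as passing so that the first example (A=7, B=3) yields class A.

```python
from collections import Counter


def classify_by_frame_vote(frame_labels, min_share=0.7):
    """Return the majority label if its share of frames reaches min_share,
    otherwise return 'abnormal'."""
    if not frame_labels:
        # Assumption: a sound with no frames cannot be assigned to a class.
        return "abnormal"
    label, count = Counter(frame_labels).most_common(1)[0]
    if count / len(frame_labels) >= min_share:
        return label
    return "abnormal"


# The two examples from the question.
print(classify_by_frame_vote(list("ABABAAABAA")))  # A=7, B=3
print(classify_by_frame_vote(list("BBABAAABAA")))  # A=6, B=4
```

Note that this rule can only reject sounds on which the frames disagree; a class-C sound whose frames are all confidently mislabeled as A would still pass.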

The performance is very poor.

What should I do?

Recommended answer

There are two ways to look at this: as a classification problem, or as an outlier detection problem.

Classification

As a classification problem, you could bring in outside sounds that your system may encounter in its application and use them to create a third class. It is important for this third class to contain a large variety of sounds, and potentially a large number of them.

With this, you may want to use cost-sensitive one-vs-all classification to adjust the precision / recall trade-off for picking out classes A and B.
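
One way to make the final decision cost-sensitive is sketched below. It does not show one-vs-all training itself; it assumes a three-class model that already outputs probabilities for A, B, and the third class, and then picks the label with the lowest expected cost. The label name `"other"`, the function `min_cost_label`, and every number in the cost table are made up for illustration.

```python
def min_cost_label(probs, cost):
    """Pick the label with the lowest expected misclassification cost.

    probs: dict mapping each label to its predicted probability.
    cost:  cost[true][pred] is the cost of predicting `pred`
           when the true label is `true`.
    """
    labels = list(probs)

    def expected_cost(pred):
        return sum(probs[true] * cost[true][pred] for true in labels)

    return min(labels, key=expected_cost)


# Hypothetical costs: labeling an outside sound as A or B is assumed
# to be five times worse than any other mistake.
COST = {
    "A":     {"A": 0, "B": 1, "other": 1},
    "B":     {"A": 1, "B": 0, "other": 1},
    "other": {"A": 5, "B": 5, "other": 0},
}

# A is the most probable label here, but the decision falls to "other"
# because a false A is expensive under the assumed costs.
print(min_cost_label({"A": 0.6, "B": 0.1, "other": 0.3}, COST))
```

Raising the cost of false A/B predictions trades recall on A and B for precision, which is the adjustment the answer refers to.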

The benefit of this method is that you do not have to set an arbitrary threshold for an outlier / anomaly model. Distance may be hard to measure in this context, so finding a proper threshold could be difficult.

Many people, myself included, used this technique in a Kaggle competition that is similar to your problem: https://www.kaggle.com/c/axa-driver-telematics-analysis

Outlier / anomaly detection

Since you are using a neural network, you could build an autoencoder. It learns a manifold that represents the sounds you are trying to detect, and you can use the reconstruction loss as the distance measure for anomaly detection. You will still need to choose a threshold, and it helps to use some existing anomaly / outlier data to do so.
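
The threshold step could look roughly like the sketch below. It leaves the autoencoder out entirely and assumes you already have reconstruction errors for held-out normal sounds (classes A and B) and for a few known anomalies. The helper names `pick_threshold` and `is_anomaly`, the choice of balanced accuracy as the criterion, and the error values are all assumptions made for illustration.

```python
def pick_threshold(normal_errors, anomaly_errors):
    """Return the reconstruction-error threshold that maximizes
    balanced accuracy on the given labeled errors."""
    candidates = sorted(set(normal_errors) | set(anomaly_errors))
    best_threshold, best_score = None, -1.0
    for t in candidates:
        # A sample is flagged as anomalous when its error exceeds t.
        true_negative_rate = sum(e <= t for e in normal_errors) / len(normal_errors)
        true_positive_rate = sum(e > t for e in anomaly_errors) / len(anomaly_errors)
        score = (true_negative_rate + true_positive_rate) / 2
        if score > best_score:
            best_threshold, best_score = t, score
    return best_threshold


def is_anomaly(error, threshold):
    """Flag a sound whose reconstruction error is above the threshold."""
    return error > threshold


# Hypothetical reconstruction errors, for illustration only.
normal = [0.10, 0.12, 0.15, 0.20]
anomalies = [0.18, 0.35, 0.50]
threshold = pick_threshold(normal, anomalies)
print(threshold, is_anomaly(0.40, threshold))
```

A different criterion (for example, a fixed false-alarm rate on normal sounds) may suit the application better; the point is only that labeled anomalies make the choice less arbitrary.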
