Applying a neural network to MFCCs for variable-length speech segments

Question

I'm currently trying to create and train a neural network to perform simple speech classification using MFCCs.

At the moment, I'm using 26 coefficients for each sample, and a total of 5 different classes - these are five different words with varying numbers of syllables.

While each sample is 2 seconds long, I am unsure how to handle cases where the user pronounces a word either very slowly or very quickly. For example, the word 'television' spoken within one second yields different coefficients than the same word spoken within two seconds.

Any advice on how I can solve this problem would be much appreciated!

Recommended answer

I'm currently trying to create and train a neural network to perform simple speech classification using MFCCs.

Simple neural networks have no input-length invariance and cannot analyze time series.
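To make the length problem concrete, here is a minimal NumPy sketch. It assumes a plain dense layer applied to a flattened MFCC matrix, and it uses random numbers as stand-ins for real MFCC frames; the frame counts (100 and 200) and the helper name `dense_scores` are illustrative choices, not something from the question.

```python
import numpy as np

N_MFCC = 26     # coefficients per frame, as in the question
N_CLASSES = 5   # five words, as in the question

rng = np.random.default_rng(0)

# Random stand-ins for MFCC matrices (frames x coefficients).
# The frame counts are assumptions chosen only for illustration.
fast = rng.standard_normal((100, N_MFCC))  # word spoken quickly
slow = rng.standard_normal((200, N_MFCC))  # same word spoken slowly

# A dense layer sized for exactly 100 frames of 26 coefficients.
W = rng.standard_normal((100 * N_MFCC, N_CLASSES))

def dense_scores(mfcc, weights):
    """Flatten the MFCC matrix and apply a single dense layer."""
    return mfcc.reshape(-1) @ weights

scores_fast = dense_scores(fast, W)  # works: 2600 inputs match the layer

try:
    dense_scores(slow, W)            # 5200 inputs do not match the layer
    slow_ok = True
except ValueError:
    slow_ok = False
```

The layer accepts the 100-frame segment but rejects the 200-frame one, because its weight matrix fixes the input size in advance.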

To classify a time series such as a sequence of MFCC frames, you can use a classifier with time invariance. For example, you can use neural networks combined with hidden Markov models (ANN-HMM), Gaussian mixture models combined with hidden Markov models (GMM-HMM), or recurrent neural networks (RNN). RNN implementations are available for Matlab and for Theano, and detailed descriptions of these architectures are easy to find online.
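As a rough illustration of the RNN option, the sketch below runs the forward pass of a simple (Elman-style) recurrent network over MFCC sequences of different lengths. It is only a sketch under assumptions: the weights are random and untrained, the hidden size of 16 is arbitrary, the inputs are random stand-ins for MFCC frames, and the names `init_params` and `rnn_classify` are made up for this example. A real system would train the weights and would likely use an existing framework.

```python
import numpy as np

N_MFCC, N_HIDDEN, N_CLASSES = 26, 16, 5  # hidden size is an arbitrary choice

def init_params(seed=0):
    """Create small random (untrained) weights for a simple RNN."""
    rng = np.random.default_rng(seed)
    return {
        "W_xh": rng.standard_normal((N_MFCC, N_HIDDEN)) * 0.1,
        "W_hh": rng.standard_normal((N_HIDDEN, N_HIDDEN)) * 0.1,
        "b_h": np.zeros(N_HIDDEN),
        "W_hy": rng.standard_normal((N_HIDDEN, N_CLASSES)) * 0.1,
        "b_y": np.zeros(N_CLASSES),
    }

def rnn_classify(frames, p):
    """Consume one MFCC frame per step, then classify from the last state."""
    h = np.zeros(N_HIDDEN)
    for x in frames:  # works for any number of frames
        h = np.tanh(x @ p["W_xh"] + h @ p["W_hh"] + p["b_h"])
    logits = h @ p["W_hy"] + p["b_y"]
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

params = init_params()
rng = np.random.default_rng(1)

# Two segments of different length go through the same weights.
probs_short = rnn_classify(rng.standard_normal((100, N_MFCC)), params)
probs_long = rnn_classify(rng.standard_normal((200, N_MFCC)), params)
```

Both calls return five class probabilities, because the recurrent state has a fixed size no matter how many frames were consumed. That is what lets the same model handle fast and slow pronunciations.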

Speech recognition is not simple to implement; it is better to use existing software such as CMUSphinx.
