Applying a neural network to MFCCs for variable-length speech segments

Views: 265 | Published: 2020/5/17 19:25:37 | Tags: matlab neural-network speech-recognition mfcc

Question

I'm currently trying to create and train a neural network to perform simple speech classification using MFCCs.

At the moment, I'm using 26 coefficients for each sample and a total of 5 classes: five different words with varying numbers of syllables.

While each sample is 2 seconds long, I am unsure how to handle cases where the user pronounces a word very slowly or very quickly. For example, the word 'television' spoken within one second yields different coefficients than the same word spoken within two seconds.

Any advice on how I can solve this problem would be much appreciated!

Recommended answer

"I'm currently trying to create and train a neural network to perform simple speech classification using MFCCs."

Simple neural networks are not invariant to input length and are not designed to analyze time series.
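To make the length problem concrete, here is a small Python sketch of how the number of MFCC frames, and with it the size of a flattened input vector, changes with the duration of the audio. The 16 kHz sample rate, 25 ms window, and 10 ms hop are assumed typical values, not figures from the question, and treating the question's 26 coefficients as per-frame values is also an assumption.

```python
def num_frames(duration_s, sample_rate=16000, win_s=0.025, hop_s=0.010):
    """Count full analysis frames in a signal (no padding at the end).

    All default values are illustrative assumptions, not taken from the question.
    """
    n_samples = int(round(duration_s * sample_rate))
    win = int(round(win_s * sample_rate))   # window length in samples
    hop = int(round(hop_s * sample_rate))   # step between frame starts
    if n_samples < win:
        return 0
    return 1 + (n_samples - win) // hop


N_COEFFS = 26  # assumed to be coefficients per frame

for duration in (1.0, 2.0):
    frames = num_frames(duration)
    # A network with a fixed input layer would need exactly this many inputs.
    print(duration, "s:", frames, "frames,", frames * N_COEFFS, "input values")
```

Under these assumptions a one-second segment gives 98 frames and a two-second segment gives 198, so a network whose input layer was sized for one of them cannot accept the other without padding or resampling.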

To classify a time series such as a sequence of MFCC frames, use a classifier that can handle sequences of varying length and speaking rate. For example, you can use neural networks combined with hidden Markov models (ANN-HMM), Gaussian mixture models combined with hidden Markov models (GMM-HMM), or recurrent neural networks (RNN). RNN implementations are available for Matlab and for Theano. Detailed descriptions of these architectures are easy to find with a web search.
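As a rough illustration of why a recurrent network fits this problem, the following Python sketch runs a minimal Elman-style RNN forward pass over MFCC sequences of two different lengths. It is not the Matlab or Theano implementation the answer refers to: the layer sizes, the random untrained weights, and the random stand-in "MFCC" frames are all assumptions made for the example, so the printed probabilities are meaningless until the weights are trained.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix as a list of rows (untrained placeholder)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_classify(frames, w_in, w_rec, w_out):
    """Feed frames one at a time through a simple RNN, then softmax the last state."""
    hidden = [0.0] * len(w_rec)
    for frame in frames:
        from_input = matvec(w_in, frame)
        from_state = matvec(w_rec, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    scores = matvec(w_out, hidden)
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sizes: 26 coefficients per frame, 5 word classes, 16 hidden units.
N_COEFFS, N_HIDDEN, N_CLASSES = 26, 16, 5
rng = random.Random(0)
w_in = make_matrix(N_HIDDEN, N_COEFFS, rng)
w_rec = make_matrix(N_HIDDEN, N_HIDDEN, rng)
w_out = make_matrix(N_CLASSES, N_HIDDEN, rng)

# Random stand-ins for a fast and a slow utterance (different frame counts).
short_utt = [[rng.gauss(0, 1) for _ in range(N_COEFFS)] for _ in range(98)]
long_utt = [[rng.gauss(0, 1) for _ in range(N_COEFFS)] for _ in range(198)]

for utt in (short_utt, long_utt):
    probs = rnn_classify(utt, w_in, w_rec, w_out)
    print(len(utt), "frames ->", len(probs), "class probabilities")
```

The same three weight matrices handle both sequences because the network consumes one frame at a time and carries a fixed-size hidden state forward, so the output size depends only on the number of classes, not on how long the word took to say.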

Speech recognition is not simple to implement; it is better to use existing software such as CMUSphinx.
