How to train on and make a serialized feature vector for a Neural Network?


Problem description


By "serialized" I mean that the input values arrive at discrete time intervals and that the size of the vector is not known beforehand. Conventionally, neural networks use a fixed number of parallel input neurons and a fixed number of parallel output neurons.

A serialized implementation could be used in speech recognition, where I can feed the network a time series of the waveform and get phonemes at the output.

It would be great if someone could point out some existing implementations.

Solution

A simple neural network, as a structure, is not invariant to deformations of the time scale, which is why it is impractical to apply it directly to recognizing time series. To recognize time series, a generic communication model, the hidden Markov model (HMM), is usually used. A neural network (NN) can be used together with an HMM to classify individual frames of speech. In such an HMM-ANN configuration, the audio is split into frames, slices of frames are passed to the ANN to compute phoneme probabilities, and the whole probability sequence is then analyzed for the best match using dynamic search with the HMM.
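The HMM-ANN pipeline described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the frame sizes, the three "phoneme" states, the random single-layer classifier standing in for a trained ANN, and the hand-picked transition matrix are all hypothetical. Real hybrid systems also usually divide the posteriors by state priors before decoding, which is skipped here for brevity.

```python
import numpy as np

def split_into_frames(signal, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (one row per frame)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n_frames)])

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def frame_posteriors(frames, weights, bias):
    """Stand-in for the ANN: one softmax layer giving P(phoneme | frame)."""
    return softmax(frames @ weights + bias)

def viterbi(log_post, log_trans, log_init):
    """Dynamic search for the most likely state path over all frames."""
    n_frames, n_states = log_post.shape
    score = log_init + log_post[0]
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        cand = score[:, None] + log_trans      # cand[i, j]: move from state i to j
        back[t] = cand.argmax(axis=0)          # best predecessor for each state j
        score = cand.max(axis=0) + log_post[t]
    path = [int(score.argmax())]
    for t in range(n_frames - 1, 0, -1):       # follow back-pointers to the start
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Hypothetical setup: random "audio", 3 phoneme states, untrained weights.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1600)
frames = split_into_frames(signal, frame_len=400, hop=160)
n_phonemes = 3
weights = rng.standard_normal((400, n_phonemes)) * 0.01
bias = np.zeros(n_phonemes)
post = frame_posteriors(frames, weights, bias)

# "Sticky" transitions: a phoneme tends to last for several frames.
trans = np.full((n_phonemes, n_phonemes), 0.1)
np.fill_diagonal(trans, 0.8)
init = np.full(n_phonemes, 1.0 / n_phonemes)

path = viterbi(np.log(post), np.log(trans), np.log(init))
print(path)  # one phoneme index per frame
```

The point of the split is that the network only ever sees fixed-size frames, while the HMM search absorbs the variable length and timing of the whole utterance.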

An HMM-ANN system usually requires initialization from a more robust HMM-GMM system, so there are no standalone HMM-ANN implementations; they are usually part of a complete speech recognition toolkit. Among popular toolkits, Kaldi has implementations of HMM-ANN and even HMM-DNN (deep neural networks).

There are also neural networks designed to classify time series: recurrent neural networks (RNNs), which can be successfully used to classify speech. An example can be built with any toolkit that supports RNNs, for example Keras. If you want to start with recurrent neural networks, try long short-term memory (LSTM) networks; their architecture enables more stable training. A Keras setup for speech recognition is discussed in "Building Speech Dataset for LSTM binary classification".
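To illustrate why a recurrent network suits input whose length is not known in advance, here is a minimal NumPy sketch of an LSTM forward pass. It is not Keras code and not a trained model: the class name `TinyLSTM`, the 13 features per frame, the hidden size, and the random weights are all assumptions made for illustration. In practice, a toolkit's LSTM layer would learn these weights from data.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTM:
    """Untrained LSTM layer plus a sigmoid read-out, for illustration only."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # One matrix holds the weights of all four gates:
        # input, forget, candidate and output.
        self.W = rng.standard_normal((n_features + n_hidden, 4 * n_hidden)) * 0.1
        self.b = np.zeros(4 * n_hidden)
        self.w_out = rng.standard_normal(n_hidden) * 0.1
        self.n_hidden = n_hidden

    def forward(self, sequence):
        """Consume a (n_steps, n_features) sequence one time step at a time."""
        h = np.zeros(self.n_hidden)  # hidden state
        c = np.zeros(self.n_hidden)  # cell state (the "memory")
        for x_t in sequence:
            z = np.concatenate([x_t, h]) @ self.W + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        # Binary-classification style score computed from the final state.
        return sigmoid(h @ self.w_out)

# The same fixed-size parameters handle sequences of different lengths.
model = TinyLSTM(n_features=13, n_hidden=8)
rng = np.random.default_rng(1)
short_seq = rng.standard_normal((20, 13))
long_seq = rng.standard_normal((75, 13))
p_short = model.forward(short_seq)
p_long = model.forward(long_seq)
print(p_short, p_long)
```

The input layer stays fixed at one frame's worth of features; the sequence length only changes how many times the loop runs, which is what the question's "serialized" input requires.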
