Simple speech recognition from scratch
Problem Description
The question most similar to mine is this one (simple speech recognition methods), but since three years have passed and the answers there are not sufficient, I am asking again.
I want to build a simple speech recognition system from scratch; I only need to recognize five words. As far as I know, the most commonly used audio features for this application are MFCCs, with HMMs for classification.
I am able to extract MFCCs from audio, but I still have some doubts about how to use these features to train an HMM model and then perform classification.
As I understand it, I have to perform vector quantization. First I need a set of MFCC vectors, then I apply a clustering algorithm to obtain centroids. The centroids are then used for vector quantization: each MFCC vector is compared against the centroids and labeled with the index of the closest one.
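The clustering and quantization steps described above can be sketched with plain NumPy. This is a minimal k-means, assuming the MFCC frames are already stacked into an (n_frames, n_coeffs) array; the frame count, codebook size, and the random data in the demo are purely illustrative:

```python
import numpy as np

def kmeans(vectors, k, iters=50, seed=0):
    """Plain k-means: returns a (k, dim) array of centroids."""
    rng = np.random.default_rng(seed)
    # initialize centroids from k distinct random frames
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        # assign each vector to its nearest centroid (Euclidean distance)
        dists = np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = vectors[labels == j]
            if len(members):               # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def quantize(vectors, centroids):
    """Vector quantization: map each MFCC frame to its closest centroid's index."""
    dists = np.linalg.norm(vectors[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)            # one discrete symbol per frame

# toy demo with random stand-ins for MFCC frames (13 coefficients each)
frames = np.random.default_rng(1).normal(size=(200, 13))
codebook = kmeans(frames, k=8)
symbols = quantize(frames, codebook)       # discrete observation sequence for the HMM
```

The `symbols` array is the discrete observation sequence that a discrete HMM would consume.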
The centroids then become the 'observable symbols' of the HMM. I feed training words to the training algorithm and create one HMM model per word. Given an audio query, I compare it against all the models and answer with the word whose model gives the highest probability.
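The scoring step just described can be sketched as follows: a log-space forward algorithm computes log P(obs | model) for each word's discrete HMM, and the word with the highest likelihood wins. The two-word vocabulary and all probability values below are toy illustrations, not trained models:

```python
import numpy as np

def logsumexp(x, axis=None):
    m = np.max(x, axis=axis, keepdims=True)
    return np.log(np.sum(np.exp(x - m), axis=axis)) + np.squeeze(m, axis=axis)

def forward_log_likelihood(obs, pi, A, B):
    """log P(obs | model) via the forward algorithm in log space.
    obs: symbol indices; pi: (S,) initial probs; A: (S, S) transitions;
    B: (S, K) emission probs. Assumes nonzero probabilities (floor zeros first)."""
    logA, logB = np.log(A), np.log(B)
    alpha = np.log(pi) + logB[:, obs[0]]
    for t in range(1, len(obs)):
        # sum over previous states in log space, then emit the next symbol
        alpha = logsumexp(alpha[:, None] + logA, axis=0) + logB[:, obs[t]]
    return float(logsumexp(alpha))

def classify(obs, word_models):
    """Score obs against every word's HMM and return the best-scoring word."""
    return max(word_models, key=lambda w: forward_log_likelihood(obs, *word_models[w]))

# toy demo: two 2-state models over a 2-symbol codebook
pi = np.array([0.5, 0.5])
A = np.array([[0.8, 0.2], [0.2, 0.8]])
models = {
    "yes": (pi, A, np.array([[0.9, 0.1], [0.9, 0.1]])),  # mostly emits symbol 0
    "no":  (pi, A, np.array([[0.1, 0.9], [0.1, 0.9]])),  # mostly emits symbol 1
}
obs = np.array([0, 0, 0, 0])
print(classify(obs, models))   # -> yes
```

Note that the forward algorithm accepts observation sequences of any length, which is relevant to the variable-word-length question below.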
First of all, is this procedure correct? Second, how do I deal with words of different lengths? I mean, if I have trained on words of 500 ms and 300 ms, how many observable symbols do I feed in when comparing against all the models?
Note: I don't want to use Sphinx, the Android API, the Microsoft API, or any other library.
Note 2: I would appreciate any pointers to more recent, better techniques.
Answer
First of all, is this procedure correct?
The vector quantization part is OK, but it is rarely used these days. What you describe is a so-called discrete HMM, which nobody uses for speech anymore. If you use continuous HMMs with GMMs as the emission probability distributions, you do not need vector quantization at all.
You have also focused on the less important steps, such as MFCC extraction, while skipping the most important parts: HMM training with Baum-Welch and HMM decoding with Viterbi. These are far more complex parts of the training than the initial estimation of the states with vector quantization.
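As a sketch of the decoding side mentioned here, the following is a minimal Viterbi implementation for a discrete HMM in log space. The two-state model at the bottom is a toy illustration, not a speech model:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely state path for a discrete HMM, computed in log space.
    obs: symbol indices; pi: (S,) initial probs; A: (S, S) transitions;
    B: (S, K) emissions. Assumes nonzero probabilities. Returns (path, log_prob)."""
    S, T = len(pi), len(obs)
    logA, logB = np.log(A), np.log(B)
    delta = np.log(pi) + logB[:, obs[0]]    # best score ending in each state
    psi = np.zeros((T, S), dtype=int)       # backpointers
    for t in range(1, T):
        scores = delta[:, None] + logA      # scores[i, j]: come from i, land in j
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):           # follow backpointers from the end
        path.append(int(psi[t][path[-1]]))
    return path[::-1], float(delta.max())

# toy demo: state 0 mostly emits symbol 0, state 1 mostly emits symbol 1
pi = np.array([0.9, 0.1])
A = np.array([[0.7, 0.3], [0.3, 0.7]])
B = np.array([[0.95, 0.05], [0.05, 0.95]])
path, logp = viterbi([0, 0, 1, 1], pi, A, B)
print(path)   # -> [0, 0, 1, 1]
```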
Then, how do I deal with words of different lengths? I mean, if I have trained on words of 500 ms and 300 ms, how many observable symbols do I feed in when comparing against all the models?
When decoding speech, you usually select states that correspond to the sub-phoneme units perceived by humans. It is traditional to use 3 states per phoneme. For example, the word "one" (3 phonemes) should have 9 states, and the word "seven" (5 phonemes) should have 15 states. This practice has proven effective, though you can vary the estimate slightly. The length of the observation sequence itself does not need to be fixed: the forward algorithm yields a likelihood for a sequence of any length, so a 500 ms word simply produces more frames than a 300 ms one.
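This state-allocation rule can be sketched as a left-to-right (Bakis) topology, the usual structure for word models: each state either loops on itself or advances to the next, which is also how a single model absorbs 500 ms and 300 ms utterances of the same word. The 0.5 probabilities below are placeholders that Baum-Welch training would re-estimate:

```python
import numpy as np

def left_to_right_hmm(n_phonemes, states_per_phoneme=3):
    """Build a left-to-right (Bakis) transition matrix with 3 states
    per phoneme: each state may only stay put or advance by one."""
    n = n_phonemes * states_per_phoneme
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i] = 0.5        # self-loop absorbs variable word duration
        A[i, i + 1] = 0.5    # advance to the next sub-phoneme state
    A[n - 1, n - 1] = 1.0    # final state
    return A

A_one = left_to_right_hmm(3)      # "one": 3 phonemes -> 9 states
A_seven = left_to_right_hmm(5)    # "seven": 5 phonemes -> 15 states
```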