Simple speech recognition methods

Views: 94 Published: 2020/5/17 19:15:48 Tags: neural-network speech-recognition hidden-markov-models


Problem description


Yes, I'm aware that speech recognition is fairly complicated (to put it mildly). What I'm looking for is a method for distinguishing between maybe 20-30 phrases. An ability to split words (discrete speech is fine) would be nice, but isn't required. The software will be user-dependent (i.e. for use by me). I'm not looking for existing software, but for a good way of going about doing this myself. I've looked into various existing methods, and it seems like splitting the sound into phonemes, while common, is somewhat excessive for my needs.


For some context, I'm just looking for a way to control some aspects of my computer with a few simple voice commands. I'm aware that Windows already has speech recognition software, but I'd like to go about this one myself as a learning exercise. Commands would be simple, like "Open Google" or "Mute". What I had in mind (not sure if this is a good idea) is that some commands would be compound. So "Mute" would just be "Mute", whereas the "Open" command could be recognized individually and then have its suffixes (Google, Photoshop, etc.) recognized with another network/model/whatever. But I'm not sure if looking for prefixes/word breaks in this way would produce better results than having to deal with an increased number of individual commands.
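To make the compound-command idea concrete, here is a minimal sketch of the two-stage dispatch described above. It assumes the audio has already been split into one segment per word (discrete speech), and that each recognizer is simply a function from a segment to a label. The names `Recognizer`, `recognize_command`, `command_model`, and `argument_models` are hypothetical and only illustrate the structure; the recognizers themselves could be any model.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical type: a recognizer maps one audio segment (here just a
# sequence of samples or features) to a label from its own small vocabulary.
Recognizer = Callable[[Sequence[float]], str]


def recognize_command(
    segments: Sequence[Sequence[float]],
    command_model: Recognizer,
    argument_models: Dict[str, Recognizer],
) -> Tuple[str, Optional[str]]:
    """Recognize the first segment as a command word and, if that command
    is compound, the second segment as its argument."""
    if not segments:
        raise ValueError("no speech segments supplied")

    # Stage 1: a small model that only knows the command words.
    command = command_model(segments[0])

    # Stage 2: only compound commands have a dedicated argument model.
    argument_model = argument_models.get(command)
    if argument_model is None or len(segments) < 2:
        return command, None
    return command, argument_model(segments[1])
```

With this layout, "Mute" is handled entirely by the first model, while "Open" hands its second segment to a model trained only on the possible targets, so each model chooses among a few labels rather than every full phrase. Whether that is more accurate than one flat model over all phrases is exactly the open question and would need to be measured.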


I've been looking into perceptrons, Hopfield networks (though they're somewhat obsolete from what I understand) and HMMs, and while I understand the ideas behind these (I've implemented ANNs before), I don't really know which is best suited to this task. I'm assuming that linear vector quantization models would also be appropriate, but I can't really find much literature to this end. Any guidance/resources would be greatly appreciated.
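For reference, here is a small sketch of what a vector quantization classifier could look like for this task. It reads "linear vector quantization" as learning vector quantization (LVQ), which is an assumption, and it further assumes each utterance has already been reduced to a fixed-length feature vector (the question does not say how). The function names and the toy data are hypothetical; this is the basic LVQ1 update rule, not a tuned recognizer.

```python
import random
from typing import List, Sequence, Tuple

Labeled = Tuple[List[float], str]  # (feature vector, phrase label)


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(prototypes: Sequence[Labeled], x: Sequence[float]) -> int:
    """Index of the prototype closest to x."""
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(prototypes[i][0], x))


def train_lvq1(samples: Sequence[Labeled], prototypes: Sequence[Labeled],
               learning_rate: float = 0.1, epochs: int = 20,
               seed: int = 0) -> List[Labeled]:
    """LVQ1: pull the winning prototype toward a sample of the same class,
    push it away from a sample of a different class."""
    rng = random.Random(seed)
    protos = [(list(vec), label) for vec, label in prototypes]
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, label in order:
            vec, proto_label = protos[nearest(protos, x)]
            sign = 1.0 if proto_label == label else -1.0
            for d in range(len(vec)):
                vec[d] += sign * learning_rate * (x[d] - vec[d])
    return protos


def classify(prototypes: Sequence[Labeled], x: Sequence[float]) -> str:
    return prototypes[nearest(prototypes, x)][1]


if __name__ == "__main__":
    # Toy two-dimensional "features"; real input would be acoustic features.
    data = [([0.0, 0.1], "mute"), ([0.1, 0.0], "mute"),
            ([1.0, 0.9], "open"), ([0.9, 1.0], "open")]
    start = [([0.2, 0.2], "mute"), ([0.8, 0.8], "open")]
    model = train_lvq1(data, start)
    print(classify(model, [0.05, 0.05]), classify(model, [0.95, 0.95]))
```

With 20-30 phrases and a single speaker, one or a few prototypes per phrase keeps the model tiny, but the hard part (turning variable-length speech into comparable fixed-length vectors) is outside this sketch.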
