Creating custom voice commands (GNU/Linux)


Question

I'm looking for advice on a personal project.

I'm trying to write software for creating custom voice commands. The goal is to let the user (me) record a short audio clip (2 to 3 seconds) to define a command or macro. Later, when the user speaks the same phrase again, the command or macro is executed. The software must be able to detect a command in less than 1 second of processing time on a low-cost computer (a Raspberry Pi, for example).

I have already looked into two approaches:
- Speech recognition (CMU Sphinx, Julius, Simon): there are good open-source solutions, but they often need large database files, and full speech recognition is not really what I'm trying to do. It could consume too much processing power for such a small feature.
- Audio fingerprinting (Chromaprint, http://acoustid.org/chromaprint): this seems to be almost what I'm looking for. The principle is to create a fingerprint from raw audio data, then compare fingerprints to determine whether they match. However, this kind of library seems to be designed for song identification (like the well-known smartphone apps). I'm trying to configure a good "comparator", but I think I'm going about it the wrong way.

Do you know of any dedicated software or piece of code that does something similar?

Any suggestions would be appreciated.

Recommended answer

Song fingerprinting is not a good fit for this task, because the timing of a spoken command can vary and fingerprinting expects an exact time match. However, it is quite easy to implement matching with the DTW (dynamic time warping) algorithm for time series, using features extracted with Sphinxbase, the CMUSphinx support library. See the Wikipedia entry on DTW for details.

http://en.wikipedia.org/wiki/Dynamic_time_warping

http://cmusphinx.sourceforge.net/wiki/download
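
As a rough illustration of the DTW matching suggested above, here is a minimal Python sketch. It assumes each recording has already been converted into a sequence of feature frames (for instance MFCC vectors from Sphinxbase; that extraction step is not shown). The names `frame_distance`, `dtw_distance`, `match_command`, the toy one-dimensional features, and the `threshold` value are all hypothetical and chosen only for this example.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Length-normalized DTW cost between two sequences of feature frames."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")
    inf = float("inf")
    # prev[j]: best cumulative cost of aligning the frames of seq_a seen so far
    # with the first j frames of seq_b. Only two rows are kept to save memory.
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Allowed steps: advance in seq_a only, in seq_b only, or in both.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m] / (n + m)


def match_command(templates, utterance, threshold):
    """Return the name of the closest template, or None if none is close enough."""
    best_name, best_score = None, float("inf")
    for name, template in templates.items():
        score = dtw_distance(template, utterance)
        if score < best_score:
            best_name, best_score = name, score
    return best_name if best_score <= threshold else None


# Toy 1-D "features" standing in for real per-frame feature vectors (assumption).
templates = {
    "lights_on": [[0.0], [1.0], [2.0], [1.0], [0.0]],
    "lights_off": [[2.0], [2.0], [0.0], [0.0], [2.0]],
}
# Same contour as "lights_on", but spoken more slowly (some frames repeated).
utterance = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [1.0], [0.0]]
print(match_command(templates, utterance, threshold=0.5))
```

Because DTW lets one frame align with several frames of the other sequence, the slower utterance still aligns with its template at zero cost, which is the tolerance to timing that fingerprint comparison lacks. The cost is proportional to the product of the two sequence lengths, so a few seconds of audio at a typical frame rate stays small; a real threshold would have to be tuned on actual recordings.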
