Raspberry Pi Asynchronous/Continuous Speech Recognition in Python


Question

I want to write a speech recognition script in Python for the Raspberry Pi, and I need a library that supports asynchronous/continuous speech recognition. By asynchronous I mean that recognition should run endlessly, with no keyboard input, until the spoken phrase matches one of an array of words; the script should then print the phrase to the terminal and restart recognition. I have already looked at PocketSphinx, but after a few hours of Googling I found nothing about asynchronous recognition with it.

Do you know of any library that can do this?

Recommended answer

You can use Pocketsphinx on the Raspberry Pi. You need to download the latest version, 5prealpha.

It can listen for multiple keyphrases. The code should look something like this:

import os

import pyaudio
from pocketsphinx import Decoder

modeldir = "../../../model"

# Create a decoder with the acoustic model, dictionary, and keyphrase list
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
config.set_string('-kws', 'keyphrase.list')

# Open the microphone: 16 kHz, mono, 16-bit samples
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

# Process audio chunk by chunk. When a keyphrase is detected, act on it and restart the search
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    hyp = decoder.hyp()
    if hyp is not None:
        # hyp.hypstr holds the text of the detected keyphrase
        print("Detected keyword", hyp.hypstr, "restarting search")
        decoder.end_utt()
        decoder.start_utt()
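
To try the restart logic without a microphone, the loop above could be factored into a generator. This is only a sketch: `detect_keyphrases` is a hypothetical helper name, and it assumes the decoder object exposes `start_utt`, `process_raw`, `hyp`, and `end_utt` as used above, with `hyp()` returning an object that has a `hypstr` attribute.

```python
def detect_keyphrases(decoder, chunks):
    """Yield the text of each detected keyphrase.

    `decoder` is assumed to behave like the Decoder used above;
    `chunks` is any iterable of raw audio buffers (hypothetical input).
    """
    decoder.start_utt()
    for buf in chunks:
        decoder.process_raw(buf, False, False)
        hyp = decoder.hyp()
        if hyp is not None:
            yield hyp.hypstr
            # Restart the search so the next keyphrase can be detected
            decoder.end_utt()
            decoder.start_utt()
    # Close the last utterance once the audio source is exhausted
    decoder.end_utt()
```

With a live microphone, `chunks` could be something like `iter(lambda: stream.read(1024), b"")`, and each yielded string can then be compared against your own array of words.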

The keyphrase.list file should look like this, one phrase per line, each with a threshold:

open the door /1e-40/
close the door /1e-40/
how are you /1e-30/

Thresholds must be tuned for every keyphrase to balance false alarms against missed detections.
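
Since tuning usually means editing the file many times, it may help to generate it from a dictionary. The sketch below is an illustration, not part of Pocketsphinx: `format_keyphrase_list` and `write_keyphrase_list` are hypothetical helpers, and thresholds are stored as integer exponents (for example `-40` for `1e-40`) purely to keep the output format predictable.

```python
def format_keyphrase_list(exponents):
    """Return keyphrase.list contents, one 'phrase /1eN/' entry per line.

    `exponents` maps each phrase to the integer exponent of its threshold
    (an assumption of this sketch, e.g. -40 means a threshold of 1e-40).
    """
    return "".join(
        "{} /1e{}/\n".format(phrase, exponent)
        for phrase, exponent in exponents.items()
    )


def write_keyphrase_list(path, exponents):
    """Write the formatted keyphrase list to `path`."""
    with open(path, "w") as f:
        f.write(format_keyphrase_list(exponents))
```

For example, `write_keyphrase_list("keyphrase.list", {"open the door": -40, "close the door": -40, "how are you": -30})` would reproduce the file shown above; adjusting one exponent and rerunning the recognizer is then a one-line change.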
