Implementing "user stopped speaking" notification for `SFSpeechRecognizer`


Question

I'm attempting to solve this problem: SFSpeechRecognizer - detect end of utterance

The problem is that the SFSpeechRecognizer callback fires every time the detected speech string changes, but it only delivers a final result (with the isFinal flag set) after 60 seconds of silence.

The suggested technique is to start a 2-second timer each time the callback fires, first invalidating the timer if it is already set.

I have implemented this technique. However, my timer callback is never getting hit.

Can anyone tell me why?

import Foundation
import Speech

@objc
public class Dictation : NSObject, SFSpeechRecognizerDelegate
{
    @objc static let notification_finalText = Notification.Name("speech_gotFinalText")
    @objc static let notification_interimText = Notification.Name("speech_textDidChange")

    private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-UK"))!

    var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?

    private var recognitionTask: SFSpeechRecognitionTask?

    let audioEngine = AVAudioEngine()

    @objc var text_tmp   : String? = ""
    @objc var text_final : String? = ""

    var timer : Timer?

    override init()
    {
        super.init()

        speechRecognizer.delegate = self

        SFSpeechRecognizer.requestAuthorization { authStatus in
            if authStatus != .authorized {
                exit(0)
            }
        }
    }

    // - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    @objc
    func tryStartRecording()
    {
        try! startRecording()
    }

    // - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    func startRecording() throws
    {
        text_final = ""

        // Cancel the previous task if it's running.
        if let recognitionTask = recognitionTask {
            recognitionTask.cancel()
            self.recognitionTask = nil
        }

        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()

        let inputNode = audioEngine.inputNode
        /*
         ^ causes:
         [plugin] AddInstanceForFactory: No factory registered for id <CFUUID 0x600000247200> F8BB1C28-BAE8-11D6-9C31-00039315CD46
         HALC_ShellDriverPlugIn::Open: Can't get a pointer to the Open routine
         HALC_ShellDriverPlugIn::Open: Can't get a pointer to the Open routine
         */

        if inputNode.inputFormat(forBus: 0).sampleRate == 0 {
            fatalError("Audio engine has no input node")
        }

        guard let recognitionRequest = recognitionRequest else {
            fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
        }

        // Configure request so that results are returned before audio recording is finished
        recognitionRequest.shouldReportPartialResults = true

        // A recognition task represents a speech recognition session.
        // We keep a reference to the task so that it can be cancelled.
        recognitionTask = speechRecognizer.recognitionTask( with: recognitionRequest )
        { result, error in
            self.timer?.invalidate()
            print( "New Timer" )
            self.timer = Timer(timeInterval:2.0, repeats:false) { _ in

                print( "*** Timer Callback -- NEVER HITS! ***" )

                self.timer?.invalidate()
                self.text_final = result!.bestTranscription.formattedString

                NotificationCenter.default.post( name: Dictation.notification_finalText,  object: nil )

                self.stopRecording()
            }

            var isFinal = false

            if let result = result {
                isFinal = result.isFinal

                if isFinal {
                    self.text_final = result.bestTranscription.formattedString
                } else {
                    self.text_tmp = result.bestTranscription.formattedString
                }

                let notification = isFinal ? Dictation.notification_finalText : Dictation.notification_interimText

                NotificationCenter.default.post( name: notification,  object: nil )
            }

            if error != nil  ||  isFinal {
                self.audioEngine.stop()
                inputNode.removeTap( onBus: 0 )

                self.recognitionRequest = nil
                self.recognitionTask = nil
            }
        }

        let recordingFormat = inputNode.outputFormat(forBus: 0)

        inputNode.installTap( onBus: 0,  bufferSize: 1024,  format: recordingFormat )
        { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
            self.recognitionRequest?.append( buffer )
        }

        audioEngine.prepare()

        try audioEngine.start()

        print( self.audioEngine.description )
    }

    // - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    @objc
    func stopRecording()
    {
        audioEngine.stop()
        recognitionRequest?.endAudio()
    }
}

Links:
- SFSpeechRecognizer - detect end of utterance

Recommended answer

It's because you create the timer but never schedule it on a run loop, so it never starts:

self.timer = Timer(timeInterval:2.0, repeats:false)

Instead, say:

self.timer = Timer.scheduledTimer( ...
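To make the fix concrete, here is a minimal sketch of the restart-on-every-callback timer as a small standalone helper. The class name `SilenceDetector` and its methods `touch()` and `cancel()` are hypothetical names chosen for illustration; they are not part of the Speech framework or of the question's code. The hop to the main queue is a precaution, assuming the callback could arrive on a thread whose run loop is not running; `Timer.scheduledTimer` attaches the timer to the current thread's run loop.

```swift
import Foundation

// Hypothetical helper (not part of the Speech framework): restarts a
// countdown on every call to touch() and fires once the calls stop.
final class SilenceDetector {
    private var timer: Timer?
    private let interval: TimeInterval
    private let onSilence: () -> Void

    init(interval: TimeInterval, onSilence: @escaping () -> Void) {
        self.interval = interval
        self.onSilence = onSilence
    }

    // Call this each time the recognition callback delivers a result.
    func touch() {
        // Schedule on the main run loop, which is running in a normal app.
        DispatchQueue.main.async { [weak self] in
            guard let self = self else { return }
            // Drop any pending countdown before starting a new one.
            self.timer?.invalidate()
            // scheduledTimer creates the timer AND adds it to the run loop;
            // the plain Timer(timeInterval:repeats:block:) initializer does not.
            self.timer = Timer.scheduledTimer(withTimeInterval: self.interval,
                                              repeats: false) { [weak self] _ in
                self?.timer = nil
                self?.onSilence()
            }
        }
    }

    // Stops a pending countdown without firing the handler.
    func cancel() {
        DispatchQueue.main.async { [weak self] in
            self?.timer?.invalidate()
            self?.timer = nil
        }
    }
}
```

In the question's code, one such object could be created with an interval of 2.0 and `touch()` called at the top of the recognition callback, with the handler posting the final-text notification and calling `stopRecording()`. If you prefer to keep the `Timer(timeInterval:repeats:block:)` initializer, the equivalent is to add the timer yourself with `RunLoop.main.add(timer, forMode: .common)`.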
