SFSpeechRecognizer - detect end of utterance
Question
I am hacking on a small project using the speech recognition built into iOS 10. I have it working with the device's microphone, and my speech is recognized very accurately.
My problem is that the recognition task callback is called for every available partial transcription. I want it to detect that the person has stopped talking and then call the callback with the isFinal property set to true. That is not happening: the app keeps listening indefinitely.
Is SFSpeechRecognizer capable of detecting the end of a sentence at all?
Here's my code. It is based on an example found on the Internet and is mostly the boilerplate needed to recognize speech from a microphone source.
I modified it by adding the recognition taskHint. I also set shouldReportPartialResults to false, but it seems to have been ignored.
func startRecording() {
    if recognitionTask != nil {
        recognitionTask?.cancel()
        recognitionTask = nil
    }

    let audioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setCategory(AVAudioSessionCategoryRecord)
        try audioSession.setMode(AVAudioSessionModeMeasurement)
        try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
    } catch {
        print("audioSession properties weren't set because of an error.")
    }

    recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
    recognitionRequest?.shouldReportPartialResults = false
    recognitionRequest?.taskHint = .search

    guard let inputNode = audioEngine.inputNode else {
        fatalError("Audio engine has no input node")
    }

    guard let recognitionRequest = recognitionRequest else {
        fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
    }

    recognitionRequest.shouldReportPartialResults = true

    recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in
        var isFinal = false

        if result != nil {
            print("RECOGNIZED \(result?.bestTranscription.formattedString)")
            self.transcriptLabel.text = result?.bestTranscription.formattedString
            isFinal = (result?.isFinal)!
        }

        if error != nil || isFinal {
            self.state = .Idle

            self.audioEngine.stop()
            inputNode.removeTap(onBus: 0)

            self.recognitionRequest = nil
            self.recognitionTask = nil

            self.micButton.isEnabled = true

            self.say(text: "OK. Let me see.")
        }
    })

    let recordingFormat = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
        self.recognitionRequest?.append(buffer)
    }

    audioEngine.prepare()

    do {
        try audioEngine.start()
    } catch {
        print("audioEngine couldn't start because of an error.")
    }

    transcriptLabel.text = "Say something, I'm listening!"

    state = .Listening
}
Recommended answer
It seems that the isFinal flag doesn't become true when the user stops talking, as you might expect it to. I guess this is intended behaviour on Apple's part, because "the user stopped talking" is not a well-defined event.
I believe the easiest way to achieve your goal is the following:
- Establish an "interval of silence". If the user does not talk for longer than this interval (e.g. 2 seconds), treat them as having stopped talking.
Create a Timer at the beginning of the audio session:
// didFinishTalk must be visible to Objective-C (e.g. marked @objc)
var timer = Timer.scheduledTimer(timeInterval: 2, target: self, selector: #selector(didFinishTalk), userInfo: nil, repeats: false)
- When you get a new transcription in the recognitionTask callback, invalidate and restart your timer:
timer.invalidate()
timer = Timer.scheduledTimer(timeInterval: 2, target: self, selector: #selector(didFinishTalk), userInfo: nil, repeats: false)
If the timer expires, the user has not spoken for 2 seconds. You can safely stop the audio session and exit.
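The steps above can be collected into one small helper. This is only a minimal sketch, not part of the original answer: the SilenceTimer class, its reset() and cancel() methods, and the onSilence closure are hypothetical names introduced here for illustration. It uses the block-based Timer API (available from iOS 10) instead of a selector, and it assumes reset() is called on a thread with a running run loop, such as the main thread.

```swift
import Foundation

/// Hypothetical helper (not from the original answer): calls `onSilence`
/// once when `reset()` has not been called again within `interval` seconds.
final class SilenceTimer {
    private let interval: TimeInterval
    private let onSilence: () -> Void
    private var timer: Timer?

    init(interval: TimeInterval, onSilence: @escaping () -> Void) {
        self.interval = interval
        self.onSilence = onSilence
    }

    /// Call when listening starts and again on every new transcription.
    /// Invalidates the pending timer and schedules a fresh one.
    func reset() {
        timer?.invalidate()
        timer = Timer.scheduledTimer(withTimeInterval: interval, repeats: false) { [weak self] _ in
            self?.timer = nil
            self?.onSilence()
        }
    }

    /// Call when recognition ends for any other reason (error, final result).
    func cancel() {
        timer?.invalidate()
        timer = nil
    }
}
```

To wire this into the question's code, one option is to call reset() right after audioEngine.start() and again inside the result handler whenever result is non-nil, and to call cancel() in the branch that handles an error or a final result. Inside onSilence, stopping the engine and calling recognitionRequest?.endAudio() tells the recognizer that no more audio is coming, after which the handler should be invoked with isFinal set to true. The 2-second interval is the answer's suggestion; shorter values feel snappier but risk cutting off slow speakers.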