How to implement speech-to-text via the Speech framework

Question

I want to do speech recognition in my Objective-C app using the iOS Speech framework.

I found some Swift examples but haven't been able to find anything in Objective-C.

Is it possible to access this framework from Objective-C? If so, how?

Answer

After spending enough time looking for Objective-C samples (even in the Apple documentation) I couldn't find anything decent, so I figured it out myself.

/*!
 * Header file (.h): import the Speech framework, adopt the
 * SFSpeechRecognizerDelegate protocol and declare the variables.
 */

#import <Speech/Speech.h>

@interface ViewController : UIViewController <SFSpeechRecognizerDelegate> {
    SFSpeechRecognizer *speechRecognizer;
    SFSpeechAudioBufferRecognitionRequest *recognitionRequest;
    SFSpeechRecognitionTask *recognitionTask;
    AVAudioEngine *audioEngine;
}

Implementation file (.m)

- (void)viewDidLoad {
    [super viewDidLoad];

    // Initialize the speech recognizer with a locale identifier such as en_US;
    // the full list of supported locales is available via [SFSpeechRecognizer supportedLocales]
    speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:[[NSLocale alloc] initWithLocaleIdentifier:@"en_US"]];

    // Set speech recognizer delegate
    speechRecognizer.delegate = self;

    // Request authorization so the user is prompted for permission and you can
    // get an authorized response. Also remember to add the usage-description
    // keys to the Info.plist file (see the instructions and notes below).
    [SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
        switch (status) {
            case SFSpeechRecognizerAuthorizationStatusAuthorized:
                NSLog(@"Authorized");
                break;
            case SFSpeechRecognizerAuthorizationStatusDenied:
                NSLog(@"Denied");
                break;
            case SFSpeechRecognizerAuthorizationStatusNotDetermined:
                NSLog(@"Not Determined");
                break;
            case SFSpeechRecognizerAuthorizationStatusRestricted:
                NSLog(@"Restricted");
                break;
            default:
                break;
        }
    }];

}

/*!
 * @brief Starts listening and recognizing user input through the 
 * phone's microphone
 */

- (void)startListening {

    // Initialize the AVAudioEngine
    audioEngine = [[AVAudioEngine alloc] init];

    // Make sure there's not a recognition task already running
    if (recognitionTask) {
        [recognitionTask cancel];
        recognitionTask = nil;
    }

    // Starts an AVAudio Session
    NSError *error;
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setCategory:AVAudioSessionCategoryRecord error:&error];
    [audioSession setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:&error];

    // Starts a recognition process, in the block it logs the input or stops the audio
    // process if there's an error.
    recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];
    AVAudioInputNode *inputNode = audioEngine.inputNode;
    recognitionRequest.shouldReportPartialResults = YES;
    recognitionTask = [speechRecognizer recognitionTaskWithRequest:recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error) {
        BOOL isFinal = NO;
        if (result) {
            // Whatever you say into the microphone after pressing the button
            // is logged to the console.
            NSLog(@"RESULT:%@", result.bestTranscription.formattedString);
            isFinal = result.isFinal;
        }
        if (error || isFinal) {
            [audioEngine stop];
            [inputNode removeTapOnBus:0];
            recognitionRequest = nil;
            recognitionTask = nil;
        }
    }];

    // Sets the recording format
    AVAudioFormat *recordingFormat = [inputNode outputFormatForBus:0];
    [inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
        [recognitionRequest appendAudioPCMBuffer:buffer];
    }];

    // Starts the audio engine, i.e. it starts listening.
    [audioEngine prepare];
    if (![audioEngine startAndReturnError:&error]) {
        NSLog(@"audioEngine couldn't start because of an error: %@", error);
        return;
    }
    NSLog(@"Say Something, I'm listening");
}

- (IBAction)microPhoneTapped:(id)sender {
    if (audioEngine.isRunning) {
        [audioEngine stop];
        [recognitionRequest endAudio];
    } else {
        [self startListening];
    }
}

Now, implement the SFSpeechRecognizerDelegate method to check whether the speech recognizer is available.

#pragma mark - SFSpeechRecognizerDelegate Delegate Methods

- (void)speechRecognizer:(SFSpeechRecognizer *)speechRecognizer availabilityDidChange:(BOOL)available {
    NSLog(@"Availability:%d",available);
}

Instructions & Notes

Remember to modify the .plist file to get the user's authorization for speech recognition and microphone access. The string values must of course be customized to your needs. You can do this by creating and modifying the entries in the Property List editor, or by right-clicking the .plist file, choosing Open As -> Source Code, and pasting the following lines before the closing </dict> tag.

<key>NSMicrophoneUsageDescription</key>
<string>This app uses your microphone to record what you say, so watch what you say!</string>

<key>NSSpeechRecognitionUsageDescription</key>
<string>This app uses speech recognition to transform your spoken words into text and then analyze them, so watch what you say!</string>

Also remember that in order to import the Speech framework into the project, the app needs to target iOS 10.0+.
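If the app also has to run on earlier deployment targets, the framework's availability can be checked at run time instead. A minimal sketch, assuming Xcode 9+ for the @available syntax (the method name here is just an illustrative placeholder):

```
- (void)setupSpeechRecognizerIfAvailable {
    if (@available(iOS 10.0, *)) {
        // Safe to use the Speech framework classes on this OS version.
        speechRecognizer = [[SFSpeechRecognizer alloc] initWithLocale:
                            [[NSLocale alloc] initWithLocaleIdentifier:@"en_US"]];
        speechRecognizer.delegate = self;
    } else {
        // Speech framework not present; disable the feature gracefully.
        NSLog(@"Speech recognition requires iOS 10.0 or later");
    }
}
```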

To get this running and test it you just need a very basic UI: create a UIButton and assign the microPhoneTapped action to it. When pressed, the app should start listening and log everything it hears through the microphone to the console (in the sample code, NSLog is the only thing receiving the text). Pressing the button again should stop the recording.
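If you prefer to wire the button up in code rather than in Interface Builder, a minimal sketch (the frame and title are arbitrary placeholders, to be placed in viewDidLoad):

```
// Create a system button and hook it to the microPhoneTapped: action above.
UIButton *micButton = [UIButton buttonWithType:UIButtonTypeSystem];
micButton.frame = CGRectMake(100, 200, 160, 44);
[micButton setTitle:@"Microphone" forState:UIControlStateNormal];
[micButton addTarget:self
              action:@selector(microPhoneTapped:)
    forControlEvents:UIControlEventTouchUpInside];
[self.view addSubview:micButton];
```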

I created a GitHub repo with a sample project, enjoy!
