Speech recognition for audio files


Problem description

Hi all,
I am working on a project in which I need to transcribe phone recordings to text, at a volume of about 200 audio files per day. I tried the desktop Microsoft Speech SDK (System.Speech.Recognition) with a DictationGrammar:

speechRecognizer.LoadGrammar(new DictationGrammar());


but the accuracy of the result is very poor.
I thought it might be better to use the server version (Microsoft.Speech.Recognition), for which I have to build a grammar. But how do I build a free-speech grammar that covers more than 10,000 words?
Please help!
Thanks

What I have tried:

Microsoft Speech recognition SDK for Desktop.
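
For readers who want to reproduce the setup, the desktop approach described in the question might look roughly like the sketch below. It is not the poster's actual code: the class name, the `en-US` culture, and the command-line file path are assumptions made for illustration. It also assumes a Windows machine with a reference to the System.Speech assembly, and that `Recognize()` returning `null` marks the end of usable input (a long silence in the recording may end the loop early).

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;
using System.Text;

class DictationFromWaveFile
{
    static void Main(string[] args)
    {
        // Hypothetical input: path to one recorded WAV file.
        string wavPath = args[0];

        // Assumption: an en-US recognizer is installed on this machine.
        using (var engine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Same call as in the question: free-text dictation grammar.
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(wavPath);

            var transcript = new StringBuilder();
            RecognitionResult result;

            // Each Recognize() call returns one recognized phrase;
            // null is treated here as "nothing more to recognize".
            while ((result = engine.Recognize()) != null)
            {
                transcript.Append(result.Text).Append(' ');
            }

            Console.WriteLine(transcript.ToString().Trim());
        }
    }
}
```

Checking `result.Confidence` for each phrase is one way to see how unreliable the dictation output is on a given recording.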

Solution

You cannot fix the engine. Speech recognition does work, but its quality is a spectacular failure on Microsoft's part.
It can work with a minimal-size grammar that contains no phrases sounding even remotely similar, for a user with fairly good pronunciation. Add more phrases, and the engine will mix up most of them.
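
To illustrate the small-grammar case described above, here is a minimal sketch using System.Speech. The three phrases and the class name are hypothetical, chosen only because they sound clearly different from one another. The sketch assumes a Windows machine with a default recognizer installed and a WAV file path passed on the command line.

```csharp
using System;
using System.Speech.Recognition;

class SmallGrammarDemo
{
    static void Main(string[] args)
    {
        // Hypothetical phrase list: short and acoustically distinct.
        var phrases = new Choices("start recording", "delete message", "call operator");
        var grammar = new Grammar(new GrammarBuilder(phrases));

        // Assumption: the system's default recognizer is used.
        using (var engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(grammar);
            engine.SetInputToWaveFile(args[0]);

            // Recognize() returns null when nothing in the grammar matched.
            RecognitionResult result = engine.Recognize();
            Console.WriteLine(result == null
                ? "(no match)"
                : result.Text + " (confidence " + result.Confidence + ")");
        }
    }
}
```

A grammar like this restricts the engine to a handful of alternatives, which is the situation the answer says it can handle. It does not help with free-form transcription of phone calls.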

Dictation? Just forget it.

One reason I call this a "spectacular failure" is that on Android I can do a lot of dictation in different languages (there are freely available dictation keyboards), and the recognition quality makes dictation a practical competitor to the smallish Android keyboards. Were it not for the comparison with this software, I would probably call the engine supplied by Microsoft "an achievement", because one really could use it for some minimalistic voice interfaces.

—SA

