How can I improve Watson Speech to Text accuracy?

Question

I understand that Watson Speech to Text is somewhat calibrated for colloquial conversation and for one or two speakers. I also know that it can deal with FLAC better than WAV and OGG.

I would like to know how I could improve the algorithm's recognition, acoustically speaking.

I mean, does increasing volume help? Maybe using some compression filter? Noise reduction?

What kind of preprocessing could help with this service?

Recommended answer

The best way to improve on the accuracy of the base models (which are very accurate but also very general) is to use the Watson STT customization service: https://www.ibm.com/watson/developercloud/doc/speech-to-text/custom.html. It lets you create a custom model tailored to the specifics of your domain. If your domain is not well matched to what the base model captures, you can expect a great boost in recognition accuracy.
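As a rough illustration of the customization workflow, here is a minimal Python sketch that only builds the sequence of REST calls (create a custom model, add a corpus, train, then recognize with the custom model). Nothing is sent over the network. The endpoint paths, the `base_model_name` field, and the `language_customization_id` query parameter are written from memory of the public Speech to Text v1 REST API and should be checked against the current documentation; the function name `build_customization_steps` and all argument values are hypothetical.

```python
from urllib.parse import quote, urlencode


def build_customization_steps(model_name, base_model, corpus_name, customization_id):
    """Return (method, path, json_body) tuples describing a custom language
    model workflow. This only describes the calls; it does not send them.

    Assumption: paths and parameter names follow the Speech to Text v1 REST
    API as the author of this sketch remembers it. Verify before use.
    """
    cid = quote(customization_id, safe="")
    corpus = quote(corpus_name, safe="")
    return [
        # 1. Create a custom model on top of a general base model.
        ("POST", "/v1/customizations",
         {"name": model_name, "base_model_name": base_model}),
        # 2. Upload domain text (a plain-text corpus) to the custom model.
        ("POST", f"/v1/customizations/{cid}/corpora/{corpus}", None),
        # 3. Train the custom model on the uploaded corpus.
        ("POST", f"/v1/customizations/{cid}/train", None),
        # 4. Transcribe audio using the trained custom model.
        ("POST",
         "/v1/recognize?" + urlencode({"language_customization_id": customization_id}),
         None),
    ]


if __name__ == "__main__":
    # Hypothetical values; the real customization ID is returned by step 1.
    for method, path, body in build_customization_steps(
        "my-domain-model", "en-US_BroadbandModel", "domain-corpus", "abc-123"
    ):
        print(method, path, body)
```

In practice the customization ID comes back from the create call, and training is asynchronous, so a real client would poll the model status before step 4.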

Regarding your comment "I also know that it can deal with FLAC better than WAV and OGG": that is not really the case. The Watson STT service offers full support for FLAC, WAV, OGG, and other formats (see this section of the documentation: https://www.ibm.com/watson/developercloud/doc/speech-to-text/input.html#formats).
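Since the format itself is not what drives accuracy, the practical concern is mostly declaring the audio format correctly when submitting a file. The following is a small sketch, assuming the format is declared through a `Content-Type` value such as `audio/flac`, `audio/wav`, or `audio/ogg`. The helper name `content_type_for` and the lookup table are hypothetical and cover only the three formats named in the answer, not everything the service supports.

```python
import os

# Only the three formats mentioned in the answer; the service supports more.
CONTENT_TYPES = {
    ".flac": "audio/flac",
    ".wav": "audio/wav",
    ".ogg": "audio/ogg",
}


def content_type_for(path):
    """Map an audio file name to the Content-Type value to declare for it."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in CONTENT_TYPES:
        raise ValueError(f"extension {ext!r} is not in this sketch's table")
    return CONTENT_TYPES[ext]


if __name__ == "__main__":
    for name in ("interview.flac", "call.WAV", "memo.ogg"):
        print(name, "->", content_type_for(name))
```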
