MP3 recognition using Sphinx 4


Question

Can we use mp3 files for speech recognition without using wav files? Or can we generate a wav file from an mp3 and then run the recognition without a serious impact on accuracy? The problem is that I need to minimize the load transferred over the network in my application. Will the information lost in the conversion be a major factor for accuracy?

Recommended answer

Can we use mp3 files for the voice recognition process without using wav files?

Not directly. To recognize mp3 streams, you need a Java library that reads mp3 and converts it to a PCM stream (tritonus-mp3, lameonj). You can also invoke ffmpeg as a separate process to do the decoding.
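The ffmpeg route mentioned above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name `Mp3ToPcm` and its methods are hypothetical, `ffmpeg` is assumed to be installed and on the PATH, and the 16 kHz, mono, 16-bit little-endian output format is an assumption that should be matched to whatever the acoustic model in use expects.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.List;

// Hypothetical helper: decodes an mp3 file to raw PCM by running ffmpeg
// as a separate process and reading its standard output.
class Mp3ToPcm {

    // Builds the ffmpeg command line. The target format (16 kHz, mono,
    // signed 16-bit little-endian) is an assumption; adjust it to the model.
    static List<String> buildCommand(String mp3Path) {
        return List.of(
            "ffmpeg",
            "-loglevel", "error",   // keep ffmpeg's log output quiet
            "-i", mp3Path,          // input file
            "-f", "s16le",          // raw PCM container (no WAV header)
            "-acodec", "pcm_s16le", // signed 16-bit little-endian samples
            "-ar", "16000",         // resample to 16 kHz
            "-ac", "1",             // downmix to mono
            "pipe:1");              // write the result to stdout
    }

    // Starts ffmpeg and returns a stream of decoded PCM bytes.
    // ffmpeg's error output is forwarded to this process's stderr so that
    // the child cannot block on a full stderr pipe.
    static InputStream decode(String mp3Path) throws IOException {
        Process process = new ProcessBuilder(buildCommand(mp3Path))
            .redirectError(ProcessBuilder.Redirect.INHERIT)
            .start();
        return process.getInputStream();
    }
}
```

The stream returned by `decode` carries headerless PCM, so it can be handed to any recognizer input that accepts a raw audio `InputStream`. Decoding this way changes only the container and sample format; it does not recover anything the mp3 encoder discarded.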

Or can we generate a wav file from an mp3 and then do the voice recognition without a serious impact on the accuracy?

Accuracy is affected in both cases, no matter where you decode the mp3 file: the information is discarded when the audio is encoded to mp3, and decoding it to wav afterwards cannot restore it.

The problem is I need to minimize the load transferred through the network in my application. Will the information which is lost in the conversion be a huge factor for accuracy?

It's better to use a lossless codec like FLAC for the transfer, since mp3 conversion degrades ASR accuracy. Another approach is to compute the acoustic features on the client and transfer those to the server instead of the audio.
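To get a rough sense of what the feature-transfer option means for network load, the back-of-the-envelope comparison below may help. It is only a sketch under stated assumptions: the class `TransferLoad` is hypothetical, the audio is assumed to be 16 kHz, 16-bit, mono PCM, and the features are assumed to be 13 coefficients per frame, 100 frames per second, sent as 32-bit floats. A real front end may use different values, and FLAC is not modeled because its compression ratio depends on the recording.

```java
// Hypothetical helper for estimating uncompressed transfer rates.
class TransferLoad {

    // Bytes per second of uncompressed PCM audio.
    static int pcmBytesPerSecond(int sampleRateHz, int bytesPerSample, int channels) {
        return sampleRateHz * bytesPerSample * channels;
    }

    // Bytes per second of a feature stream sent as fixed-size coefficients.
    static int featureBytesPerSecond(int coefficientsPerFrame,
                                     int bytesPerCoefficient,
                                     int framesPerSecond) {
        return coefficientsPerFrame * bytesPerCoefficient * framesPerSecond;
    }
}

// Under the assumptions above:
//   TransferLoad.pcmBytesPerSecond(16000, 2, 1)     -> 32000 bytes/s
//   TransferLoad.featureBytesPerSecond(13, 4, 100)  -> 5200 bytes/s
```

With these assumed numbers the feature stream is about one sixth the size of the raw PCM, which is why computing features on the client can be attractive when bandwidth is the main constraint.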
