How to convert Linear16 PCM WAV to G.711 8-bit 8 kHz mu-law WAV with the same quality as g711.org?

Problem description

I am using NAudio to convert Linear16 PCM WAV files that come out of a third-party Text-To-Speech API to G.711 8-bit 8 kHz mu-law, so that they work as telephony prompts. I am using techniques from the library author's documentation and some Stack Overflow posts, and specifically following the suggestion to do a two-step conversion.

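using System;
using System.IO;
using NAudio.Wave;
using Newtonsoft.Json;

// `result` holds the JSON response from the Text-To-Speech API (not shown here).
dynamic foo = JsonConvert.DeserializeObject<dynamic>(result);

// The Linear16 WAV arrives base64-encoded in the audioContent field.
byte[] decoded = Convert.FromBase64String(foo.audioContent.ToString());

// Step 1 target: 8 kHz, 16-bit, mono PCM. Step 2 target: 8 kHz, mono mu-law.
WaveFormat newFormat = new WaveFormat(8000, 16, 1);
WaveFormat mulaw = WaveFormat.CreateMuLawFormat(8000, 1);

using (MemoryStream mem = new MemoryStream(decoded))
using (WaveFileReader reader = new WaveFileReader(mem))
using (var conversionStream = new WaveFormatConversionStream(newFormat, reader))
using (var convStream2 = new WaveFormatConversionStream(mulaw, conversionStream))
{
    // Write the converted prompt, plus the untouched source for comparison.
    WaveFileWriter.CreateWaveFile("voiceprompt_downsample_8bit-8khz.wav", convStream2);
    File.WriteAllBytes("voiceprompt_raw.wav", decoded);
}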

Unfortunately, the audio quality of the converted file is noticeably degraded (which is to be expected to a degree). However, if I take the exact same source file that I run through the code above, submit it to the converter at g711.org, and select the "BroadWorks Classic (8Khz, Mono, u-law)" option, the resulting audio sounds much better. In particular, it does not clip or crush the S sounds in words like "access" and "password" in some of our prompts.

I have confirmed that both audio files (the one I converted with NAudio and the one I generated using g711.org) play fine as prompts through our telephony system.

Does anyone with NAudio experience have suggestions about what I can do differently in NAudio so that the output quality of the converted file matches what I get from the g711.org site?

Recommended answer

I figured it out myself. The issue was that I needed to use one of the other options to resample the audio, rather than just using WaveFormatConversionStream. After resampling with MediaFoundationResampler, the audio quality was much improved over what I was getting with ACM via WaveFormatConversionStream.

This doc helped me come to that realization...
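The answer does not include code, so the following is only a minimal sketch of what resampling with MediaFoundationResampler could look like. It assumes the source WAV is mono 16-bit PCM (as Linear16 text-to-speech output typically is) and that Media Foundation is available on the machine. The class name `MuLawPromptConverter` and the method `ConvertToMuLaw` are hypothetical. Encoding the final mu-law samples with NAudio's `MuLawEncoder` is a choice made for this sketch, not necessarily what the answer's author did.

```csharp
using System;
using System.IO;
using NAudio.Codecs;
using NAudio.MediaFoundation;
using NAudio.Wave;

// Hypothetical helper, not part of NAudio.
public static class MuLawPromptConverter
{
    // Assumes linear16Wav contains a mono, 16-bit PCM WAV file.
    public static void ConvertToMuLaw(byte[] linear16Wav, string outputPath)
    {
        // Harmless if Media Foundation has already been started.
        MediaFoundationApi.Startup();

        var pcm8k = new WaveFormat(8000, 16, 1);
        var muLaw = WaveFormat.CreateMuLawFormat(8000, 1);

        using (var mem = new MemoryStream(linear16Wav))
        using (var reader = new WaveFileReader(mem))
        using (var resampler = new MediaFoundationResampler(reader, pcm8k))
        using (var writer = new WaveFileWriter(outputPath, muLaw))
        {
            // 60 is the highest quality setting (valid range is 1 to 60).
            resampler.ResamplerQuality = 60;

            // Read roughly one second of 8 kHz PCM at a time.
            var pcmBuffer = new byte[pcm8k.AverageBytesPerSecond];
            int bytesRead;
            while ((bytesRead = resampler.Read(pcmBuffer, 0, pcmBuffer.Length)) > 0)
            {
                // Each 16-bit PCM sample becomes one 8-bit mu-law byte.
                var encoded = new byte[bytesRead / 2];
                for (int i = 0; i < encoded.Length; i++)
                {
                    short sample = BitConverter.ToInt16(pcmBuffer, i * 2);
                    encoded[i] = MuLawEncoder.LinearToMuLawSample(sample);
                }
                writer.Write(encoded, 0, encoded.Length);
            }
        }
    }
}
```

With the `decoded` byte array from the question, this could be called as `MuLawPromptConverter.ConvertToMuLaw(decoded, "voiceprompt_mulaw_8khz.wav");`. The point of the change is that the sample-rate reduction is done by the Media Foundation resampler instead of the ACM resampler, which is the step the answer identifies as the source of the quality loss.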
