Spectrograms generated using Librosa don't look consistent with Kaldi?

Views: 737 Published: 2020/6/30 21:09:27 Tags: speech-recognition spectrogram mfcc librosa kaldi

Problem description

I generated a spectrogram of a "seven" utterance using the "egs/tidigits" code from Kaldi, with 23 bins, a 20 kHz sampling rate, a 25 ms window, and a 10 ms shift. The spectrogram, visualized with the MATLAB imagesc function, appears below:

I am experimenting with Librosa as an alternative to Kaldi. I set up my code as below, using the same number of bins, sampling rate, and window length / shift as above.

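import librosa
import numpy as np
# Load and resample to 20 kHz; at that rate 25 ms = 500 samples and 10 ms = 200 samples.
time_series, sample_rate = librosa.load("7a.wav", sr=20000)
spectrogram = librosa.feature.melspectrogram(y=time_series, sr=sample_rate, n_mels=23, n_fft=500, hop_length=200)
# logamplitude is gone from current librosa releases; power_to_db replaces it for power spectrograms.
log_S = librosa.power_to_db(spectrogram)
np.savetxt("7a.txt", log_S.T)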
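The n_fft=500 and hop_length=200 arguments follow from the stated 25 ms window and 10 ms shift at 20 kHz. A minimal sketch of that conversion is below; ms_to_samples is a hypothetical helper written for illustration, not a librosa or Kaldi function.

```python
def ms_to_samples(duration_ms, sample_rate_hz):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(duration_ms * sample_rate_hz / 1000))


sample_rate = 20000  # Hz, as stated in the question

n_fft = ms_to_samples(25, sample_rate)       # 25 ms analysis window
hop_length = ms_to_samples(10, sample_rate)  # 10 ms frame shift

print(n_fft, hop_length)
```

One thing worth checking when comparing the two tools: in librosa, n_fft sets both the window length and the FFT size unless win_length is given separately, whereas Kaldi, as far as I know, rounds the FFT size up to a power of two by default (500 samples would become 512). That changes the frequency resolution under the mel filters slightly.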

However, when I visualize the resulting Librosa spectrogram of the same WAV file, it looks different:
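Part of a visual difference can come from the log scaling alone. The sketch below is a pure-Python imitation of decibel scaling with a dynamic-range floor, next to a plain natural log. The defaults (amin=1e-10, top_db=80.0, reference power 1.0) are assumptions chosen to mirror what I understand librosa's power_to_db defaults to be, and the natural-log version reflects my understanding that Kaldi's filterbank features are natural-log energies. Both function names are hypothetical and the input values are made up for illustration.

```python
import math


def power_to_db_sketch(powers, amin=1e-10, top_db=80.0):
    """Imitate dB scaling (reference power 1.0) with an optional dynamic-range floor."""
    db = [10.0 * math.log10(max(amin, p)) for p in powers]
    if top_db is not None:
        # Everything more than top_db below the loudest value is raised to the floor.
        floor = max(db) - top_db
        db = [max(value, floor) for value in db]
    return db


def natural_log_sketch(powers, floor=1e-10):
    """Plain natural log of the power values, with a small floor to avoid log(0)."""
    return [math.log(max(floor, p)) for p in powers]


# Made-up power values spanning a wide dynamic range.
powers = [1e3, 1.0, 1e-3, 1e-7]

print(power_to_db_sketch(powers))               # quietest value is raised to the floor
print(power_to_db_sketch(powers, top_db=None))  # same scaling without the floor
print(natural_log_sketch(powers))               # natural-log scale for comparison
```

Without the floor, the dB and natural-log values differ only by a constant factor, which an auto-scaled image plot absorbs. The floor is nonlinear, so it can flatten the quietest regions of one plot but not the other.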

Can someone please help me understand why these look so different? Across the other WAV files I've tried, I notice that with my Librosa script above, my fricatives (like the /s/ in "seven" in the example above) are being cut off, and this is greatly affecting my digit classification accuracy. Thank you!
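Fricatives such as /s/ carry most of their energy at high frequencies, so any step that treats high and low frequencies differently is worth ruling out. One candidate, offered as an assumption and not a confirmed diagnosis: Kaldi's feature extraction applies a pre-emphasis filter (coefficient 0.97 by default, to my knowledge), while librosa.feature.melspectrogram applies none. The sketch below computes the magnitude response of a first-order pre-emphasis filter to show how much it boosts high frequencies relative to low ones; preemphasis_gain is a hypothetical helper, not part of either library.

```python
import cmath
import math


def preemphasis_gain(freq_hz, sample_rate_hz, coeff=0.97):
    """Magnitude response of y[n] = x[n] - coeff * x[n-1] at freq_hz."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    return abs(1.0 - coeff * cmath.exp(-1j * omega))


sample_rate = 20000  # Hz, as in the question

# Print the filter gain in dB at a few frequencies up to Nyquist.
for freq in (100, 1000, 5000, 10000):
    gain_db = 20.0 * math.log10(preemphasis_gain(freq, sample_rate))
    print(f"{freq:>6} Hz: {gain_db:+.1f} dB")
```

If this turns out to matter, librosa.effects.preemphasis (available in recent librosa releases) could be applied to the time series before computing the mel spectrogram, which would bring the two pipelines closer on this one point. Other defaults such as the window function, the mel filter frequency range, and padding at the signal edges may also differ and would need checking separately.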
