Isolating audio foreground and converting back to audio stream using librosa
Question
I'm trying to isolate the foreground of an audio stream and then save it as a standalone audio stream using librosa.
I have the full, foreground and background data isolated in S_full, S_foreground and S_background, as the example does, but I'm unsure how to turn those back into audio.
I attempted to convert them with librosa.istft(...) and then save the result as a .wav file using soundfile.write(...), but I'm left with a file of roughly the right size that contains unusable(?) data.
Can anyone describe how to do this or point me at an example?
Thanks.
Recommended answer
In putting together the minimal example, I found that istft() with the original sampling rate does in fact work.
I'll find my bug, somewhere. FWIW, here's the working code:
import numpy as np
import librosa
from librosa import display
import soundfile
import matplotlib.pyplot as plt
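# Load the first five seconds of the file.
y, sr = librosa.load('audio/rb-testspeech.mp3', duration=5)

# Split the complex STFT into magnitude and phase.
S_full, phase = librosa.magphase(librosa.stft(y))

# Estimate the repeating background with a nearest-neighbour median filter,
# comparing frames by cosine similarity and ignoring frames within 2 seconds.
S_filter = librosa.decompose.nn_filter(S_full,
                                       aggregate=np.median,
                                       metric='cosine',
                                       width=int(librosa.time_to_frames(2, sr=sr)))

# The background estimate must not exceed the input magnitude.
S_filter = np.minimum(S_full, S_filter)

# Soft mask that keeps the bins where the residual dominates the background.
margin_i, margin_v = 2, 10
power = 2
mask_v = librosa.util.softmask(S_full - S_filter,
                               margin_v * S_filter,
                               power=power)
S_foreground = mask_v * S_full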
full = librosa.amplitude_to_db(S_full, ref=np.max)
librosa.display.specshow(full, y_axis='log', sr=sr)
plt.title('Full spectrum')
plt.colorbar()
plt.tight_layout()
plt.show()
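print("y({}): {}".format(len(y), y))
print("sr: {}".format(sr))

# Recombine each magnitude with the original phase before inverting;
# istft on a bare magnitude would treat every bin as zero-phase.
full_audio = librosa.istft(S_full * phase)
foreground_audio = librosa.istft(S_foreground * phase)
print("full({}): {}".format(len(full_audio), full_audio))

# Write everything out at the sampling rate returned by librosa.load.
soundfile.write('orig.WAV', y, sr)
soundfile.write('full.WAV', full_audio, sr)
soundfile.write('foreground.WAV', foreground_audio, sr)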