从FFT数据创建波形数据? [英] Creating wave data from FFT data?

查看:137
本文介绍了从FFT数据创建波形数据?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

您可能会注意到,我真的是python和声音处理的新手.我(希望)使用python以及logfbank和mfcc函数从wave文件中提取了FFT数据. (logfbank似乎提供了最有希望的数据,mfcc的输出对我来说有点奇怪).

As you might notice, i am really new to python and sound processing. I (hopefully) extracted FFT data from a wave file using python and the logfbank and mfcc function. (The logfbank seems to give the most promising data, mfcc output looked a bit weird for me).

在我的程序中,我想更改logfbank/mfcc数据,然后从中创建wave数据(并将它们写入文件).我真的没有找到有关从FFT数据创建波形数据的过程的任何信息.你们当中有人知道如何解决这个问题吗?我将不胜感激:)

In my program i want to change the logfbank/mfcc data and then create wave data from it (and write them into a file). I didn't really find any information about the process of creating wave data from FFT data. Does anyone of you have an idea how to solve this? I would appreciate it a lot :)

到目前为止,这是我的代码:

This is my code so far:

from scipy.io import wavfile 
import numpy as np
from python_speech_features import mfcc, logfbank

rate, signal = wavfile.read('orig.wav')
fbank = logfbank(signal, rate, nfilt=100, nfft=1400).T
mfcc = mfcc(signal, rate, numcep=13, nfilt=26, nfft=1103).T 

#magic data processing of fbank or mfcc here

#creating wave data and writing it back to a .wav file here

推荐答案

可以使用 constant-overlap-add 属性.

A suitably constructed STFT spectrogram containing both magnitude and phase can be converted back to a time-domain waveform using the Overlap Add method. Important thing is that the spectrogram construction must have the constant-overlap-add property.

让您的修改正确地操作频谱图的幅度和相位可能是具有挑战性的.因此,有时会丢弃相位,并独立控制幅度.为了将其转换回波形,然后必须在重建(相位重建)期间估计相位信息.这是一个有损的过程,通常计算量很大.已建立的方法使用迭代算法,通常是Griffin-Lim的一种变体.但是现在也有了使用卷积神经网络的新方法.

It can be challenging to have your modifications correctly manipulate both magnitude and phase of a spectrogram. So sometimes the phase is discarded, and magnitude manipulated independently. In order to convert this back into a waveform one must then estimate phase information during reconstruction (phase reconstruction). This is a lossy process, and usually pretty computationally intensive. Established approaches use an iterative algorithm, usually a variation on Griffin-Lim. But there are now also new methods using Convolutional Neural Networks.

librosa版本0.7.0 包含快速的Griffin -Lim的实现以及辅助函数可以反转MFCC的梅尔谱图.

librosa version 0.7.0 contains a fast Griffin-Lim implementation as well as helper functions to invert a mel-spectrogram of MFCC.

下面是一个代码示例.输入测试文件位于 https://github.com/jonnor/machinehearing/blob/ab7fe72807e9519af0151ec4f7ebfd890f432c83/handson/spectrogram-inversion/436951__arnaud-coutancier__old-ladies-pets-and-train-02.flac

Below is a code example. The input test file is found at https://github.com/jonnor/machinehearing/blob/ab7fe72807e9519af0151ec4f7ebfd890f432c83/handson/spectrogram-inversion/436951__arnaud-coutancier__old-ladies-pets-and-train-02.flac

import numpy
import librosa
import soundfile

# parameters
sr = 22050
n_mels = 128
hop_length = 512
n_iter = 32
n_mfcc = None # can try n_mfcc=20

# load audio and create Mel-spectrogram
path = '436951__arnaud-coutancier__old-ladies-pets-and-train-02.flac'
y, _ = librosa.load(path, sr=sr)
S = numpy.abs(librosa.stft(y, hop_length=hop_length, n_fft=hop_length*2))
mel_spec = librosa.feature.melspectrogram(S=S, sr=sr, n_mels=n_mels, hop_length=hop_length)

# optional, compute MFCCs in addition
if n_mfcc is not None:
    mfcc = librosa.feature.mfcc(S=librosa.power_to_db(S), sr=sr, n_mfcc=n_mfcc)
    mel_spec = librosa.feature.inverse.mfcc_to_mel(mfcc, n_mels=n_mels)

# Invert mel-spectrogram
S_inv = librosa.feature.inverse.mel_to_stft(mel_spec, sr=sr, n_fft=hop_length*4)
y_inv = librosa.griffinlim(S_inv, n_iter=n_iter,
                            hop_length=hop_length)

soundfile.write('orig.wav', y, samplerate=sr)
soundfile.write('inv.wav', y_inv, samplerate=sr)

结果

重构后的波形会有一些伪像.

Results

The reconstructed waveform will have some artifacts.

上面的示例重复出现了很多噪音,超出了我的预期.使用Audacity中的标准降噪算法可以将其降低很多.

The above example got a lot of repetitive noise, more than I expected. It was possible to reduce it quite a lot using the standard Noise Reduction algorithm in Audacity.

这篇关于从FFT数据创建波形数据?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆