From audio to tensor, back to audio in TensorFlow
Question
Is there any way to load an audio file (WAV) directly into a tensor in TensorFlow, and then convert the tensor back into an audio file? I have seen people transform audio into spectrograms, but I couldn't find anyone converting a spectrogram back into audio.
Recommended answer
TensorFlow 1.x:
The tf.contrib.ffmpeg.decode_audio() op can load audio data (including WAV format) into a tensor, and the tf.contrib.ffmpeg.encode_audio() op can convert it back into audio data.
import tensorflow as tf  # TensorFlow 1.x; tf.contrib was removed in 2.x

# The filenames are fed in at run time.
input_filename = tf.placeholder(tf.string, shape=[])
output_filename = tf.placeholder(tf.string, shape=[])

# Decode the WAV bytes into a 2-D float tensor, [samples x channels].
input_signal = tf.contrib.ffmpeg.decode_audio(
    tf.read_file(input_filename), file_format="wav",
    samples_per_second=44100, channel_count=2)
# ...
output_signal = ...  # A 2-D tensor, [samples x channels]

# Encode the tensor back into WAV bytes and write them to disk.
encoded_audio_data = tf.contrib.ffmpeg.encode_audio(
    output_signal, file_format="wav", samples_per_second=44100)
write_file_op = tf.write_file(output_filename, encoded_audio_data)

with tf.Session() as sess:
    sess.run(write_file_op, {input_filename: "input.wav",
                             output_filename: "output.wav"})
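The snippet above reads input.wav as 44.1 kHz stereo audio. If no such file is at hand, one can be generated for experimentation. The following is a minimal sketch using only Python's standard library wave module; write_test_wav, the 440 Hz tone and the one-second duration are hypothetical choices made for illustration, not part of the original answer.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # matches samples_per_second in the snippet above
CHANNELS = 2         # matches channel_count in the snippet above


def write_test_wav(path, seconds=1.0, freq=440.0):
    """Hypothetical helper: write a 16-bit PCM stereo sine wave to `path`."""
    frames = bytearray()
    for i in range(int(SAMPLE_RATE * seconds)):
        sample = int(0.5 * 32767 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
        # Interleave the same little-endian int16 sample on every channel.
        frames += struct.pack("<h", sample) * CHANNELS
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))


write_test_wav("input.wav")
```

The resulting file is plain 16-bit PCM, so it should also be usable as input for other WAV decoders.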
TensorFlow 2.x:
The tf.contrib module was removed in TensorFlow 2.x, but you can still load and save audio files in 16-bit PCM WAV format using eager execution and tf.audio:
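import tensorflow as tf
# decode_wav takes the file contents, not a path, so read the file first.
# Returns a tuple of Tensor objects (audio, sample_rate).
audio, sample_rate = tf.audio.decode_wav(tf.io.read_file("input.wav"))
# Returns a scalar Tensor of type string holding the WAV-encoded bytes.
encoded_wav = tf.audio.encode_wav(audio, sample_rate)
# Write the encoded bytes to disk.
tf.io.write_file("output.wav", encoded_wav)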
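To check that the TensorFlow 2.x round trip preserves the signal, one option is to synthesize a tensor, encode it, write it out, and decode it again. The following is a minimal sketch under assumptions: the 16 kHz sample rate, the 440 Hz tone and the temporary file name are arbitrary choices for illustration. Because the data passes through 16-bit PCM, the decoded values match the original only up to a small quantization error.

```python
import math
import os
import tempfile

import tensorflow as tf

SAMPLE_RATE = 16000  # assumed sample rate for this illustration

# Synthesize one second of a 440 Hz sine wave, shaped [samples, channels].
t = tf.range(SAMPLE_RATE, dtype=tf.float32) / SAMPLE_RATE
audio = tf.reshape(0.5 * tf.sin(2.0 * math.pi * 440.0 * t), [-1, 1])

# Encode the float32 tensor to WAV bytes and write them to a temporary file.
path = os.path.join(tempfile.mkdtemp(), "roundtrip.wav")
tf.io.write_file(path, tf.audio.encode_wav(audio, SAMPLE_RATE))

# Read the file back and decode it into a float32 tensor in [-1.0, 1.0].
decoded, decoded_rate = tf.audio.decode_wav(tf.io.read_file(path))

# 16-bit quantization introduces a small error, so compare with a tolerance.
max_error = float(tf.reduce_max(tf.abs(decoded - audio)))
print("sample rate:", int(decoded_rate), "max abs error:", max_error)
```

The audio tensor passed to tf.audio.encode_wav must be float32 with shape [samples, channels], which is why the mono signal is reshaped to a single column.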