How to read realtime microphone audio volume in Python and ffmpeg or similar
Problem description
I'm trying to read, in near-realtime, the volume of the audio coming from a USB microphone in Python.
I have the pieces, but can't figure out how to put them together.
If I already have a .wav file, I can read it quite simply using wavefile:
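    import numpy as np
    from wavefile import WaveReader

    # Read the file in blocks of 512 frames and print each block's volume
    with WaveReader("/Users/rmartin/audio.wav") as r:
        for data in r.read_iter(size=512):
            left_channel = data[0]  # first channel of the block
            volume = np.linalg.norm(left_channel)
            print(volume)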
This works great, but I want to process the audio from the microphone in real time, not from a file.
So my thought was to use something like ffmpeg to pipe the real-time output into WaveReader, but my knowledge of byte handling is somewhat lacking.
    import subprocess
    import numpy as np

    command = ["/usr/local/bin/ffmpeg",
               '-f', 'avfoundation',
               '-i', ':2',
               '-t', '5',
               '-ar', '11025',
               '-ac', '1',
               '-acodec', 'aac', '-']
    pipe = subprocess.Popen(command, stdout=subprocess.PIPE, bufsize=10**8)
    stdout_data = pipe.stdout.read()
    audio_array = np.frombuffer(stdout_data, dtype="int16")
    print(audio_array)
That looks pretty, but it doesn't do much. It fails with the error [NULL @ 0x7ff640016600] Unable to find a suitable output format for 'pipe:'.
I assume this is a fairly simple thing to do, given that I only need to check the audio for volume levels.
Does anyone know how to accomplish this simply? ffmpeg isn't a requirement, but it does need to work on OSX and Linux.
Recommended answer
Thanks to @Matthias for the suggestion to use the sounddevice module. It's exactly what I need.
For posterity, here is a working example that prints real-time audio levels to the shell:
    # Print out realtime audio volume as ascii bars
    import sounddevice as sd
    import numpy as np

    def print_sound(indata, outdata, frames, time, status):
        volume_norm = np.linalg.norm(indata) * 10
        print("|" * int(volume_norm))

    with sd.Stream(callback=print_sound):
        sd.sleep(10000)  # keep the stream open for 10 seconds
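As a variation on the example above, here is a minimal sketch that opens an input-only stream and reports the level as RMS in dBFS rather than a raw norm, so the number does not grow with the block size. The helper names `rms_level`, `to_dbfs` and `monitor` are made up for illustration and are not part of sounddevice. It assumes the default input device is used and that samples arrive as floats in the range -1.0 to 1.0 (sounddevice's default `float32` dtype).

```python
import math

import numpy as np


def rms_level(block):
    """Root-mean-square amplitude of a block of float samples."""
    samples = np.asarray(block, dtype=np.float64)
    if samples.size == 0:
        return 0.0
    return float(np.sqrt(np.mean(samples ** 2)))


def to_dbfs(rms, floor=-120.0):
    """Convert an RMS value (1.0 = full scale) to dBFS, clamped at `floor`."""
    if rms <= 0.0:
        return floor
    return max(floor, 20.0 * math.log10(rms))


def monitor(seconds=10):
    """Print the input level for `seconds` seconds (needs an audio device)."""
    import sounddevice as sd  # imported here so the helpers work without it

    def callback(indata, frames, time, status):
        if status:
            print(status)  # report overflows and similar conditions
        print("%.1f dBFS" % to_dbfs(rms_level(indata)))

    # InputStream callbacks receive no outdata argument, unlike sd.Stream
    with sd.InputStream(callback=callback):
        sd.sleep(int(seconds * 1000))
```

Calling `monitor(10)` should print one level per audio block for ten seconds. Silence ends up near the floor value and a full-scale signal near 0 dBFS.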
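If you would rather stay with ffmpeg, the "Unable to find a suitable output format" error from the question appears because ffmpeg cannot infer an output format from the name `-`, so the format has to be given explicitly with `-f`. Below is a minimal sketch that asks for raw signed 16-bit little-endian PCM (`-f s16le`) and computes an RMS level per chunk using only the standard library. The function names are hypothetical. The `avfoundation` input and the device `:2` are copied from the question, and `ffmpeg` is assumed to be on the PATH. On Linux the input format and device name would need to be adapted (for example an ALSA or PulseAudio input), which is left as an assumption here.

```python
import array
import math
import subprocess
import sys


def build_command(input_format="avfoundation", device=":2", rate=11025):
    """ffmpeg arguments that write mono 16-bit raw PCM to stdout."""
    return ["ffmpeg", "-loglevel", "error",
            "-f", input_format, "-i", device,
            "-ar", str(rate), "-ac", "1",
            "-f", "s16le", "-acodec", "pcm_s16le", "-"]


def rms_from_pcm(raw):
    """RMS of signed 16-bit little-endian samples, scaled to 0.0..1.0."""
    samples = array.array("h")
    samples.frombytes(raw[: len(raw) - len(raw) % 2])  # drop a trailing odd byte
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian regardless of host
    if not samples:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / 32768.0


def stream_levels(chunk_frames=512):
    """Yield one RMS value per chunk read from ffmpeg (needs ffmpeg and a microphone)."""
    proc = subprocess.Popen(build_command(), stdout=subprocess.PIPE)
    try:
        while True:
            raw = proc.stdout.read(chunk_frames * 2)  # 2 bytes per mono sample
            if not raw:
                break  # ffmpeg exited or closed the pipe
            yield rms_from_pcm(raw)
    finally:
        proc.terminate()
```

A loop such as `for level in stream_levels(): print("|" * int(level * 50))` would then draw bars similar to the sounddevice example. Reading fixed-size chunks from the pipe, rather than calling `read()` once, is what makes the output arrive while the recording is still running.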