Detect tap with pyaudio from live mic
Question
How would I use pyaudio to detect a sudden tapping noise from a live microphone?
Recommended answer
One approach:
- read a block of samples at a time, say 0.05 seconds' worth
- compute the RMS amplitude of the block (square root of the average of the squares of the individual samples)
- if the block's RMS amplitude is greater than a threshold, it's a "noisy block"; otherwise it's a "quiet block"
- a sudden tap would be a quiet block followed by a small number of noisy blocks followed by a quiet block
- if you never get a quiet block, your threshold is too low
- if you never get a noisy block, your threshold is too high
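The steps above can be sketched without any audio hardware by working on a list of per-block RMS values. This is only an illustration: `find_taps`, its parameters, and the default of 3 blocks are hypothetical names and values chosen for the example, not part of pyaudio.

```python
def find_taps(rms_values, threshold, max_tap_blocks=3):
    """Return the index of the first noisy block of each tap.

    A tap is a run of 1..max_tap_blocks noisy blocks that has a quiet
    block immediately before it and immediately after it.
    (max_tap_blocks=3 is an assumed, illustrative default.)
    """
    taps = []
    noisy_run = 0          # length of the current run of noisy blocks
    run_start = 0          # index where the current run began
    quiet_before = False   # was the current run preceded by a quiet block?
    prev_quiet = False     # was the previous block quiet?
    for i, rms in enumerate(rms_values):
        if rms > threshold:
            # noisy block: start or extend a run
            if noisy_run == 0:
                run_start = i
                quiet_before = prev_quiet
            noisy_run += 1
            prev_quiet = False
        else:
            # quiet block: a short run that just ended counts as a tap
            if quiet_before and 1 <= noisy_run <= max_tap_blocks:
                taps.append(run_start)
            noisy_run = 0
            prev_quiet = True
    return taps


# quiet, quiet, short burst, quiet, quiet, long noise, quiet
levels = [0.002, 0.003, 0.30, 0.25, 0.004, 0.003,
          0.2, 0.2, 0.2, 0.2, 0.2, 0.003]
print(find_taps(levels, threshold=0.010))
```

Here the two-block burst is reported as a tap, while the five-block stretch is treated as sustained noise rather than a tap.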
My application was recording "interesting" noises unattended, so it would record as long as there were noisy blocks. It would multiply the threshold by 1.1 if there was a 15-second noisy period ("covering its ears") and multiply the threshold by 0.9 if there was a 15-minute quiet period ("listening harder"). Your application will have different needs.
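As a rough sketch of that threshold adaptation, the rule can be written as a small pure function. The name `adapt_threshold` and its parameters are assumptions made for this example; the 15-second and 15-minute limits and the 1.1 / 0.9 factors come from the description above.

```python
def adapt_threshold(threshold, noisy_seconds, quiet_seconds,
                    max_noisy=15.0, max_quiet=15 * 60.0):
    """Return an adjusted threshold.

    noisy_seconds / quiet_seconds: how long the input has been
    continuously noisy or continuously quiet (hypothetical inputs
    that the caller is assumed to track).
    """
    if noisy_seconds >= max_noisy:
        # "covering its ears": become less sensitive
        return threshold * 1.1
    if quiet_seconds >= max_quiet:
        # "listening harder": become more sensitive
        return threshold * 0.9
    return threshold
```

How often the function is called determines how quickly the threshold drifts, so that choice depends on the application.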
Also, I just noticed some comments in my code regarding observed RMS values. On the built-in mic of a MacBook Pro, with audio data normalized to a +/- 1.0 range and input volume set to max, some data points:
- 0.003-0.006 (-50dB to -44dB) an obnoxiously loud central heating fan in my house
- 0.010-0.40 (-40dB to -8dB) typing on the same laptop
- 0.10 (-20dB) snapping fingers softly at 1 ft distance
- 0.60 (-4.4dB) snapping fingers loudly at 1 ft
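The dB figures in the list are consistent with 20·log10(rms), taking full scale (1.0) as the 0 dB reference. A minimal conversion helper, with `rms_to_db` being a name made up for this example:

```python
import math


def rms_to_db(rms):
    """Convert a normalized RMS amplitude (0 < rms <= 1.0) to dB
    relative to full scale."""
    return 20.0 * math.log10(rms)


# the data points quoted above
for rms in (0.003, 0.006, 0.010, 0.10, 0.40, 0.60):
    print("%.3f -> %.1f dB" % (rms, rms_to_db(rms)))
```

A value of exactly 0 (digital silence) has no finite dB value, so a caller would need to guard against it.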
Update: here's a sample to get you started.
#!/usr/bin/python

# open a microphone in pyAudio and listen for taps

import pyaudio
import struct
import math

INITIAL_TAP_THRESHOLD = 0.010
FORMAT = pyaudio.paInt16
SHORT_NORMALIZE = (1.0/32768.0)
CHANNELS = 2
RATE = 44100
INPUT_BLOCK_TIME = 0.05
INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
# if we get this many noisy blocks in a row, increase the threshold
OVERSENSITIVE = 15.0/INPUT_BLOCK_TIME
# if we get this many quiet blocks in a row, decrease the threshold
UNDERSENSITIVE = 120.0/INPUT_BLOCK_TIME
# if the noise was longer than this many blocks, it's not a 'tap'
MAX_TAP_BLOCKS = 0.15/INPUT_BLOCK_TIME


def get_rms( block ):
    # RMS amplitude is defined as the square root of the
    # mean over time of the square of the amplitude.
    # so we need to convert this string of bytes into
    # a string of 16-bit samples...

    # we will get one short out for each
    # two chars in the string.
    # (integer division, so this also works on Python 3)
    count = len(block)//2
    fmt = "%dh"%(count)
    shorts = struct.unpack( fmt, block )

    # iterate over the block.
    sum_squares = 0.0
    for sample in shorts:
        # sample is a signed short in +/- 32768.
        # normalize it to 1.0
        n = sample * SHORT_NORMALIZE
        sum_squares += n*n

    return math.sqrt( sum_squares / count )


class TapTester(object):
    def __init__(self):
        self.pa = pyaudio.PyAudio()
        self.stream = self.open_mic_stream()
        self.tap_threshold = INITIAL_TAP_THRESHOLD
        # start above MAX_TAP_BLOCKS so that startup noise is not a tap
        self.noisycount = MAX_TAP_BLOCKS+1
        self.quietcount = 0
        self.errorcount = 0

    def stop(self):
        self.stream.close()
        self.pa.terminate()

    def find_input_device(self):
        # pick the first device whose name looks like a microphone
        device_index = None
        for i in range( self.pa.get_device_count() ):
            devinfo = self.pa.get_device_info_by_index(i)
            print( "Device %d: %s"%(i,devinfo["name"]) )

            for keyword in ["mic","input"]:
                if keyword in devinfo["name"].lower():
                    print( "Found an input: device %d - %s"%(i,devinfo["name"]) )
                    device_index = i
                    return device_index

        if device_index is None:
            print( "No preferred input found; using default input device." )

        return device_index

    def open_mic_stream( self ):
        device_index = self.find_input_device()

        stream = self.pa.open( format = FORMAT,
                               channels = CHANNELS,
                               rate = RATE,
                               input = True,
                               input_device_index = device_index,
                               frames_per_buffer = INPUT_FRAMES_PER_BLOCK)

        return stream

    def tapDetected(self):
        print("Tap!")

    def listen(self):
        try:
            block = self.stream.read(INPUT_FRAMES_PER_BLOCK)
        except IOError as e:
            # dammit. count the error and skip this block.
            self.errorcount += 1
            print( "(%d) Error recording: %s"%(self.errorcount,e) )
            self.noisycount = 1
            return

        amplitude = get_rms( block )
        if amplitude > self.tap_threshold:
            # noisy block
            self.quietcount = 0
            self.noisycount += 1
            if self.noisycount > OVERSENSITIVE:
                # turn down the sensitivity
                self.tap_threshold *= 1.1
        else:
            # quiet block.
            if 1 <= self.noisycount <= MAX_TAP_BLOCKS:
                self.tapDetected()
            self.noisycount = 0
            self.quietcount += 1
            if self.quietcount > UNDERSENSITIVE:
                # turn up the sensitivity
                self.tap_threshold *= 0.9


if __name__ == "__main__":
    tt = TapTester()

    # 1000 blocks of 0.05 s each is about 50 seconds of listening
    for i in range(1000):
        tt.listen()

    tt.stop()
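To sanity-check the RMS computation without a microphone, one option is to feed it synthetic 16-bit blocks built with `struct.pack`. The sketch below is standalone: `block_rms` mirrors the RMS calculation in the sample, and `sine_block` and its defaults (1 kHz tone, 2205 frames at 44100 Hz, mono) are assumptions chosen purely for illustration.

```python
import math
import struct


def block_rms(block):
    """RMS of a block of raw signed 16-bit samples, normalized to +/- 1.0."""
    count = len(block) // 2                      # two bytes per sample
    shorts = struct.unpack("%dh" % count, block)
    return math.sqrt(sum((s / 32768.0) ** 2 for s in shorts) / count)


def sine_block(amplitude, frames=2205, rate=44100, freq=1000.0):
    """Build one mono block of a sine tone as raw 16-bit samples.

    amplitude is in the normalized 0..1.0 range.
    """
    samples = [int(amplitude * 32767 * math.sin(2 * math.pi * freq * n / rate))
               for n in range(frames)]
    return struct.pack("%dh" % frames, *samples)


silence = struct.pack("%dh" % 2205, *([0] * 2205))
print(block_rms(silence))          # digital silence
print(block_rms(sine_block(0.5)))  # close to 0.5 / sqrt(2)
```

A sine tone has an RMS of its peak amplitude divided by the square root of 2, which gives a known value to compare against before trusting readings from real input.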