Acoustic Echo Cancellation (AEC) in embedded software
Question
I am working on a VoIP project on an embedded device. I have built a sample using a 32-bit MCU with a low-grade audio codec. I now find that my device has an echo problem: I can hear what I said coming back from the speaker. I have done some research and found that most applications use a DSP codec with an acoustic echo cancellation feature. Is it possible, however, to do the acoustic echo cancellation in software on my 32-bit MCU?
Can you advise an algorithm, or even source code :P, for doing acoustic echo cancellation? I know a sophisticated method is not possible on an MCU, so a simple algorithm is also welcome.
Thanks
[Follow up]: I have tried some AEC code, but it does not work well on my MCU, probably because of the MCU's limited processing power. My device stopped being real-time once this code was added (and VoIP needs a real-time response). In the end I went with an analog hardware solution and added an AEC chip, because I did not want to write the code again for another DSP chip.
Recommended answer
I had a heck of a time with echo cancellation. I wrote a softphone in which the user can switch audio input and output devices around to suit their fancy. I tried the Speex echo cancellation library and several other open source libraries I found online. None worked well for me. I tried different speaker/mike configurations and the echo was always there in some form or fashion.
I believe it would be very hard to create AEC code that works for all possible speaker configurations, room sizes, background noises and so on. Finally I sat down and wrote my own echo cancellation module for my softphone, using the algorithm below.
It's somewhat crude, but it has worked well and is reliable.
variable1: Keep a record of the average amplitude of the incoming audio while the person you are talking to is speaking. (Don't factor in quiet time.)
variable2: Keep a record of the average amplitude on the input (mike), but only when there is voice. Again, don't factor in quiet time.
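The two averages described above could be kept with something like the following. This is only a minimal sketch, not the answer's actual code: `VoicedAverage`, `frame_amplitude`, `voiced_average_update`, `VOICE_THRESHOLD` and `HISTORY_FRAMES` are hypothetical names, and the 160-sample frame size and the thresholds are assumptions borrowed from the code later in this answer.

```c
#include <stdint.h>
#include <stdlib.h>

#define FRAME_SAMPLES   160  /* assumed: 20 ms of 8 kHz audio per frame */
#define VOICE_THRESHOLD 200  /* assumed cut-off between voice and quiet */
#define HISTORY_FRAMES  250  /* assumed: collapse history after this many frames */

/* Hypothetical running-average state for one direction (speaker or mike). */
typedef struct {
    uint32_t total;   /* sum of per-frame amplitudes counted so far */
    uint32_t frames;  /* number of voiced frames counted so far */
    uint16_t average; /* current average amplitude of voiced frames */
} VoicedAverage;

/* Mean absolute amplitude of one frame of 16-bit PCM. */
static uint16_t frame_amplitude(const int16_t *frame)
{
    uint32_t sum = 0;
    for (int i = 0; i < FRAME_SAMPLES; i++)
        sum += (uint32_t)abs(frame[i]);
    return (uint16_t)(sum / FRAME_SAMPLES);
}

/* Fold one frame into the average; quiet frames are skipped entirely. */
static void voiced_average_update(VoicedAverage *va, const int16_t *frame)
{
    uint16_t amp = frame_amplitude(frame);
    if (amp <= VOICE_THRESHOLD)
        return;                    /* quiet time is not factored in */
    va->total += amp;
    va->frames++;
    va->average = (uint16_t)(va->total / va->frames);
    if (va->frames >= HISTORY_FRAMES) {
        /* Keep the sum bounded: treat the history as one average frame. */
        va->total = va->average;
        va->frames = 1;
    }
}
```

One instance would track variable1 (frames being played) and a second would track variable2 (frames from the mike). Everything is integer arithmetic, which matters on an MCU without an FPU.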
As soon as there's audio to play, cut the mike. Assuming the person listening is not talking, turn the mike back on 150-300 ms after the last audible audio frame comes in to be played.
If the audio from the microphone (which you're dropping during playback) is greater than, say, (variable2 * 1.5), start sending the audio input frames for a specified duration, resetting that duration every time the input amplitude reaches (variable2 * 1.5).
That way the person talking will know they are being interrupted and will stop to hear what the other person is saying. If the person talking doesn't have too noisy a background, they will probably hear most, if not all, of the interruption.
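The gating rule described above can be sketched as a small state machine. This is an illustration under stated assumptions rather than the answer's own implementation: `MicGate`, `mic_gate_on_playback`, `mic_gate_should_send`, `HANGOVER_MS` and `BARGE_IN_MS` are hypothetical names, the 500 ms interruption window is a guess (the answer only says "a specified duration"), and the caller is assumed to supply a millisecond timestamp that never goes backwards.

```c
#include <stdbool.h>
#include <stdint.h>

#define HANGOVER_MS 300  /* mike stays cut this long after the last played frame */
#define BARGE_IN_MS 500  /* assumed: keep sending this long after an interruption */

typedef struct {
    uint64_t last_playback_ms;  /* time of the last audible played frame */
    uint64_t barge_in_until_ms; /* mike stays open until this time */
    bool     playback_seen;     /* false until the first audible frame */
} MicGate;

/* Call whenever an audible frame is handed to the speaker. */
static void mic_gate_on_playback(MicGate *g, uint64_t now_ms)
{
    g->last_playback_ms = now_ms;
    g->playback_seen = true;
}

/* Returns true if this mike frame should be sent, false if it should be
   dropped. mic_amp is the frame's mean absolute amplitude and mic_avg is
   the running mike average (variable2 in the text). */
static bool mic_gate_should_send(MicGate *g, uint64_t now_ms,
                                 uint32_t mic_amp, uint32_t mic_avg)
{
    bool speaker_active = g->playback_seen &&
                          now_ms - g->last_playback_ms < HANGOVER_MS;
    if (!speaker_active)
        return true;               /* nothing playing: mike is open */

    /* mic_amp > mic_avg * 1.5, written with integers only. */
    if (2 * mic_amp > 3 * mic_avg)
        g->barge_in_until_ms = now_ms + BARGE_IN_MS; /* (re)start the window */

    return now_ms < g->barge_in_until_ms;
}
```

Passing the time in as a parameter keeps the sketch independent of any particular tick source, so the same logic can be driven from a hardware timer on an MCU or from an OS clock on a PC.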
Like I said, it's not the most graceful, but it doesn't use a lot of resources (CPU, memory) and it actually works pretty darn well. I am very pleased with how mine sounds.
To implement it, I just made a few functions.
On each received audio frame, I call this function:
    /* ABS() is defined alongside removeecho further down */
    void audioin( AEC *ec, short *frame ) {
        unsigned int tas=0; /* Sum of absolute sample values, then the frame average */
        int i=0;

        for (;i<160;i++)
            tas+=ABS(frame[i]);
        tas/=160; /* 160 samples per 320-byte frame */
        if (tas>300) { /* I assume this is audible */
            lockecho(ec);
            ec->lastaudibleframe=GetTickCount64(); /* Windows millisecond tick */
            unlockecho(ec);
        }
        return;
    }
and before sending a frame, I do:
    #include <string.h> /* memset */

    #define ECHO_THRESHOLD 300 /* ms to keep suppression alive after last audible frame */
    #define ONE_MINUTE 3000    /* 3000 20ms frames */
    #define AVG_PERIOD 250     /* 250 20ms frames */
    #define ABS(x) ((x)>0?(x):-(x)) /* Parenthesized so expression arguments expand safely */

    char removeecho( AEC *ec, short *aecinput ) {
        int tas=0; /* Average absolute amplitude in this frame */
        int i=0;
        unsigned long long *tot=0;
        unsigned int *ctr=0;
        unsigned short *avg=0;
        char suppressframe=0;

        lockecho(ec);
        if (ec->lastaudibleframe+ECHO_THRESHOLD > GetTickCount64() ) {
            /* If we're still within the threshold for echo (speaker state is ON) */
            tot=&ec->t_aiws;
            ctr=&ec->c_aiws;
            avg=&ec->aiws;
        } else {
            /* If we're outside the threshold for echo (speaker state is OFF) */
            tot=&ec->t_aiwos;
            ctr=&ec->c_aiwos;
            avg=&ec->aiwos;
        }
        for (;i<160;i++) {
            tas+=ABS(aecinput[i]);
        }
        tas/=160;
        if (tas>200) { /* Only voiced frames feed the averages */
            (*tot)+=tas;
            (*ctr)++; /* Count this frame before dividing, otherwise the average is inflated */
            (*avg)=(unsigned short)((*tot)/(*ctr));
            if ((*ctr)>AVG_PERIOD) {
                /* Collapse the history into a single average frame */
                (*tot)=(*avg);
                (*ctr)=1;
            }
        }
        if ( (avg==&ec->aiws) ) {
            /* Speaker is ON: subtract the background level, then test for an interruption */
            tas-=ec->aiwos;
            if (tas<0) {
                tas=0;
            }
            if ( ((unsigned short) tas > (ec->aiws*1.5)) && ((unsigned short)tas>=ec->aiwos) && (ec->aiwos!=0) ) {
                suppressframe=0;
            } else {
                suppressframe=1;
            }
        }
        if (suppressframe) { /* Silence frame */
            memset(aecinput, 0, 160*sizeof(short));
        }
        unlockecho(ec);
        return suppressframe;
    }
This silences the frame when it needs to. I keep all my variables, such as the timers and amplitude averages, in the AEC struct, which I get back from a call to:
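    #include <stdlib.h> /* malloc */
    #include <string.h> /* memset */

    AEC *initecho( void ) {
        AEC *ec=(AEC *)malloc(sizeof(AEC));

        if (!ec) { /* Out of memory: let the caller decide what to do */
            return 0;
        }
        memset(ec, 0, sizeof(AEC));
        ec->aiws=200; /* Just a default guess as to what the average amplitude would be */
        return ec;
    }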
    /* This declaration has to come before the functions that use AEC */
    typedef struct aec {
        unsigned long long lastaudibleframe; /* Time stamp of last audible frame */
        unsigned short aiws;                 /* Average mike input when speaker is playing */
        unsigned short aiwos;                /* Average mike input when speaker ISN'T playing */
        unsigned long long t_aiws, t_aiwos;  /* Internal running totals (sum of PCM) */
        unsigned int c_aiws, c_aiwos;        /* Internal counters for number of frames for averaging */
        unsigned long lockthreadid;          /* Thread ID with lock */
        int stlc;                            /* Same thread lock-count */
    } AEC;
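One detail matters for the MCU in the question: the test `tas > (ec->aiws*1.5)` in removeecho is evaluated in double precision, which is emulated in software on a part without an FPU. A possible integer-only equivalent is sketched below; `exceeds_one_and_a_half` is a hypothetical helper, not part of the code above.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when amp > avg * 1.5, computed without floating point.
   Multiplying both sides by 2 gives 2*amp > 3*avg, and 32-bit
   arithmetic cannot overflow for 16-bit inputs. */
static bool exceeds_one_and_a_half(uint16_t amp, uint16_t avg)
{
    return 2u * (uint32_t)amp > 3u * (uint32_t)avg;
}
```

With this helper the condition would read `exceeds_one_and_a_half((unsigned short)tas, ec->aiws)`, which gives the same result for every 16-bit input.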
You can adapt it as you need to and play with the idea but, like I said, it actually sounds pretty dang good. The only problem I have is when the user has a lot of background noise. For me, though, if they pick up their USB handset or use a headset, they can turn echo cancellation off and not worry about it. Through PC speakers with a mike, I'm pretty happy with it.
I hope it helps, or gives you something to build on...