Voice Activity Detection in Android


Question

I am writing an application that will behave much like the existing voice recognition, but will send the sound data to a proprietary web service that performs the speech recognition. I am using the standard MediaRecorder (with AMR-NB encoding), which seems well suited to speech recognition. The only data it exposes is the amplitude, via the getMaxAmplitude() method.

I am trying to detect when the person starts to talk, so that once the person has stopped talking for about 2 seconds I can send the sound data to the web service. Right now I use an amplitude threshold: if the amplitude goes over a fixed value (e.g. 1500), I assume the person is speaking. My concern is that amplitude levels may vary by device (e.g. Nexus One vs. Droid), so I am looking for a more standard approach that can be derived from the amplitude values.
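For reference, the fixed-threshold approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `EndOfSpeechDetector` and its method names are hypothetical, the caller is assumed to poll `getMaxAmplitude()` at some regular interval and pass in each reading with a timestamp, and the values 1500 and 2000 ms simply mirror the numbers in the question.

```java
/** Hypothetical helper: tracks speech start and end from polled amplitude readings. */
final class EndOfSpeechDetector {
    private final int threshold;       // readings above this count as speech
    private final long silenceMillis;  // trailing quiet time that ends the utterance
    private boolean speechStarted = false;
    private long lastSpeechAt = 0L;

    EndOfSpeechDetector(int threshold, long silenceMillis) {
        this.threshold = threshold;
        this.silenceMillis = silenceMillis;
    }

    /**
     * Feed one amplitude reading (for example, a value polled from
     * MediaRecorder.getMaxAmplitude()) together with the time it was taken.
     * Returns true once speech has started and the readings have then stayed
     * at or below the threshold for at least silenceMillis.
     */
    boolean onAmplitude(int amplitude, long nowMillis) {
        if (amplitude > threshold) {
            // Loud reading: mark speech as started and reset the silence timer.
            speechStarted = true;
            lastSpeechAt = nowMillis;
            return false;
        }
        // Quiet reading: only report the end if speech was heard earlier.
        return speechStarted && nowMillis - lastSpeechAt >= silenceMillis;
    }

    boolean hasSpeechStarted() {
        return speechStarted;
    }
}
```

A caller would create `new EndOfSpeechDetector(1500, 2000)`, feed it each polled reading, and stop recording and upload the audio when `onAmplitude` first returns true. The hard-coded 1500 is exactly the device-dependent part the question is worried about.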

P.S. I looked at graphing-amplitude, but it doesn't provide a way to do this using only the amplitude.

Recommended answer

Well, this might not be of much help, but how about starting by having the application measure the offset noise captured by the device's microphone, and then setting the threshold dynamically based on that? That way it would adapt both to the microphones of different devices and to the environment the user happens to be in at a given time.
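One possible way to turn that suggestion into code is sketched below. It is not from the answer itself: the class `NoiseCalibratedThreshold`, the mean-plus-margin formula, and the `marginFactor` and `minThreshold` parameters are all assumptions chosen for illustration. The idea is to collect a short run of amplitude readings while the user is presumed silent, then place the speech threshold some distance above that ambient level.

```java
/** Hypothetical helper: derives a speech threshold from ambient amplitude readings. */
final class NoiseCalibratedThreshold {
    private NoiseCalibratedThreshold() {}

    /**
     * ambient:      amplitude readings taken while nobody is speaking
     * marginFactor: how many standard deviations above the mean to place the threshold (assumed tuning knob)
     * minThreshold: lower bound so a near-silent room does not yield a threshold near zero (assumed)
     */
    static int fromAmbientSamples(int[] ambient, double marginFactor, int minThreshold) {
        if (ambient.length == 0) {
            throw new IllegalArgumentException("need at least one ambient sample");
        }
        // Mean ambient level.
        long sum = 0;
        for (int a : ambient) {
            sum += a;
        }
        double mean = (double) sum / ambient.length;

        // Population standard deviation of the ambient level.
        double squaredDiffs = 0.0;
        for (int a : ambient) {
            squaredDiffs += (a - mean) * (a - mean);
        }
        double stdDev = Math.sqrt(squaredDiffs / ambient.length);

        // Threshold sits marginFactor deviations above the mean, but never below the floor.
        int threshold = (int) Math.round(mean + marginFactor * stdDev);
        return Math.max(threshold, minThreshold);
    }
}
```

The resulting value would replace the fixed 1500 from the question. The calibration window, margin, and floor would all need tuning on real devices, and recalibrating periodically may help if the environment changes during use.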
