How to detect the presence of a word or silence in a WAV file using Java?


Question


I am working on a speech recognizer project. As part of it, I want to take a WAV file and detect where there is silence and where there is a word. When a word is found, I want to copy it, from its start to its end, into a new WAV file, so if the original WAV file contains 10 words, the output is 10 files. My problem is detecting the silence or the word. I would like suggestions on how to implement this in Java.

Recommended answer


Well, a WAV file is usually just PCM data with a header. I'd start by reading this: http://en.wikipedia.org/wiki/Pulse-code_modulation


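I've done this before. You start by pulling samples out of the PCM data. You then check each one to see whether it is greater than a threshold value that you've set. For instance, assuming 16-bit samples, any magnitude from zero to 15000 is silence and anything above 15000 is sound. Just remember that 16-bit PCM samples are signed, so compare the absolute value against the threshold, or the negative half of the waveform will be misread as silence. Also, remember log vs. linear when you're playing with the threshold: perceived loudness is roughly logarithmic, so it can be easier to pick the threshold in decibels and convert it to a linear sample value.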
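The approach above can be sketched roughly as follows. This is a minimal illustration, not a tested speech segmenter: it assumes 16-bit, little-endian, signed, mono PCM with the WAV header already stripped. The class name `SilenceDetector`, its method names, and the `minSilence` parameter (how many consecutive quiet samples end a word, so that brief dips inside a word do not split it) are all hypothetical additions for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class SilenceDetector {

    /** Decodes 16-bit little-endian signed PCM bytes into samples (assumed format). */
    static short[] decode16BitLE(byte[] pcm) {
        short[] samples = new short[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = pcm[2 * i] & 0xFF;   // low byte, treated as unsigned
            int hi = pcm[2 * i + 1];      // high byte, keeps its sign
            samples[i] = (short) ((hi << 8) | lo);
        }
        return samples;
    }

    /** Converts a level in dBFS (0 = full scale) to a linear 16-bit threshold. */
    static int thresholdFromDb(double dbfs) {
        return (int) Math.round(32767 * Math.pow(10.0, dbfs / 20.0));
    }

    /**
     * Returns {start, end} sample index pairs (end exclusive) for each run of sound.
     * A run ends once minSilence consecutive samples stay at or below the threshold.
     */
    static List<int[]> findSoundSegments(short[] samples, int threshold, int minSilence) {
        List<int[]> segments = new ArrayList<>();
        int start = -1;    // start of the current segment, -1 while in silence
        int quietRun = 0;  // consecutive quiet samples seen inside a segment
        for (int i = 0; i < samples.length; i++) {
            // Widen to int first so that -32768 has a valid absolute value.
            boolean loud = Math.abs((int) samples[i]) > threshold;
            if (loud) {
                if (start < 0) start = i;
                quietRun = 0;
            } else if (start >= 0 && ++quietRun >= minSilence) {
                segments.add(new int[] {start, i - quietRun + 1});
                start = -1;
                quietRun = 0;
            }
        }
        if (start >= 0) {
            // Close a segment still open at the end, trimming trailing quiet samples.
            segments.add(new int[] {start, samples.length - quietRun});
        }
        return segments;
    }

    public static void main(String[] args) {
        short[] samples = {0, 0, 20000, -20000, 0, 0, 0, 0, 16000, 0};
        for (int[] seg : findSoundSegments(samples, 15000, 3)) {
            System.out.println("sound from sample " + seg[0] + " to " + seg[1]);
        }
    }
}
```

In practice the raw bytes could be obtained with `javax.sound.sampled.AudioSystem.getAudioInputStream(File)`, and each segment could be written out with `AudioSystem.write(...)` using `AudioFileFormat.Type.WAVE`. A per-sample threshold is also quite sensitive to noise and zero crossings, so averaging the magnitude over short windows (for example 10 to 20 ms) before comparing is a common refinement.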
