How do I distinguish between someone speaking and music being played in C#, and find the time at which the speaker starts talking in an audio file?


Problem description

How do I determine the time at which the music stops and someone starts speaking in an audio file? I need to do this in a C# application.

What I have tried:

I don't know how to solve this. I have tried to implement a Grammar in my code, but how can I add such a large vocabulary to it? Is there a way around this?

Recommended answer

You don't, really. The built-in speech libraries do not support "dictation". You need a third-party library for this, such as Dragon.


That cannot be done with current speech recognition applications: they will try to recognize text in any kind of sound and produce gibberish.

Instead, you have to analyze the audio yourself. In the end, it is similar to noise reduction. You need an audio file containing music (of the kind played in the recording) and an audio file containing the speaker's (or someone else's) voice. Run a Fast Fourier Transform (FFT) over the audio files, then compare the FFT results for short snippets of the recording with the FFT results for the music and speech samples.
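
The comparison described above could be sketched roughly as follows. This is only a minimal illustration, not a tested speech/music detector. It assumes the audio has already been decoded into mono float samples (for example with a library such as NAudio). The names SpeechMusicSketch, AverageSpectrum and FindSpeechStart are hypothetical, and the choice of cosine similarity between magnitude spectra is an assumption that the answer does not prescribe.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class; none of these names come from the original answer.
public static class SpeechMusicSketch
{
    // In-place iterative radix-2 FFT. buffer.Length must be a power of two.
    public static void Fft(Complex[] buffer)
    {
        int n = buffer.Length;
        // Reorder the samples by bit-reversed index.
        for (int i = 1, j = 0; i < n; i++)
        {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) (buffer[i], buffer[j]) = (buffer[j], buffer[i]);
        }
        // Combine butterflies of growing length.
        for (int len = 2; len <= n; len <<= 1)
        {
            double angle = -2 * Math.PI / len;
            var wLen = new Complex(Math.Cos(angle), Math.Sin(angle));
            for (int i = 0; i < n; i += len)
            {
                Complex w = Complex.One;
                for (int k = 0; k < len / 2; k++)
                {
                    Complex u = buffer[i + k];
                    Complex v = buffer[i + k + len / 2] * w;
                    buffer[i + k] = u + v;
                    buffer[i + k + len / 2] = u - v;
                    w *= wLen;
                }
            }
        }
    }

    // Magnitude spectrum of one Hann-windowed frame starting at 'start'.
    public static double[] Spectrum(float[] samples, int start, int frameSize)
    {
        var buf = new Complex[frameSize];
        for (int i = 0; i < frameSize; i++)
        {
            double hann = 0.5 - 0.5 * Math.Cos(2 * Math.PI * i / (frameSize - 1));
            buf[i] = new Complex(samples[start + i] * hann, 0);
        }
        Fft(buf);
        var mag = new double[frameSize / 2];
        for (int i = 0; i < mag.Length; i++) mag[i] = buf[i].Magnitude;
        return mag;
    }

    // Average magnitude spectrum over all full frames of a reference clip.
    public static double[] AverageSpectrum(float[] samples, int frameSize)
    {
        var avg = new double[frameSize / 2];
        int frames = 0;
        for (int start = 0; start + frameSize <= samples.Length; start += frameSize, frames++)
        {
            double[] mag = Spectrum(samples, start, frameSize);
            for (int i = 0; i < avg.Length; i++) avg[i] += mag[i];
        }
        for (int i = 0; i < avg.Length; i++) avg[i] /= Math.Max(frames, 1);
        return avg;
    }

    // Cosine similarity between two spectra (0 if either one is all zeros).
    public static double Cosine(double[] a, double[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0 : dot / Math.Sqrt(na * nb);
    }

    // Time in seconds of the first frame whose spectrum is closer to the speech
    // reference than to the music reference, or -1 if no frame qualifies.
    public static double FindSpeechStart(float[] recording, float[] musicRef,
        float[] speechRef, int sampleRate, int frameSize = 1024)
    {
        double[] music = AverageSpectrum(musicRef, frameSize);
        double[] speech = AverageSpectrum(speechRef, frameSize);
        for (int start = 0; start + frameSize <= recording.Length; start += frameSize)
        {
            double[] mag = Spectrum(recording, start, frameSize);
            if (Cosine(mag, speech) > Cosine(mag, music))
                return (double)start / sampleRate;
        }
        return -1;
    }
}
```

A call such as SpeechMusicSketch.FindSpeechStart(recording, musicSample, speechSample, 44100) would return the offset of the first frame that resembles the speech sample more than the music sample. Real recordings are far less clean than two reference spectra suggest, so a practical version would probably need to smooth the decision over several consecutive frames before reporting a start time.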

