Python Speaker Recognition
Problem description
I have an audio file, a recorded telephone conversation between two people, and I need to separate the two speakers' voices automatically. I am new to speech recognition; I looked at Python's wave module but could not find anything useful.
How should I get started? Please also suggest free Python libraries that would help me solve this problem.
Recommended answer
Separating the speakers is not a speech recognition task; it is a speaker recognition task. In the speech community this task is also known as speaker diarization. There are several packages for speaker diarization and speaker recognition available for Python:
If you are not restricted to Python, there are others:
The speaker recognition setup in Kaldi. It includes state-of-the-art DNN-based speaker embeddings called x-vectors, a successor to i-vectors.
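To make the idea of diarization concrete, here is a minimal toy sketch using only the standard library (wave, struct, math). It is not how Kaldi or any real diarization package works. It only shows the general shape of the task: cut the recording into short windows, compute a feature per window, and cluster the windows into two groups. The zero-crossing-rate feature, the 1-D two-means clustering, the synthetic "speakers" (two pure tones), and all function names are assumptions made for illustration; real systems use spectral features and embeddings such as i-vectors or x-vectors.

```python
import math
import struct
import wave

RATE = 8000  # assumed telephone-style sample rate


def write_toy_call(path, turns, rate=RATE):
    """Write a mono 16-bit WAV in which each 'speaker' is a pure tone.

    turns is a list of (frequency_hz, seconds) pairs; this is a toy
    stand-in for a real two-person recording.
    """
    samples = []
    for freq, seconds in turns:
        n = int(rate * seconds)
        samples.extend(
            int(12000 * math.sin(2 * math.pi * freq * i / rate)) for i in range(n)
        )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))


def zero_crossing_rates(path, window_s=0.1):
    """Return one zero-crossing rate per fixed-length window (mono, 16-bit input)."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    size = round(rate * window_s)
    rates = []
    for start in range(0, len(samples) - size + 1, size):
        win = samples[start:start + size]
        crossings = sum(1 for a, b in zip(win, win[1:]) if (a < 0) != (b < 0))
        rates.append(crossings / size)
    return rates


def two_means(values, iterations=20):
    """Cluster 1-D values into two groups; label 0 is the lower-valued group."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if group0:
            lo = sum(group0) / len(group0)
        if group1:
            hi = sum(group1) / len(group1)
    return labels


if __name__ == "__main__":
    import os
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "toy_call.wav")
    # "Speaker A" at 150 Hz, "speaker B" at 400 Hz, then A again.
    write_toy_call(path, [(150, 1.0), (400, 1.0), (150, 0.5)])
    print(two_means(zero_crossing_rates(path)))
```

On a real phone call, one feature per window is far too weak to tell voices apart, which is why dedicated diarization packages are the practical route. The wave module itself only gets you as far as reading the raw samples.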