Python Speaker Recognition

Question

I have an audio file, a recorded telephone conversation between two people, and I need to separate the two speakers' voices automatically. I am new to speech recognition. I looked at Python's wave module but did not find anything useful there.

Please help me get started. Please also suggest free Python libraries that would help me solve this problem.

Recommended answer

Separating the speakers is not a speech recognition task; it is a speaker recognition task. In the speech community this task is also known as speaker diarization. Several packages for speaker diarization and speaker recognition are available for Python:

SIDEKIT from LIUM

Bob toolkit from Idiap

Speaker diarization from ICSI

If you are not restricted to Python, there are others:

LIUM speaker diarization

Speaker recognition setup in Kaldi. It includes x-vectors, state-of-the-art DNN-based speaker embeddings that play the same role as i-vectors.
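To make the idea of diarization concrete, here is a minimal toy sketch using only the Python standard library. It is not how SIDEKIT, Bob, LIUM, or Kaldi work. Everything in it is an assumption made for illustration: two pure tones stand in for the two speakers, the zero-crossing rate is used as a crude pitch-like feature, and the helper names (`tone`, `two_means`, `to_segments`) are hypothetical. It only shows the overall shape of the task: read the audio, cut it into short frames, compute a feature per frame, cluster the frames into two groups, and merge neighbouring frames into "who spoke when" segments.

```python
import io
import math
import struct
import wave

RATE = 8000    # telephone-quality sample rate (assumption)
FRAME = 400    # samples per analysis frame (50 ms at 8 kHz)

def tone(freq_hz, seconds):
    """Synthetic stand-in for one speaker: a pure tone at a fixed pitch."""
    n = int(RATE * seconds)
    return [int(12000 * math.sin(2 * math.pi * freq_hz * i / RATE)) for i in range(n)]

def write_wav(samples):
    """Pack 16-bit mono samples into an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(RATE)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    buf.seek(0)
    return buf

def read_wav(fileobj):
    """Read a 16-bit mono WAV file into a list of ints plus its sample rate."""
    with wave.open(fileobj, "rb") as w:
        raw = w.readframes(w.getnframes())
        rate = w.getframerate()
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw)), rate

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs (rough pitch proxy)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)

def two_means(values, iterations=20):
    """1-D k-means with k=2; label 0 is the cluster with the lower mean."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if group0:
            lo = sum(group0) / len(group0)
        if group1:
            hi = sum(group1) / len(group1)
    return labels

def to_segments(labels, rate):
    """Merge runs of equal frame labels into (speaker, start_s, end_s) tuples."""
    segments, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start * FRAME / rate, i * FRAME / rate))
            start = i
    return segments

# Fake "conversation": speaker A (120 Hz) and speaker B (300 Hz) take turns.
audio = tone(120, 1.0) + tone(300, 1.0) + tone(120, 0.5) + tone(300, 0.5)
samples, rate = read_wav(write_wav(audio))

frames = [samples[i:i + FRAME] for i in range(0, len(samples) - FRAME + 1, FRAME)]
labels = two_means([zero_crossing_rate(f) for f in frames])
segments = to_segments(labels, rate)

for speaker, start_s, end_s in segments:
    print("speaker %d: %.2f s to %.2f s" % (speaker, start_s, end_s))
```

Real speech needs much richer features (for example MFCCs or learned embeddings such as the x-vectors mentioned above), voice activity detection, and handling of overlapping speech, which is what the packages listed in this answer provide.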
