How to Merge MFCCs

Views: 154 | Published: 2020/6/30 21:11:49 | Tags: java audio feature-extraction mfcc tarsosdsp

Problem description

I am working on extracting MFCC features from some audio files. The program I currently have extracts a series of MFCCs for each file and takes a buffer size parameter, which is set to 1024. I saw the following in a paper:

The feature vectors extracted within a second of audio data are combined by computing the mean and the variance of each feature vector element (merging).
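
The merging step quoted above can be sketched in plain Java, independent of any audio library. This is only an illustration under stated assumptions: the class name `MfccMerger` and the method `merge` are hypothetical, the input is assumed to be the list of MFCC vectors that fall within one second, and the population variance (dividing by N) is used because the quoted sentence does not say which variant the paper means.

```java
import java.util.List;

/** Sketch: collapse a block of MFCC frames into one mean/variance vector. */
final class MfccMerger {

    /**
     * Returns an array of length 2n: the n per-element means followed by
     * the n per-element variances (population variance, an assumption).
     */
    static double[] merge(List<float[]> frames) {
        if (frames.isEmpty()) {
            throw new IllegalArgumentException("no frames to merge");
        }
        int n = frames.get(0).length;
        double[] merged = new double[2 * n];

        // First pass: per-element mean.
        for (float[] frame : frames) {
            for (int i = 0; i < n; i++) {
                merged[i] += frame[i];
            }
        }
        for (int i = 0; i < n; i++) {
            merged[i] /= frames.size();
        }

        // Second pass: per-element variance around that mean.
        for (float[] frame : frames) {
            for (int i = 0; i < n; i++) {
                double d = frame[i] - merged[i];
                merged[n + i] += d * d;
            }
        }
        for (int i = 0; i < n; i++) {
            merged[n + i] /= frames.size();
        }
        return merged;
    }
}
```

With 13 coefficients per frame, each second of audio would then be described by a single 26-element vector.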

My current code uses TarsosDSP to extract the MFCCs, but I'm not sure how to split the data into "a second of audio data" in order to merge the MFCCs.

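import be.tarsos.dsp.AudioDispatcher;
import be.tarsos.dsp.AudioEvent;
import be.tarsos.dsp.AudioProcessor;
import be.tarsos.dsp.io.TarsosDSPAudioFormat;
import be.tarsos.dsp.io.UniversalAudioInputStream;
import be.tarsos.dsp.mfcc.MFCC;
import java.io.FileInputStream;
import java.io.InputStream;

int sampleRate = 44100;
int bufferSize = 1024;     // samples per analysis frame
int bufferOverlap = 512;   // samples shared between consecutive frames
InputStream inStream = new FileInputStream(path);
// Input is read as 16-bit, mono, signed, big-endian PCM.
AudioDispatcher dispatcher = new AudioDispatcher(new UniversalAudioInputStream(inStream, new TarsosDSPAudioFormat(sampleRate, 16, 1, true, true)), bufferSize, bufferOverlap);
// 13 cepstral coefficients from 40 mel filters spanning 300 Hz to 3000 Hz.
final MFCC mfcc = new MFCC(bufferSize, sampleRate, 13, 40, 300, 3000);
dispatcher.addAudioProcessor(mfcc);
dispatcher.addAudioProcessor(new AudioProcessor() {
    @Override
    public void processingFinished() {
        System.out.println("DONE");
    }
    @Override
    public boolean process(AudioEvent audioEvent) {
        return true;  // breakpoint here reveals MFCC data
    }
});
dispatcher.run();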

What exactly is the buffer size, and could it be used to segment the audio into windows of 1 second? Is there a method to divide the series of MFCCs into fixed amounts of time?
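
To make the relationship between buffer size and time concrete, here is a rough sketch of the arithmetic. It assumes that the first frame starts at sample 0 and that each new frame advances by `bufferSize - bufferOverlap` samples; with the values in the question that is 1024 - 512 = 512 samples, or about 11.6 ms at 44100 Hz, so roughly 86 frames start in each second. The class `SecondWindows` and its methods are hypothetical names for illustration and are not part of TarsosDSP.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: assign consecutive analysis frames to one-second windows. */
final class SecondWindows {

    /** Index of the one-second window in which frame number frameIndex starts. */
    static int secondOf(long frameIndex, int bufferSize, int bufferOverlap, int sampleRate) {
        long hop = bufferSize - bufferOverlap;         // samples advanced per frame
        return (int) (frameIndex * hop / sampleRate);  // integer division floors
    }

    /** Splits frames (in extraction order) into consecutive one-second groups. */
    static List<List<float[]>> groupBySecond(List<float[]> frames, int bufferSize,
                                             int bufferOverlap, int sampleRate) {
        List<List<float[]>> groups = new ArrayList<>();
        for (int k = 0; k < frames.size(); k++) {
            int second = secondOf(k, bufferSize, bufferOverlap, sampleRate);
            // Create empty groups up to and including the one this frame needs.
            while (groups.size() <= second) {
                groups.add(new ArrayList<>());
            }
            groups.get(second).add(frames.get(k));
        }
        return groups;
    }
}
```

Each resulting group holds the MFCC vectors whose frames begin within the same second. Frames are assigned by their start position, so a frame that straddles a second boundary counts toward the earlier second; whether that matches the paper's convention is not stated in the quoted passage.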

Any help would be greatly appreciated.