How can I obtain the raw audio frames from the microphone in real-time or from a saved audio file in iOS?


Question

I am trying to extract MFCC vectors from the audio signal as input into a recurrent neural network. However, I am having trouble figuring out how to obtain the raw audio frames in Swift using Core Audio. Presumably, I have to go low-level to get that data, but I cannot find helpful resources in this area.

How can I get the audio signal information that I need using Swift?

Edit: This question was flagged as a possible duplicate of How to capture audio samples in iOS with Swift?. However, that particular question does not have the answer that I am looking for. Namely, the solution to that question is the creation of an AVAudioRecorder, which is a component, not the end result, of a solution to my question.

This question How to convert WAV/CAF file's sample data to byte array? is more in the direction of where I am headed. The solutions to that are written in Objective-C, and I am wondering if there is a way to do it in Swift.

Answer

Attaching a tap to the default input node on AVAudioEngine is pretty straightforward and will get you real-time ~100ms chunks of audio from the microphone as Float32 arrays. You don't even have to connect any other audio units. If your MFCC extractor & network are sufficiently responsive this may be the easiest way to go.

import AVFoundation

let audioEngine = AVAudioEngine()
let inputNode = audioEngine.inputNode       // the microphone input

inputNode.installTap(onBus: 0,              // mono input
                     bufferSize: 1000,      // a request, not a guarantee
                     format: nil) {         // nil = no format translation
    buffer, when in

    // This block will be called over and over for successive buffers
    // of microphone data until you stop() the AVAudioEngine.
    let actualSampleCount = Int(buffer.frameLength)

    // buffer.floatChannelData?.pointee[n] holds the sample at point n
    if let channelData = buffer.floatChannelData?.pointee {
        for i in 0..<actualSampleCount {
            let sample = channelData[i]
            // do something with each Float32 sample here...
            _ = sample
        }
    }
}

do {
    try audioEngine.start()
} catch {
    print("Got an error starting audioEngine: \(error)")
}

You will need to request and obtain microphone permission as well.
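As a minimal sketch, one way to request that permission is through `AVAudioSession` (note that iOS will also refuse to record unless your app's Info.plist contains an `NSMicrophoneUsageDescription` entry):

```swift
import AVFoundation

// Request microphone access before starting the engine.
AVAudioSession.sharedInstance().requestRecordPermission { granted in
    if granted {
        // Safe to install the tap and start the AVAudioEngine.
    } else {
        // Explain to the user why audio input is required,
        // or disable the recording feature.
    }
}
```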

I find the amplitudes to be rather low, so you may need to apply some gain or normalization depending on your network's needs.
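As a sketch, peak normalization of one buffer's samples might look like this (the target peak of 1.0 is an assumption; your network may want a different scale):

```swift
// Peak-normalize a chunk of Float32 samples so the largest
// magnitude becomes 1.0. The small floor just avoids dividing
// by zero on silent buffers.
func normalize(_ samples: [Float]) -> [Float] {
    let peak = samples.map { abs($0) }.max() ?? 0
    guard peak > 1e-9 else { return samples }
    return samples.map { $0 / peak }
}

normalize([0.1, -0.5, 0.25])   // -> [0.2, -1.0, 0.5]
```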

To process your WAV files, I'd try AVAssetReader, though I don't have code at hand for that.
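As a rough sketch of that direction (the output settings and the minimal error handling are assumptions, not a tested implementation), AVAssetReader can decode a WAV or CAF file into Float32 PCM:

```swift
import AVFoundation

// Sketch: decode a saved audio file into an array of Float32 samples.
// Channels are interleaved if the file has more than one.
func readSamples(from url: URL) throws -> [Float] {
    let asset = AVURLAsset(url: url)
    guard let track = asset.tracks(withMediaType: .audio).first else { return [] }

    let reader = try AVAssetReader(asset: asset)
    let settings: [String: Any] = [
        AVFormatIDKey: kAudioFormatLinearPCM,
        AVLinearPCMBitDepthKey: 32,
        AVLinearPCMIsFloatKey: true,
        AVLinearPCMIsNonInterleaved: false
    ]
    let output = AVAssetReaderTrackOutput(track: track, outputSettings: settings)
    reader.add(output)
    reader.startReading()

    var samples = [Float]()
    while let sampleBuffer = output.copyNextSampleBuffer(),
          let blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer) {
        let length = CMBlockBufferGetDataLength(blockBuffer)
        var chunk = [Float](repeating: 0, count: length / MemoryLayout<Float>.size)
        chunk.withUnsafeMutableBytes { ptr in
            // Copy the raw PCM bytes out of the CMBlockBuffer.
            CMBlockBufferCopyDataBytes(blockBuffer, atOffset: 0,
                                       dataLength: length,
                                       destination: ptr.baseAddress!)
        }
        samples.append(contentsOf: chunk)
    }
    return samples
}
```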

