Interpreting AudioBuffer.mData to display audio visualization


Problem description

I am trying to process audio data in real-time so that I can display an on-screen spectrum analyzer/visualization based on sound input from the microphone. I am using AVFoundation's AVCaptureAudioDataOutputSampleBufferDelegate to capture the audio data, which triggers the delegate function captureOutput. Function below:

func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {

    autoreleasepool {

        //The parameters of this delegate method are non-optional, so comparing
        //them to nil is invalid Swift; only the readiness check is needed
        guard CMSampleBufferDataIsReady(sampleBuffer) else { return }

        //Check this is AUDIO (and not VIDEO) being received
        if (connection.audioChannels.count > 0)
        {
            //Determine number of frames in buffer
            let numFrames = CMSampleBufferGetNumSamples(sampleBuffer)

            //Get AudioBufferList
            var audioBufferList = AudioBufferList(mNumberBuffers: 1, mBuffers: AudioBuffer(mNumberChannels: 0, mDataByteSize: 0, mData: nil))
            var blockBuffer: CMBlockBuffer?
            CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, nil, &audioBufferList, MemoryLayout<AudioBufferList>.size, nil, nil, UInt32(kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment), &blockBuffer)

            let audioBuffers = UnsafeBufferPointer<AudioBuffer>(start: &audioBufferList.mBuffers, count: Int(audioBufferList.mNumberBuffers))

            for audioBuffer in audioBuffers {
                let data = Data(bytes: audioBuffer.mData!, count: Int(audioBuffer.mDataByteSize))

                //Core Audio LPCM samples arrive in native (little-endian) byte
                //order, so no byte swap is needed; copy inside the closure so
                //the pointer does not escape
                let i16array = data.withUnsafeBytes {
                    Array($0.bindMemory(to: Int16.self))
                }

                for dataItem in i16array
                {
                    print(dataItem)
                }

            }

        }
    }
}

The code above prints positive and negative numbers of type Int16 as expected, but I need help converting these raw numbers into meaningful data, such as power and decibels, for my visualizer.

Answer

I was on the right track... Thanks to RobertHarvey's comment on my question: using the Accelerate Framework's FFT calculation functions is required to achieve a spectrum analyzer. But before using these functions, you need to convert your raw data into an array of type Float, as many of them require a Float array.

Firstly, we load the raw data into a Data object:

//Read data from AudioBuffer into a variable
let data = Data(bytes: audioBuffer.mData!, count: Int(audioBuffer.mDataByteSize))

I like to think of a Data object as a "list" of 1-byte sized chunks of info (8 bits each), but if I check the number of frames I have in my sample and the total size of my Data object in bytes, they don't match:

//Get number of frames in sample and total size of Data
let numFrames = CMSampleBufferGetNumSamples(sampleBuffer) //= 1024 frames in my case
let dataSize = audioBuffer.mDataByteSize //= 2048 bytes in my case

The total size (in bytes) of my data is twice the number of frames I have in my CMSampleBuffer. This means that each frame of audio is 2 bytes in length. In order to read the data meaningfully, I need to convert my Data object, which is a "list" of 1-byte chunks, into an array of 2-byte chunks. Int16 contains 16 bits (or 2 bytes, exactly what we need), so let's create an array of Int16:

//Convert to Int16 array (copying into an Array inside the closure
//so the pointer doesn't escape withUnsafeBytes)
let samples = data.withUnsafeBytes {
    Array($0.bindMemory(to: Int16.self))
}
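To make the byte layout concrete, here is a standalone sketch (with made-up byte values, not real microphone data) showing how pairs of bytes in a Data object combine into little-endian Int16 samples on iOS/macOS hardware:

```swift
import Foundation

//Hypothetical 4-byte buffer: two little-endian Int16 samples
let rawBytes: [UInt8] = [0x34, 0x12, 0xFE, 0xFF]
let byteData = Data(rawBytes)

//Reinterpret the bytes as Int16 values, copying inside the closure
let int16Samples = byteData.withUnsafeBytes {
    Array($0.bindMemory(to: Int16.self))
}
//[0x34, 0x12] -> 0x1234 = 4660; [0xFE, 0xFF] -> 0xFFFE = -2 (signed)
```

This is why 2048 bytes yield exactly 1024 frames: each frame is one 2-byte Int16.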

Now that we have an Array of Int16, we can convert it to an Array of Float:

//Convert to Float Array
let factor = Float(Int16.max)
var floats: [Float] = Array(repeating: 0.0, count: samples.count)
for i in 0..<samples.count {
    floats[i] = Float(samples[i]) / factor
}
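As a design note, the explicit loop above can also be written more compactly with `map`; a sketch using the same `Int16.max` normalization factor (the sample values below are made up for illustration):

```swift
//Example Int16 samples (hypothetical values)
let samples: [Int16] = [0, 16384, Int16.max, Int16.min]

//Normalize to roughly the -1.0...1.0 range
let factor = Float(Int16.max)
let floats = samples.map { Float($0) / factor }
//Int16.max maps to exactly 1.0; Int16.min maps slightly below -1.0 (32768/32767)
```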

Now that we have our Float array, we can now use the Accelerate Framework's complex math to convert the raw Float values into meaningful ones like magnitude, decibels etc. Link to documentation:

Apple's Accelerate Framework

Fast Fourier Transform (FFT)

I found Apple's documentation rather overwhelming. Luckily, I found a really good example online which I was able to re-purpose for my needs, called TempiFFT. Implementation as follows:

//Initiate FFT
let fft = TempiFFT(withSize: numFrames, sampleRate: 44100.0)
fft.windowType = TempiFFTWindowType.hanning

//Pass array of Floats
fft.fftForward(floats)

//I only want to display 20 bands on my analyzer
fft.calculateLinearBands(minFrequency: 0, maxFrequency: fft.nyquistFrequency, numberOfBands: 20)

//Then use a loop to iterate through the bands in your spectrum analyzer
var magnitudeArr = [Float](repeating: Float(0), count: 20)
var magnitudeDBArr = [Float](repeating: Float(0), count: 20)
for i in 0..<20
{
    magnitudeArr[i] = fft.magnitudeAtBand(i)
    magnitudeDBArr[i] = TempiFFT.toDB(fft.magnitudeAtBand(i))
    //..I didn't, but you could perform drawing functions here...
}
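For context on what a dB conversion like `TempiFFT.toDB` is doing, and how the result can drive drawing: decibels relative to full scale are typically 20·log10 of the amplitude, and a small helper can map a dB value into a 0...1 bar height. This is a hedged sketch, not TempiFFT's exact code; the 1e-9 floor and the -60 dB display range are arbitrary choices of mine:

```swift
import Foundation

//Amplitude -> decibels; the 1e-9 floor avoids log10(0) returning -infinity
func toDecibels(_ magnitude: Float) -> Float {
    return 20 * log10(max(magnitude, 1e-9))
}

//Map a dB value into 0...1 for a band's bar height;
//minDB/maxDB are assumed display bounds, not values from TempiFFT
func barHeight(db: Float, minDB: Float = -60, maxDB: Float = 0) -> Float {
    let clamped = min(max(db, minDB), maxDB)
    return (clamped - minDB) / (maxDB - minDB)
}
```

Under these assumptions, a band with magnitude 1.0 is 0 dB (a full bar) and a band with magnitude 0.001 is -60 dB (an empty bar).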

Other useful references:

Converting Data into an Int16 array
