web audio analyser's getFloatTimeDomainData buffer offset wrt buffers at other times and wrt buffer of 'complete file'

Question

(Question rewritten to integrate information from the answers, and made more concise.)

I use analyser=audioContext.createAnalyser() in order to process audio data, and I'm trying to understand the details better.

I choose an fftSize, say 2048, then I create an array buffer of 2048 floats with Float32Array, and then, in an animation loop (called 60 times per second on most machines, via window.requestAnimationFrame), I do

analyser.getFloatTimeDomainData(buffer);

which will fill my buffer with 2048 floating point sample data points.

When the handler is called the next time, 1/60 second has passed. To calculate how much that is in units of samples, we have to divide it by the duration of 1 sample, and get (1/60)/(1/44100) = 735. So the next handler call takes place (on average) 735 samples later.

So there is overlap between subsequent buffers: the last part of one buffer reappears at the beginning of the next.

We know from the spec (search for 'render quantum') that everything happens in chunks of 128 samples. So (in terms of audio processing), one would expect the next handler call to come either 5*128 = 640 samples later or 6*128 = 768 samples later - those being the multiples of 128 closest to 735 samples = (1/60) second.
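For illustration, a quick numeric check of those figures (the 44100 Hz sample rate and the 60 Hz display are assumptions; they vary per machine):

// Back-of-the-envelope check; assumes a 44100 Hz context and a 60 Hz display.
var sampleRate = 44100;                                      // assumed sample rate
var samplesPerFrame = (1 / 60) * sampleRate;                 // ≈ 735 samples per animation frame
var lower = Math.floor(samplesPerFrame / 128) * 128;         // 640
var upper = Math.ceil(samplesPerFrame / 128) * 128;          // 768
console.log(samplesPerFrame, lower, upper);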

Calling this amount "Δ-samples", how do I find out what it is (during each handler call), 640 or 768 or something else?

Reliably, like this:

Consider the 'old buffer' (from the previous handler call). If you delete "Δ-samples" many samples at the beginning, copy the remainder, and then append "Δ-samples" many new samples, that should be the current buffer. And indeed, I tried that, and that is the case. It turns out "Δ-samples" is often 384, 512, or 896. It is trivial but time-consuming to determine "Δ-samples" in a loop.
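A rough sketch of that comparison loop (the helper name is made up; oldBuf and newBuf are the Float32Arrays filled by getFloatTimeDomainData in two consecutive handler calls):

// Hypothetical helper: find Δ-samples by sliding newBuf against oldBuf.
function findDeltaSamples(oldBuf, newBuf, fftSize) {
  for (var delta = 0; delta <= fftSize; delta += 128) {      // assumes the 128-sample render quantum
    var matches = true;
    for (var j = 0; j < fftSize - delta; j++) {
      if (oldBuf[j + delta] !== newBuf[j]) {                 // the overlap must agree sample for sample
        matches = false;
        break;
      }
    }
    if (matches) return delta;                               // e.g. 384, 512, 640, 768, 896
  }
  return -1;                                                 // buffers no longer overlap
}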

I would like to compute "Δ-samples" without performing that loop.

One would think the following would work:

(audioContext.currentTime - (the value of audioContext.currentTime during the previous handler call)) / (duration of 1 sample)
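Inside the handler, that computation might look roughly like this (a sketch; lastTime is a variable kept between calls, and the names are made up):

// Sketch: estimate Δ-samples from the advance of the audio clock.
var lastTime = -1;                                           // remembered between handler calls

function estimateDeltaSamples() {
  var now = audioContext.currentTime;                        // seconds on the audio clock
  var delta = lastTime < 0 ? 0 :
    Math.round((now - lastTime) * audioContext.sampleRate);  // i.e. divided by (1 / sampleRate)
  lastTime = now;
  return delta;                                              // ideally a multiple of 128
}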

I tried that (see the code below, where I also "stitch together" the various buffers, trying to reconstruct the original buffer), and - surprise - it works about 99.9% of the time in Chrome, and about 95% of the time in Firefox.

I also tried audioContext.getOutputTimestamp().contextTime, which does not work in Chrome, and works maybe 90-something percent of the time in Firefox.

Is there any way to find "Δ-samples" (without looking at the buffers) that works reliably?

Second question: the "reconstructed" buffer (all the buffers from the callbacks stitched together) and the original sound buffer are not exactly the same. There is some difference - small, but noticeable, and larger than ordinary rounding error - and it is bigger in Firefox.

Where does that come from? As I understand the spec, they should be identical.

var soundFile = 'https://mathheadinclouds.github.io/audio/sounds/la.mp3';
var audioContext = null;
var isPlaying = false;
var sourceNode = null;
var analyser = null;
var theBuffer = null;
var reconstructedBuffer = null;
var soundRequest = null;
var loopCounter = -1;
var FFT_SIZE = 2048;
var rafID = null;
var buffers = [];
var timesSamples = [];
var timeSampleDiffs = [];
var leadingWaste = 0;

window.addEventListener('load', function() {
  soundRequest = new XMLHttpRequest();
  soundRequest.open("GET", soundFile, true);
  soundRequest.responseType = "arraybuffer";
  //soundRequest.onload = function(evt) {}
  soundRequest.send();
  var btn = document.createElement('button');
  btn.textContent = 'go';
  btn.addEventListener('click', function(evt) {
    goButtonClick(this, evt)
  });
  document.body.appendChild(btn);
});

function goButtonClick(elt, evt) {
  initAudioContext(togglePlayback);
  elt.parentElement.removeChild(elt);
}

function initAudioContext(callback) {
  audioContext = new AudioContext();
  audioContext.decodeAudioData(soundRequest.response, function(buffer) {
    theBuffer = buffer;
    callback();
  });
}

function createAnalyser() {
  analyser = audioContext.createAnalyser();
  analyser.fftSize = FFT_SIZE;
}

function startWithSourceNode() {
  sourceNode.connect(analyser);
  analyser.connect(audioContext.destination);
  sourceNode.start(0);
  isPlaying = true;
  sourceNode.addEventListener('ended', function(evt) {
    sourceNode = null;
    analyser = null;
    isPlaying = false;
    loopCounter = -1;
    window.cancelAnimationFrame(rafID);
    console.log('buffer length', theBuffer.length);
    console.log('reconstructedBuffer length', reconstructedBuffer.length);
    console.log('audio callback called counter', buffers.length);
    console.log('root mean square error', Math.sqrt(checkResult() / theBuffer.length));
    console.log('lengths of time between requestAnimationFrame callbacks, measured in audio samples:');
    console.log(timeSampleDiffs);
    console.log(
      timeSampleDiffs.filter(function(val) {
        return val === 384
      }).length,
      timeSampleDiffs.filter(function(val) {
        return val === 512
      }).length,
      timeSampleDiffs.filter(function(val) {
        return val === 640
      }).length,
      timeSampleDiffs.filter(function(val) {
        return val === 768
      }).length,
      timeSampleDiffs.filter(function(val) {
        return val === 896
      }).length,
      '*',
      timeSampleDiffs.filter(function(val) {
        return val > 896
      }).length,
      timeSampleDiffs.filter(function(val) {
        return val < 384
      }).length
    );
    console.log(
      timeSampleDiffs.filter(function(val) {
        return val === 384
      }).length +
      timeSampleDiffs.filter(function(val) {
        return val === 512
      }).length +
      timeSampleDiffs.filter(function(val) {
        return val === 640
      }).length +
      timeSampleDiffs.filter(function(val) {
        return val === 768
      }).length +
      timeSampleDiffs.filter(function(val) {
        return val === 896
      }).length
    )
  });
  myAudioCallback();
}

function togglePlayback() {
  sourceNode = audioContext.createBufferSource();
  sourceNode.buffer = theBuffer;
  createAnalyser();
  startWithSourceNode();
}

function myAudioCallback(time) {
  ++loopCounter;
  if (!buffers[loopCounter]) {
    buffers[loopCounter] = new Float32Array(FFT_SIZE);
  }
  var buf = buffers[loopCounter];
  analyser.getFloatTimeDomainData(buf);
  var now = audioContext.currentTime;
  var nowSamp = Math.round(audioContext.sampleRate * now);
  timesSamples[loopCounter] = nowSamp;
  var j, sampDiff;
  if (loopCounter === 0) {
    console.log('start sample: ', nowSamp);
    reconstructedBuffer = new Float32Array(theBuffer.length + FFT_SIZE + nowSamp);
    leadingWaste = nowSamp;
    for (j = 0; j < FFT_SIZE; j++) {
      reconstructedBuffer[nowSamp + j] = buf[j];
    }
  } else {
    sampDiff = nowSamp - timesSamples[loopCounter - 1];
    timeSampleDiffs.push(sampDiff);
    var expectedEqual = FFT_SIZE - sampDiff;
    for (j = 0; j < expectedEqual; j++) {
      if (reconstructedBuffer[nowSamp + j] !== buf[j]) {
        console.error('unexpected error', loopCounter, j);
        // debugger;
      }
    }
    for (j = expectedEqual; j < FFT_SIZE; j++) {
      reconstructedBuffer[nowSamp + j] = buf[j];
    }
    //console.log(loopCounter, nowSamp, sampDiff);
  }
  rafID = window.requestAnimationFrame(myAudioCallback);
}

function checkResult() {
  var ch0 = theBuffer.getChannelData(0);
  var ch1 = theBuffer.getChannelData(1);
  var sum = 0;
  var idxDelta = leadingWaste + FFT_SIZE;
  for (var i = 0; i < theBuffer.length; i++) {
    var samp0 = ch0[i];
    var samp1 = ch1[i];
    var samp = (samp0 + samp1) / 2;
    var check = reconstructedBuffer[i + idxDelta];
    var diff = samp - check;
    var sqDiff = diff * diff;
    sum += sqDiff;
  }
  return sum;
}

In the above snippet, I do the following. I load, with XMLHttpRequest, a 1-second mp3 audio file from my github.io page (I sing 'la' for 1 second). After it has loaded, a button saying 'go' is shown, and after pressing that, the audio is played back by putting it into a bufferSource node and calling .start on it. The bufferSource is then fed to our analyser, et cetera.

I also have the snippet code on my github.io page, which makes reading the console easier.

Answer

Unfortunately there is no way to find out the exact point in time at which the data returned by an AnalyserNode was captured. But you might be on the right track with your current approach.

All the values returned by the AnalyserNode are based on the "current-time-domain-data". This is basically the internal buffer of the AnalyserNode at a certain point in time. Since the Web Audio API has a fixed render quantum of 128 samples, I would expect this buffer to evolve in steps of 128 samples as well. But currentTime usually evolves in steps of 128 samples already.
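You can check that quantization yourself; a rough sketch (assuming currentTime really advances in whole render quanta, as the spec's render-quantum model suggests):

// Rough check: express currentTime in samples and look at the remainder modulo 128.
var samplePos = audioContext.currentTime * audioContext.sampleRate;
console.log(samplePos, Math.round(samplePos) % 128);         // the remainder is usually 0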

Furthermore, the AnalyserNode has a smoothingTimeConstant property. It is responsible for "blurring" the returned values. The default value is 0.8. For your use case you probably want to set this to 0.
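For example, in createAnalyser() one could add the line below (note the edit that follows: this only affects the frequency-domain getters, not getFloatTimeDomainData):

analyser.smoothingTimeConstant = 0;   // disable averaging across frames (frequency data only)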

EDIT: As Raymond Toy pointed out in the comments, smoothingTimeConstant only has an effect on the frequency data. Since the question is about getFloatTimeDomainData(), it has no effect on the returned values.

I hope this helps, but I think it would be easier to get all the samples of your audio signal by using an AudioWorklet. It would definitely be more reliable.
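A minimal sketch of that idea (the processor name, file name, and message format are made up for illustration; this is not part of the question's code):

// capture-processor.js (hypothetical worklet module): forward every 128-sample block to the main thread.
class CaptureProcessor extends AudioWorkletProcessor {
  process(inputs, outputs, parameters) {
    var input = inputs[0];
    if (input.length > 0) {
      this.port.postMessage(input[0].slice(0));  // copy, because the engine reuses the underlying buffer
    }
    return true;                                 // keep the processor alive
  }
}
registerProcessor('capture-processor', CaptureProcessor);

// Main thread (hypothetical wiring):
// await audioContext.audioWorklet.addModule('capture-processor.js');
// var captureNode = new AudioWorkletNode(audioContext, 'capture-processor');
// sourceNode.connect(captureNode).connect(audioContext.destination);
// captureNode.port.onmessage = function(evt) { /* evt.data is a Float32Array of 128 samples */ };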
