How do you do bicubic (or other non-linear) interpolation of re-sampled audio data?

Question

I'm writing some code that plays back WAV files at different speeds, so that the wave is either slower and lower-pitched, or faster and higher-pitched. I'm currently using simple linear interpolation, like so:

            int newlength = (int)Math.Round(rawdata.Length * lengthMultiplier);
            float[] output = new float[newlength];

            for (int i = 0; i < newlength; i++)
            {
                float realPos = i / lengthMultiplier;
                int iLow = (int)realPos;
                int iHigh = iLow + 1;
                float remainder = realPos - (float)iLow;

                float lowval = 0;
                float highval = 0;
                if ((iLow >= 0) && (iLow < rawdata.Length))
                {
                    lowval = rawdata[iLow];
                }
                if ((iHigh >= 0) && (iHigh < rawdata.Length))
                {
                    highval = rawdata[iHigh];
                }

                output[i] = (highval * remainder) + (lowval * (1 - remainder));
            }

This works fine, but it tends to sound OK only when I lower the frequency of the playback (i.e. slow it down). If I raise the pitch on playback, this method tends to produce high-frequency artifacts, presumably because of the loss of sample information.

I know that bicubic and other interpolation methods resample using more than just the two nearest sample values as in my code example, but I can't find any good code samples (C# preferably) that I could plug in to replace my linear interpolation method here.

Does anyone know of any good examples, or can anyone write a simple bicubic interpolation method? I'll bounty this if I have to. :)

Update: here are a couple of C# implementations of interpolation methods (thanks to Donnie DeBoer for the first one and nosredna for the second):

    public static float InterpolateCubic(float x0, float x1, float x2, float x3, float t)
    {
        float a0, a1, a2, a3;
        a0 = x3 - x2 - x0 + x1;
        a1 = x0 - x1 - a0;
        a2 = x2 - x0;
        a3 = x1;
        return (a0 * (t * t * t)) + (a1 * (t * t)) + (a2 * t) + (a3);
    }

    public static float InterpolateHermite4pt3oX(float x0, float x1, float x2, float x3, float t)
    {
        float c0 = x1;
        float c1 = .5F * (x2 - x0);
        float c2 = x0 - (2.5F * x1) + (2 * x2) - (.5F * x3);
        float c3 = (.5F * (x3 - x0)) + (1.5F * (x1 - x2));
        return (((((c3 * t) + c2) * t) + c1) * t) + c0;
    }

In these functions, x1 is the sample value immediately before the point you're trying to estimate and x2 is the sample value immediately after it. x0 is the sample to the left of x1, and x3 is the sample to the right of x2. t runs from 0 to 1 and is the fractional distance from the x1 sample to the point you're estimating.
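
As a rough sketch of how the Hermite function could replace the linear step in the original loop: the class name `HermiteResampler` and the helpers `SampleAt` and `Resample` below are made up for illustration, and treating samples outside the array as 0 is an assumption carried over from the bounds checks in the linear version. The position is kept in a `double` here only to limit rounding drift on long files.

```csharp
using System;

public static class HermiteResampler
{
    // Same 4-point, 3rd-order Hermite formula as InterpolateHermite4pt3oX above.
    public static float Hermite(float x0, float x1, float x2, float x3, float t)
    {
        float c0 = x1;
        float c1 = 0.5f * (x2 - x0);
        float c2 = x0 - (2.5f * x1) + (2f * x2) - (0.5f * x3);
        float c3 = (0.5f * (x3 - x0)) + (1.5f * (x1 - x2));
        return (((((c3 * t) + c2) * t) + c1) * t) + c0;
    }

    // Hypothetical helper: returns rawdata[i], or 0 (silence) when i is out of range.
    private static float SampleAt(float[] rawdata, int i)
    {
        return (i >= 0 && i < rawdata.Length) ? rawdata[i] : 0f;
    }

    // Hypothetical helper: resamples rawdata to Round(Length * lengthMultiplier) samples.
    public static float[] Resample(float[] rawdata, double lengthMultiplier)
    {
        int newLength = (int)Math.Round(rawdata.Length * lengthMultiplier);
        float[] output = new float[newLength];

        for (int i = 0; i < newLength; i++)
        {
            double realPos = i / lengthMultiplier;   // position in the source data
            int i1 = (int)Math.Floor(realPos);       // index of x1 (sample before the point)
            float t = (float)(realPos - i1);         // fractional distance from x1, in [0, 1)

            output[i] = Hermite(
                SampleAt(rawdata, i1 - 1),  // x0
                SampleAt(rawdata, i1),      // x1
                SampleAt(rawdata, i1 + 1),  // x2
                SampleAt(rawdata, i1 + 2),  // x3
                t);
        }
        return output;
    }
}
```

Calling `HermiteResampler.Resample(rawdata, lengthMultiplier)` would then stand in for the body of the linear loop. Only the interpolation changes; no filtering is applied, so this does not by itself remove artifacts caused by speeding the wave up.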

The Hermite method seems to work pretty well, and appears to reduce the noise somewhat. More importantly it seems to sound better when the wave is sped up.

Recommended answer

My favorite resource for audio interpolation (especially in resampling applications) is Olli Niemitalo's "Elephant" paper.

I've used a couple of these and they sound terrific (much better than a straight cubic solution, which is relatively noisy). There are spline forms, Hermite forms, Watte, parabolic, etc. And they are discussed from an audio point-of-view. This is not just your typical naive polynomial fitting.

And code is included!

To decide which to use, you probably want to start with the table on page 60, which groups the algorithms by operator complexity (how many multiplies and how many adds). Then choose among the solutions with the best signal-to-noise ratio, using your ears as a guide for the final choice. Note: generally, the higher the SNR, the better.
