How do you do bicubic (or other non-linear) interpolation of re-sampled audio data?

Question

I'm writing some code that plays back WAV files at different speeds, so that the wave is either slower and lower-pitched, or faster and higher-pitched. I'm currently using simple linear interpolation, like so:

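    // Number of output samples: a multiplier > 1 stretches the wave (slower, lower-pitched).
    int newlength = (int)Math.Round(rawdata.Length * lengthMultiplier);
    float[] output = new float[newlength];

    for (int i = 0; i < newlength; i++)
    {
        // Fractional position in the source data that output sample i maps to.
        float realPos = i / lengthMultiplier;
        int iLow = (int)realPos;
        int iHigh = iLow + 1;
        float remainder = realPos - (float)iLow;

        // Neighbours that fall outside the source data are treated as silence (0).
        float lowval = 0;
        float highval = 0;
        if ((iLow >= 0) && (iLow < rawdata.Length))
        {
            lowval = rawdata[iLow];
        }
        if ((iHigh >= 0) && (iHigh < rawdata.Length))
        {
            highval = rawdata[iHigh];
        }

        // Weighted average of the two nearest samples.
        output[i] = (highval * remainder) + (lowval * (1 - remainder));
    }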

This works fine, but it tends to sound OK only when I lower the frequency of the playback (i.e. slow it down). If I raise the pitch on playback, this method tends to produce high-frequency artifacts, presumably because of the loss of sample information.

I know that bicubic and other interpolation methods resample using more than just the two nearest sample values as in my code example, but I can't find any good code samples (C# preferably) that I could plug in to replace my linear interpolation method here.

Does anyone know of any good examples, or can anyone write a simple bicubic interpolation method? I'll bounty this if I have to. :)

Update: here are a couple of C# implementations of interpolation methods (thanks to Donnie DeBoer for the first one and nosredna for the second):

    public static float InterpolateCubic(float x0, float x1, float x2, float x3, float t)
    {
        float a0, a1, a2, a3;
        a0 = x3 - x2 - x0 + x1;
        a1 = x0 - x1 - a0;
        a2 = x2 - x0;
        a3 = x1;
        return (a0 * (t * t * t)) + (a1 * (t * t)) + (a2 * t) + (a3);
    }

    public static float InterpolateHermite4pt3oX(float x0, float x1, float x2, float x3, float t)
    {
        float c0 = x1;
        float c1 = .5F * (x2 - x0);
        float c2 = x0 - (2.5F * x1) + (2 * x2) - (.5F * x3);
        float c3 = (.5F * (x3 - x0)) + (1.5F * (x1 - x2));
        return (((((c3 * t) + c2) * t) + c1) * t) + c0;
    }

In these functions, x1 is the sample value just before the point you're trying to estimate and x2 is the sample value just after it. x0 is the sample to the left of x1, and x3 is the sample to the right of x2. t goes from 0 to 1 and is the fractional distance from the x1 point to the point you're estimating.
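As a rough sketch of how a 4-point interpolator could be dropped into the resampling loop from the question: the class name `CubicResampler`, the `SampleAt` helper, and the choice to clamp out-of-range neighbours to the nearest valid sample are all assumptions made for illustration, not part of the original code. The `Hermite` method repeats the `InterpolateHermite4pt3oX` formula so that the block is self-contained.

```csharp
using System;

public static class CubicResampler
{
    // Same 4-point, 3rd-order Hermite formula as InterpolateHermite4pt3oX.
    public static float Hermite(float x0, float x1, float x2, float x3, float t)
    {
        float c0 = x1;
        float c1 = 0.5f * (x2 - x0);
        float c2 = x0 - (2.5f * x1) + (2f * x2) - (0.5f * x3);
        float c3 = (0.5f * (x3 - x0)) + (1.5f * (x1 - x2));
        return (((((c3 * t) + c2) * t) + c1) * t) + c0;
    }

    // Assumption: indices outside the data are clamped to the first/last sample
    // (the linear version in the question uses 0 for them instead).
    private static float SampleAt(float[] data, int index)
    {
        if (index < 0) return data[0];
        if (index >= data.Length) return data[data.Length - 1];
        return data[index];
    }

    public static float[] Resample(float[] rawdata, double lengthMultiplier)
    {
        if (rawdata.Length == 0 || lengthMultiplier <= 0) return new float[0];

        int newLength = (int)Math.Round(rawdata.Length * lengthMultiplier);
        float[] output = new float[newLength];

        for (int i = 0; i < newLength; i++)
        {
            double realPos = i / lengthMultiplier;
            int i1 = (int)realPos;            // index of x1, the sample just before realPos
            float t = (float)(realPos - i1);  // fractional distance from x1, in [0, 1)

            output[i] = Hermite(
                SampleAt(rawdata, i1 - 1),    // x0
                SampleAt(rawdata, i1),        // x1
                SampleAt(rawdata, i1 + 1),    // x2
                SampleAt(rawdata, i1 + 2),    // x3
                t);
        }
        return output;
    }
}
```

Calling `CubicResampler.Resample(rawdata, 2.0)` would then produce roughly twice as many samples. Clamping at the edges is only one option; padding with zeros, as the linear version does, would behave differently for the first and last few output samples.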

The Hermite method seems to work pretty well and appears to reduce the noise somewhat. More importantly, it seems to sound better when the wave is sped up.

Recommended answer

My favorite resource for audio interpolation (especially in resampling applications) is Olli Niemitalo's "Elephant" paper.

I've used a couple of these and they sound terrific (much better than a straight cubic solution, which is relatively noisy). There are spline forms, Hermite forms, Watte, parabolic, and so on, all discussed from an audio point of view. This is not just your typical naive polynomial fitting.

And code is included!

To decide which to use, you probably want to start with the table on page 60, which groups the algorithms by operator complexity (how many multiplies and how many adds). Then choose among the solutions with the best signal-to-noise ratio (SNR), using your ears as a guide to make the final choice. Note: generally, the higher the SNR, the better.
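If you want a number to go alongside your ears, one possible way to compare candidates is to interpolate a sine wave whose exact values are known and treat the difference as noise. The sketch below does that; the class name `InterpolatorSnr`, the method names, and the default parameters are hypothetical, and this simple sine test is not necessarily how the paper arrives at its own SNR figures.

```csharp
using System;

public static class InterpolatorSnr
{
    // Linear interpolation written with the 4-point signature, for comparison only.
    public static float Linear(float x0, float x1, float x2, float x3, float t)
    {
        return x1 + (t * (x2 - x1));
    }

    // Same formula as InterpolateHermite4pt3oX from the question.
    public static float Hermite(float x0, float x1, float x2, float x3, float t)
    {
        float c0 = x1;
        float c1 = 0.5f * (x2 - x0);
        float c2 = x0 - (2.5f * x1) + (2f * x2) - (0.5f * x3);
        float c3 = (0.5f * (x3 - x0)) + (1.5f * (x1 - x2));
        return (((((c3 * t) + c2) * t) + c1) * t) + c0;
    }

    // Estimates SNR in dB for a sine with the given frequency in cycles per sample.
    // Assumption: a single sine is a reasonable stand-in for the audio of interest.
    public static double MeasureSnrDb(
        Func<float, float, float, float, float, float> interpolate,
        double cyclesPerSample,
        int sampleCount = 4096,
        int stepsPerSample = 8)
    {
        float[] data = new float[sampleCount];
        for (int n = 0; n < sampleCount; n++)
        {
            data[n] = (float)Math.Sin(2.0 * Math.PI * cyclesPerSample * n);
        }

        double signalPower = 0.0;
        double noisePower = 0.0;

        // Skip the edges so that all four neighbours always exist.
        for (int n = 1; n < sampleCount - 2; n++)
        {
            for (int k = 0; k < stepsPerSample; k++)
            {
                float t = (float)k / stepsPerSample;
                double exact = Math.Sin(2.0 * Math.PI * cyclesPerSample * (n + t));
                double estimate = interpolate(data[n - 1], data[n], data[n + 1], data[n + 2], t);
                double error = estimate - exact;

                signalPower += exact * exact;
                noisePower += error * error;
            }
        }

        return 10.0 * Math.Log10(signalPower / noisePower);
    }
}
```

For example, `InterpolatorSnr.MeasureSnrDb(InterpolatorSnr.Hermite, 0.05)` can be compared against the same call with `InterpolatorSnr.Linear`. Results will depend on the test frequency, so it is worth trying several values of `cyclesPerSample` before drawing conclusions.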
