Graphing the pitch (frequency) of a sound

Question

I want to plot the pitch of a sound on a graph.

Currently I can plot the amplitude. The graph below is created from the data returned by getUnscaledAmplitude():

AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new BufferedInputStream(new FileInputStream(file)));
byte[] bytes = new byte[(int) (audioInputStream.getFrameLength()) * (audioInputStream.getFormat().getFrameSize())];
audioInputStream.read(bytes);

// Get amplitude values for each audio channel in an array.
graphData = type.getUnscaledAmplitude(bytes, 1);


public int[][] getUnscaledAmplitude(byte[] eightBitByteArray, int nbChannels)
{
    int[][] toReturn = new int[nbChannels][eightBitByteArray.length / (2 * nbChannels)];
    int index = 0;

    for (int audioByte = 0; audioByte < eightBitByteArray.length;)
    {
        for (int channel = 0; channel < nbChannels; channel++)
        {
            // Combine two bytes into one signed 16-bit sample (little-endian: low byte first).
            int low = (int) eightBitByteArray[audioByte];
            audioByte++;
            int high = (int) eightBitByteArray[audioByte];
            audioByte++;
            int sample = (high << 8) + (low & 0x00ff);

            toReturn[channel][index] = sample;
        }
        index++;
    }

    return toReturn;
}

But I need to show the audio's pitch, not its amplitude. The fast Fourier transform appears to give the pitch, but it seems to need more variables than the raw bytes I have, and it is very complex and mathematical.

Is there a way I can do this?

Recommended answer

Frequency (an objective metric) is not the same as pitch (a subjective quantity). In general, pitch detection is a very tricky problem.

Assuming you just want to graph the frequency content for now, you have little choice but to use the FFT, as it is THE method to obtain the frequency spectrum of time-domain data. (Well, there are other methods, such as the discrete cosine transform, but they're just as tricky to implement, and more tricky to interpret.)

If you're struggling with the implementation of the FFT, note that it's really just an efficient algorithm for calculating the discrete Fourier transform (DFT); see http://en.wikipedia.org/wiki/Discrete_Fourier_transform. The basic DFT algorithm is much easier (just two nested loops), but runs a lot slower (O(N^2) rather than O(N log N)).
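To make the "two nested loops" concrete, here is a minimal sketch of the direct O(N^2) DFT in Java. The class name NaiveDft and the helpers magnitudes, peakBin and binToHz are made up for this illustration and are not part of any library. It assumes the samples have already been converted to a double[] (for example from one channel of getUnscaledAmplitude) and that you know the sample rate, which can be read from audioInputStream.getFormat().getSampleRate().

```java
// Hypothetical helper class for illustration only; not from any library.
final class NaiveDft {

    /**
     * Direct O(N^2) DFT of a real signal.
     * Returns the magnitude |X[k]| for bins k = 0 .. N/2; for real input
     * the remaining bins mirror these and carry no extra information.
     */
    static double[] magnitudes(double[] samples) {
        int n = samples.length;
        double[] mags = new double[n / 2 + 1];
        for (int k = 0; k < mags.length; k++) {      // outer loop: one pass per frequency bin
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {            // inner loop: sum over all samples
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    /** Index of the largest bin, ignoring bin 0 (the DC offset). Assumes at least two bins. */
    static int peakBin(double[] mags) {
        int best = 1;
        for (int k = 2; k < mags.length; k++) {
            if (mags[k] > mags[best]) {
                best = k;
            }
        }
        return best;
    }

    /** Centre frequency in Hz of a bin, given the sample rate and the DFT length n. */
    static double binToHz(int bin, double sampleRate, int n) {
        return bin * sampleRate / n;
    }
}
```

The sample rate is the one extra variable beyond the raw samples: bin k of an N-point transform corresponds to k * sampleRate / N Hz. Keep in mind that the strongest bin is only a rough stand-in for pitch, for the reasons given at the start of this answer.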

If you wish to do anything more complex than simply plotting frequency content (like pitch detection, or windowing, as others have suggested), I'm afraid you are going to have to learn what the maths means.
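As a small illustration of what windowing involves, the sketch below applies a Hann window to a block of samples before it would be handed to a DFT or FFT. The class name HannWindow and its methods are hypothetical, and the symmetric form of the window is an assumption; other windows and conventions exist, and which one suits your data is exactly where the maths comes in.

```java
// Hypothetical helper class for illustration only; not from any library.
final class HannWindow {

    /** Symmetric Hann window: w[i] = 0.5 * (1 - cos(2*pi*i / (n - 1))). */
    static double[] coefficients(int n) {
        double[] w = new double[n];
        if (n == 1) {
            w[0] = 1.0;  // a single-sample window is conventionally 1
            return w;
        }
        for (int i = 0; i < n; i++) {
            w[i] = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
        }
        return w;
    }

    /** Multiplies each integer sample by the matching window coefficient. */
    static double[] apply(int[] samples) {
        double[] w = coefficients(samples.length);
        double[] out = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = samples[i] * w[i];
        }
        return out;
    }
}
```

The window tapers the block towards zero at both ends, which reduces the spectral leakage caused by cutting a continuous sound into finite chunks.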
