Graphing the pitch (frequency) of a sound

Problem description

I want to plot the pitch of a sound into a graph.

Currently I can plot the amplitude. The graph below is created by the data returned by getUnscaledAmplitude():

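// Read the whole file into memory. read() may return fewer bytes than requested, so loop until full.
AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new BufferedInputStream(new FileInputStream(file)));
byte[] bytes = new byte[(int) audioInputStream.getFrameLength() * audioInputStream.getFormat().getFrameSize()];
int offset = 0, n;
while (offset < bytes.length && (n = audioInputStream.read(bytes, offset, bytes.length - offset)) > 0)
    offset += n;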

// Get amplitude values for each audio channel in an array.
graphData = type.getUnscaledAmplitude(bytes, 1);


public int[][] getUnscaledAmplitude(byte[] eightBitByteArray, int nbChannels)
{
    int[][] toReturn = new int[nbChannels][eightBitByteArray.length / (2 * nbChannels)];
    int index = 0;

    for (int audioByte = 0; audioByte < eightBitByteArray.length;)
    {
        for (int channel = 0; channel < nbChannels; channel++)
        {
            // Combine two bytes into one signed sample (assumes 16-bit little-endian PCM).
            int low = (int) eightBitByteArray[audioByte];
            audioByte++;
            int high = (int) eightBitByteArray[audioByte];
            audioByte++;
            int sample = (high << 8) + (low & 0x00ff);

            toReturn[channel][index] = sample;
        }
        index++;
    }

    return toReturn;
}

But I need to show the audio's pitch, not its amplitude. The fast Fourier transform (FFT) appears to give the pitch, but it seems to need more variables than the raw bytes I have, and it is very complex and mathematical.

Is there a way I can do this?

Recommended answer

Frequency (an objective metric) is not the same as pitch (a subjective quantity). In general, pitch detection is a very tricky problem.

Assuming you just want to graph the frequency content for now, you have little choice but to use the FFT, as it is the standard method for obtaining the frequency spectrum of time-domain data. (There are other methods, such as the discrete cosine transform, but they're just as tricky to implement and trickier to interpret.)

If you're struggling with the implementation of the FFT, note that it's really just an efficient algorithm for calculating the discrete Fourier transform (DFT); see http://en.wikipedia.org/wiki/Discrete_Fourier_transform. The basic DFT algorithm is much easier (just two nested loops), but runs a lot slower (O(N^2) rather than O(N log N)).
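As a rough illustration of the "two nested loops" form, here is a minimal sketch of a naive DFT in Java. The class and method names (NaiveDft, magnitudeSpectrum, peakFrequency) are hypothetical, and the sketch assumes the samples have already been converted to doubles and that the sample rate is known (for example from audioInputStream.getFormat().getSampleRate()). Bin k of an N-sample transform corresponds to a frequency of k * sampleRate / N.

```java
final class NaiveDft {
    /**
     * Returns the magnitudes of DFT bins 0..N/2 for a real-valued signal.
     * Runs in O(N^2) time, so it is only practical for short frames.
     */
    static double[] magnitudeSpectrum(double[] samples) {
        int n = samples.length;
        double[] magnitudes = new double[n / 2 + 1];
        for (int k = 0; k < magnitudes.length; k++) {      // one pass per frequency bin
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {                  // one pass over the samples
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            magnitudes[k] = Math.hypot(re, im);
        }
        return magnitudes;
    }

    /**
     * Returns the frequency in Hz of the strongest bin, skipping DC (bin 0).
     * Assumes at least 2 samples. This is the dominant frequency, which is
     * not necessarily the perceived pitch.
     */
    static double peakFrequency(double[] samples, double sampleRate) {
        double[] mags = magnitudeSpectrum(samples);
        int best = 1;
        for (int k = 2; k < mags.length; k++) {
            if (mags[k] > mags[best]) {
                best = k;
            }
        }
        return best * sampleRate / samples.length;
    }

    public static void main(String[] args) {
        // Synthetic 440 Hz tone, 800 samples at 8000 Hz (10 Hz per bin).
        double sampleRate = 8000.0;
        double[] tone = new double[800];
        for (int t = 0; t < tone.length; t++) {
            tone[t] = Math.sin(2.0 * Math.PI * 440.0 * t / sampleRate);
        }
        System.out.println(peakFrequency(tone, sampleRate)); // prints 440.0
    }
}
```

To plot frequency over time, one option is to split the samples into short frames (say, a few hundred to a few thousand samples each) and compute the spectrum or peak frequency of each frame in turn.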

If you wish to do anything more complex than simply plotting frequency content (such as pitch detection, or windowing, as others have suggested), I'm afraid you are going to have to learn what the maths means.
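For a sense of what windowing involves, here is a minimal sketch that applies a Hann window to one frame of samples before it is transformed. Tapering the ends of a frame toward zero reduces spectral leakage. The class name HannWindow is hypothetical, and the Hann window is only one common choice, not something the original answer prescribes.

```java
final class HannWindow {
    /**
     * Returns a copy of the frame multiplied by a Hann window:
     * w[i] = 0.5 * (1 - cos(2*pi*i / (N - 1))).
     * Frames shorter than 2 samples are returned unchanged.
     */
    static double[] apply(double[] frame) {
        int n = frame.length;
        double[] out = frame.clone();
        if (n < 2) {
            return out;
        }
        for (int i = 0; i < n; i++) {
            double w = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
            out[i] = frame[i] * w;
        }
        return out;
    }
}
```

The windowed frame would then be passed to the DFT or FFT in place of the raw frame.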
