Java algorithm for normalizing audio


Question

I'm trying to normalize an audio file of speech.

Specifically, where an audio file contains peaks in volume, I'm trying to level it out, so that the quiet sections become louder and the peaks become quieter.

I know very little about audio manipulation beyond what I've learnt from working on this task. Also, my math is embarrassingly weak.

I've done some research, and the Xuggle site provides a sample that shows reducing the volume using the following code (full version: http://build.xuggle.com/view/Stable/job/xuggler_jdk5_stable/ws/workingcopy/src/com/xuggle/mediatool/demos/ModifyAudioAndVideo.java):

@Override
public void onAudioSamples(IAudioSamplesEvent event)
{
  // get the raw audio bytes and adjust their value

  ShortBuffer buffer = event.getAudioSamples().getByteBuffer().asShortBuffer();
  for (int i = 0; i < buffer.limit(); ++i)
    buffer.put(i, (short)(buffer.get(i) * mVolume));

  super.onAudioSamples(event);
}

Here, they modify the samples returned by getAudioSamples() by a constant factor, mVolume.
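As a side note on the cast in that loop: `(short)(sample * mVolume)` is safe while the factor is at most 1, but a factor above 1 can push the product outside the 16-bit range, where the cast wraps around instead of clipping. The following is a minimal sketch of a clamped variant, not part of the Xuggle demo; the class name `GainSketch` and the helpers `scale` and `applyGain` are hypothetical names chosen for illustration.

```java
import java.nio.ShortBuffer;

public class GainSketch {
    /** Scales one 16-bit sample by gain, clamping to the short range instead of wrapping. */
    static short scale(short sample, double gain) {
        long scaled = Math.round(sample * gain);
        if (scaled > Short.MAX_VALUE) return Short.MAX_VALUE;
        if (scaled < Short.MIN_VALUE) return Short.MIN_VALUE;
        return (short) scaled;
    }

    /** Applies the gain in place to every sample in the buffer (absolute get/put). */
    static void applyGain(ShortBuffer buffer, double gain) {
        for (int i = 0; i < buffer.limit(); ++i) {
            buffer.put(i, scale(buffer.get(i), gain));
        }
    }

    public static void main(String[] args) {
        ShortBuffer buffer = ShortBuffer.wrap(new short[] {1000, -20000, 30000});
        applyGain(buffer, 2.0);
        // prints "2000 -32768 32767": the two large samples are clipped, not wrapped
        System.out.println(buffer.get(0) + " " + buffer.get(1) + " " + buffer.get(2));
    }
}
```

Clipping is still a form of distortion, so this only guards against the far harsher wrap-around artifacts.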

Building on this approach, I've attempted a normalisation that modifies the bytes in getAudioSamples() to a normalised value, considering the max/min in the file (see below for details). I have a simple filter to leave "silence" alone (i.e., anything below a threshold value).

I'm finding that the output file is very noisy (i.e., the quality is seriously degraded). I assume that the error is either in my normalisation algorithm or in the way I manipulate the bytes. However, I'm unsure where to go next.

Here's an abridged version of what I'm currently doing.

Step 1: Read the full audio file and find the highest and lowest values of buffer.get() across all AudioSamples:

    @Override
    public void onAudioSamples(IAudioSamplesEvent event) {
        IAudioSamples audioSamples = event.getAudioSamples();
        ShortBuffer buffer = 
           audioSamples.getByteBuffer().asShortBuffer();

        short min = Short.MAX_VALUE;
        short max = Short.MIN_VALUE;
        for (int i = 0; i < buffer.limit(); ++i) {
            short value = buffer.get(i);
            min = (short) Math.min(min, value);
            max = (short) Math.max(max, value);
        }
        // assignment of min/max omitted for brevity.
        super.onAudioSamples(event);

    }
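Because audio samples swing symmetrically around zero, the scan can also be summarized as a single peak magnitude kept across all callbacks. The sketch below is one possible way to do that, assuming the same `ShortBuffer` view as above; `PeakTracker`, `scan` and `getPeak` are hypothetical names, not Xuggle API. It widens each sample to `int` before taking the absolute value, since -32768 has no positive counterpart in a `short`.

```java
import java.nio.ShortBuffer;

public class PeakTracker {
    // Largest absolute sample value seen so far, in the range 0..32768.
    private int peak = 0;

    /** Updates the running peak with every sample in one buffer. */
    void scan(ShortBuffer buffer) {
        for (int i = 0; i < buffer.limit(); ++i) {
            // Widen to int first so that Math.abs(-32768) yields 32768.
            int magnitude = Math.abs((int) buffer.get(i));
            peak = Math.max(peak, magnitude);
        }
    }

    int getPeak() {
        return peak;
    }

    public static void main(String[] args) {
        PeakTracker tracker = new PeakTracker();
        // Two buffers stand in for two successive onAudioSamples callbacks.
        tracker.scan(ShortBuffer.wrap(new short[] {120, -9500, 3000}));
        tracker.scan(ShortBuffer.wrap(new short[] {-32768, 10}));
        System.out.println(tracker.getPeak()); // prints 32768
    }
}
```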

Step 2: Normalize all values:

In a loop similar to step 1, replace the buffer contents with normalized values by calling:

    buffer.put(i, normalize(buffer.get(i)));

public short normalize(short value) {
    if (isBackgroundNoise(value))
        return value;

    short rawMin = // min from step1
    short rawMax = // max from step1
    short targetRangeMin = 1000;
    short targetRangeMax = 8000;

    int abs = Math.abs(value);
    double a = (abs - rawMin) * (targetRangeMax - targetRangeMin);
    double b = (rawMax - rawMin);
    double result = targetRangeMin + ( a/b );

     // Copy the sign of value to result.
    result = Math.copySign(result,value);
    return (short) result;
}

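Questions:


  • Is this a valid approach for attempting to normalize an audio file?

  • Is my math in normalize() valid?

  • Why would this cause the file to become noisy, when a similar approach in the demo code doesn't?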

Recommended answer

    I don't think the concept of a "minimum sample value" is very meaningful, since the sample value just represents the current "height" of the sound wave at a certain instant in time. That is, its absolute value will vary between zero and the peak value of the audio clip. Thus, having a targetRangeMin seems to be wrong and will probably cause some distortion of the waveform.

    I think a better approach might be to have some sort of weight function that decreases the sample value based on its size. That is, bigger values are decreased by a larger percentage than smaller values. This would also introduce some distortion, but it would probably not be very noticeable.

    Edit: here is a sample implementation of such a method:

    public short normalize(short value) {
        short rawMax = // max from step1
        short targetMax = 8000;
    
        //This is the maximum volume reduction
        double maxReduce = 1 - targetMax/(double)rawMax;
    
        int abs = Math.abs(value);
        double factor = (maxReduce * abs/(double)rawMax);
    
        return (short) Math.round((1 - factor) * value); 
    }
    

    For reference, this is what your algorithm did to a sine curve with an amplitude of 10000:

    This explains why the audio quality becomes much worse after being normalized.
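The effect can also be checked numerically. The sketch below re-runs the question's mapping on a few sample values, assuming a sine wave of amplitude 10000 so that step 1 would report rawMin = -10000 and rawMax = 10000. The background-noise filter is left out because its definition is not shown, and `RangeMappingDemo` and `mapToRange` are hypothetical names used only here.

```java
public class RangeMappingDemo {
    /** The question's mapping, with rawMin/rawMax passed in and the noise filter omitted. */
    static short mapToRange(short value, short rawMin, short rawMax) {
        short targetRangeMin = 1000;
        short targetRangeMax = 8000;

        int abs = Math.abs(value);
        double a = (abs - rawMin) * (targetRangeMax - targetRangeMin);
        double b = (rawMax - rawMin);
        double result = targetRangeMin + (a / b);

        // Copy the sign of value to result.
        return (short) Math.copySign(result, value);
    }

    public static void main(String[] args) {
        // Assumed extremes for a sine wave of amplitude 10000.
        short rawMin = -10000, rawMax = 10000;
        for (short v : new short[] {-10000, -100, -1, 0, 1, 100, 10000}) {
            System.out.println(v + " -> " + mapToRange(v, rawMin, rawMax));
        }
    }
}
```

Under these assumptions, neighbouring inputs of -1 and 1 map to -4500 and 4500, so the output jumps by about 9000 at every zero crossing, and no sample ever falls below a magnitude of 4500. A smooth sine turns into something close to a square wave, which is heard as harsh noise.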

    This is the result after running with my suggested normalize method:
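For comparison, here is a runnable sketch of the suggested method applied to the same kind of input. It takes rawMax and targetMax as parameters instead of reading them from step 1, which is an assumption made so the example is self-contained; `WeightedReductionDemo` is a hypothetical class name.

```java
public class WeightedReductionDemo {
    /** The suggested normalize, with rawMax and targetMax passed in as parameters. */
    static short normalize(short value, short rawMax, short targetMax) {
        // Maximum volume reduction, applied in full only at the peak.
        double maxReduce = 1 - targetMax / (double) rawMax;

        // The reduction grows linearly with the sample's magnitude.
        int abs = Math.abs(value);
        double factor = maxReduce * abs / (double) rawMax;

        return (short) Math.round((1 - factor) * value);
    }

    public static void main(String[] args) {
        // Assumed peak of a sine wave with amplitude 10000, reduced to 8000.
        short rawMax = 10000, targetMax = 8000;
        for (short v : new short[] {-10000, -5000, -100, 0, 100, 5000, 10000}) {
            System.out.println(v + " -> " + normalize(v, rawMax, targetMax));
        }
    }
}
```

With these inputs the peak of 10000 comes out as 8000, 5000 comes out as 4500, and 100 stays at 100, so the curve remains continuous through zero. Note that this mapping only attenuates: small samples pass through almost unchanged rather than being made louder.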
