How can I draw sound data from my wav file?

Question

First off, this is for homework or... a project.

I'm having trouble understanding how to draw the sound data as a waveform on a graph in Java for a project. I have to build this assignment entirely from scratch, UI and all, so essentially I'm making a .wav file editor. The main issue I'm having is getting the sound data into the graph to be drawn. Currently I'm just drawing a randomly generated array of values.

So far I have a mini-program running that validates that the wav file actually is a wav file.

I'm reading it in with a FileInputStream and validating: the RIFF bytes (0-3), the file length (4-7), the WAVE bytes (8-11), and then the format chunk (starting from the end of the RIFF chunk and positioning the index there): the format ID (0-3), the length of the format chunk (4-7), and then the next 16 bytes for all the specifications of the wave file, storing those in appropriately named variables.
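
(For reference, that header parsing might look roughly like the sketch below. It assumes a canonical 44-byte PCM header with the "fmt " chunk immediately after the RIFF/WAVE IDs; the class and variable names are made up for illustration.)

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Sketch: read the RIFF/WAVE header and the "fmt " chunk, assuming a
// canonical PCM wav layout (RIFF + WAVE at offset 0, "fmt " right after).
public class WavHeaderReader {
    public static void main(String[] args) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            byte[] header = new byte[44];          // canonical PCM header is 44 bytes
            in.readFully(header);
            ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);

            String riff = readId(buf);             // bytes 0-3:  "RIFF"
            int fileLength = buf.getInt();         // bytes 4-7:  file length minus 8
            String wave = readId(buf);             // bytes 8-11: "WAVE"
            String fmtId = readId(buf);            // format chunk ID: "fmt "
            int fmtLength = buf.getInt();          // format chunk length (16 for PCM)
            short audioFormat = buf.getShort();    // 1 = uncompressed PCM
            short channels = buf.getShort();
            int sampleRate = buf.getInt();
            int byteRate = buf.getInt();
            short blockAlign = buf.getShort();
            short bitsPerSample = buf.getShort();

            if (!riff.equals("RIFF") || !wave.equals("WAVE") || !fmtId.equals("fmt ")) {
                throw new IOException("Not a canonical wav file");
            }
            System.out.printf("%d ch, %d Hz, %d-bit, fmt chunk %d bytes%n",
                    channels, sampleRate, bitsPerSample, fmtLength);
        }
    }

    // Read four ASCII bytes as a chunk ID.
    private static String readId(ByteBuffer buf) {
        byte[] id = new byte[4];
        buf.get(id);
        return new String(id, StandardCharsets.US_ASCII);
    }
}
```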

Once I get to the DATA chunk ID and its length, everything past that is my sound data, and that is what I'm unsure about: how to store the sound data byte for byte, or even translate it into a value related to the amplitude of the sound. I thought it would be similar to the validation, so it would work the same way, but it doesn't seem to be that way... Either that, or I've been overcomplicating something super simple, since I've been staring at this for a few days now.
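
(For reference, a hedged sketch of turning the data chunk's raw bytes into amplitude values, assuming 16-bit little-endian PCM where every two bytes form one signed sample; the method name is illustrative.)

```java
// Sketch: convert raw PCM bytes (16-bit, little-endian) into signed
// amplitude samples in the range [-32768, 32767].
static short[] bytesToSamples(byte[] data) {
    short[] samples = new short[data.length / 2];
    for (int i = 0; i < samples.length; i++) {
        int lo = data[2 * i] & 0xFF;      // low byte, treated as unsigned
        int hi = data[2 * i + 1];         // high byte carries the sign
        samples[i] = (short) ((hi << 8) | lo);
    }
    return samples;
}
```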

Any help is appreciated.

Answer

I'm not a Java programmer, but I know a fair bit about rendering audio so hopefully the following might be of some help...

Given that you will almost always have a much larger number of samples than available pixels, the sensible thing to do would be to draw from a cached reduction or 'summary' of the sample data. This is typically how audio editors (such as Audacity) render audio data. In fact the most common strategy is to compute the number of samples per pixel, then find the maximum and minimum samples for each block of size SamplesPerPixel, then draw a vertical line between each max-min pair. You might want to cache this reduction, or perhaps a series of such reductions for different zoom levels. Audacity caches to temporary files ('block files') on disk.
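
(A rough sketch of that max-min reduction in Java; the class, record, and method names are my own, not from the answer, and the record syntax assumes Java 16+.)

```java
// Sketch: reduce raw samples to one (min, max) pair per horizontal pixel;
// each pair is then drawn as a vertical line on the canvas.
public class WaveformReducer {

    public record MinMax(short min, short max) {}

    public static MinMax[] reduceToPixels(short[] samples, int canvasWidth) {
        int samplesPerPixel = Math.max(1, samples.length / canvasWidth);
        MinMax[] pairs = new MinMax[canvasWidth];
        for (int px = 0; px < canvasWidth; px++) {
            int start = px * samplesPerPixel;
            int end = Math.min(samples.length, start + samplesPerPixel);
            short min = Short.MAX_VALUE, max = Short.MIN_VALUE;
            for (int i = start; i < end; i++) {
                if (samples[i] < min) min = samples[i];
                if (samples[i] > max) max = samples[i];
            }
            pairs[px] = (start < end) ? new MinMax(min, max)
                                      : new MinMax((short) 0, (short) 0);
        }
        return pairs;
    }
}
```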

The above is perhaps something of an oversimplification, however, because in reality you will want to compute the initial max-min pairs from a chunk of fixed size - say 256 samples - rather than from one of size SamplesPerPixel. Then you can compute further 'on the fly' reductions from that cached reduction. The point is that SamplesPerPixel will typically be a dynamic quantity - since the user might resize the canvas at any time (hope that makes sense...).
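(Building on the previous sketch, the second-stage reduction might look roughly like this: the cache holds one min-max pair per fixed 256-sample block, and at draw time those pairs are collapsed again into one pair per pixel for whatever canvas width the user has resized to. The method name is my own; it reuses the MinMax record from the sketch above.)

```java
// Sketch: combine cached fixed-size (say 256-sample) min-max pairs into
// one pair per pixel at the current zoom, without re-reading raw samples.
public static MinMax[] reduceCache(MinMax[] cache, int canvasWidth) {
    int pairsPerPixel = Math.max(1, cache.length / canvasWidth);
    MinMax[] pixels = new MinMax[canvasWidth];
    for (int px = 0; px < canvasWidth; px++) {
        int start = px * pairsPerPixel;
        int end = Math.min(cache.length, start + pairsPerPixel);
        short min = Short.MAX_VALUE, max = Short.MIN_VALUE;
        for (int i = start; i < end; i++) {
            if (cache[i].min() < min) min = cache[i].min();
            if (cache[i].max() > max) max = cache[i].max();
        }
        pixels[px] = (start < end) ? new MinMax(min, max)
                                   : new MinMax((short) 0, (short) 0);
    }
    return pixels;
}
```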

Also remember that when you are drawing to your canvas you will need to scale the sample values by the width and height of the canvas. The best way to do this (in the vertical direction, at least) is to normalize the samples, then multiply by the canvas height. 16-bit audio consists of samples in the range [-32768, 32767], so to normalize just do a floating-point division by 32768. Then reverse the sign (to flip the waveform to the canvas coordinates), add 1 (to compensate for the negative values) and multiply by half the canvas height. That's how I do it, anyway.
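
(In Java that scaling step might look roughly like this, assuming 16-bit samples and a canvasHeight variable of my own naming.)

```java
// Sketch: map a signed 16-bit sample to a y coordinate on the canvas.
// Normalize to [-1, 1), flip the sign (canvas y grows downward),
// shift into [0, 2], then scale by half the canvas height.
static int sampleToY(short sample, int canvasHeight) {
    double normalized = sample / 32768.0;   // [-1, 1)
    double flipped = -normalized + 1.0;     // [0, 2]
    return (int) Math.round(flipped * canvasHeight / 2.0);
}
```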

This page shows how to build a rudimentary waveform display with Java Swing. I haven't looked at it in detail, but I think it just downsamples the data rather than computing max-min pairs. This will, of course, not provide as accurate a reduction as the max-min method, but it's easier to calculate.

If you want to know how to do things properly you should dig into the Audacity source code (be warned, however - it's fairly gnarly C++). To get a general overview you might look at 'A Fast Data Structure for Disk-Based Audio Editing', by the original author of Audacity, Dominic Mazzoni. You will need to purchase that from CMJ, however.
