AVAudioRecorder doesn't write out proper WAV File Header

Problem description

I'm working on a project on the iPhone where I'm recording audio from the device mic using AVAudioRecorder, and then will be manipulating the recording.

To ensure that I'm reading in the samples from the file correctly, I'm using python's wave module to see if it returns the same samples.

However, python's wave module returns "fmt chunk and/or data chunk missing" when trying to open the wav file that is saved by AVAudioRecorder.

These are the settings I am using to record the file:

[audioSettings setObject:[NSNumber numberWithInt:kAudioFormatLinearPCM] forKey:AVFormatIDKey];
[audioSettings setObject:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
[audioSettings setObject:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
[audioSettings setObject:[NSNumber numberWithFloat:4096] forKey:AVSampleRateKey];
[audioSettings setObject:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey];
[audioSettings setObject:[NSNumber numberWithBool:YES] forKey:AVLinearPCMIsNonInterleaved];
[audioSettings setObject:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey]; 

After that, I'm just making a call to recordForDuration to actually do the recording.

The recording succeeds: I can play the file, and I can read in the samples using AudioFile services, but I can't validate it because I can't open the file with Python's wave module.

This is what the first 128 bytes of the file look like:

1215N:~/Downloads$ od -c --read-bytes 128 testFile.wav
0000000   R   I   F   F   x   H 001  \0   W   A   V   E   f   m   t    
0000020 020  \0  \0  \0 001  \0 001  \0   @ 037  \0  \0 200   >  \0  \0
0000040 002  \0 020  \0   F   L   L   R 314 017  \0  \0  \0  \0  \0  \0
0000060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0000200

Any idea what I need to do to make sure a correct WAV header is written out by AVAudioRecorder?

Recommended answer

Apple software often creates WAVE files with a non-standard (but "spec" conformant) "FLLR" subchunk after the "fmt " subchunk and before the "data" subchunk. I assume "FLLR" stands for "filler", and I assume the purpose of the subchunk is to enable some sort of data alignment optimization. The subchunk is usually about 4000 bytes long, but its actual length can vary depending on the length of the data preceding it.
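
To make this concrete, here is a minimal Python sketch that decodes the first 44 bytes shown in the od dump from the question. The byte values are transcribed by hand from that dump, and the variable names are only illustrative.

```python
import struct

# First 44 bytes of testFile.wav, transcribed by hand from the od dump
# in the question (od prints non-printable bytes in octal).
header = (
    b"RIFF" + bytes([0x78, 0x48, 0x01, 0x00]) + b"WAVE"
    + b"fmt " + bytes([0x10, 0x00, 0x00, 0x00])
    + bytes([0x01, 0x00, 0x01, 0x00,      # format tag, channel count
             0x40, 0x1F, 0x00, 0x00,      # sample rate
             0x80, 0x3E, 0x00, 0x00,      # byte rate
             0x02, 0x00, 0x10, 0x00])     # block align, bits per sample
    + b"FLLR" + bytes([0xCC, 0x0F, 0x00, 0x00])
)

# RIFF header: identifier, little-endian size, form type.
riff_id, riff_size, form = struct.unpack_from("<4sI4s", header, 0)

# "fmt " subchunk header, then its 16-byte PCM payload.
fmt_id, fmt_size = struct.unpack_from("<4sI", header, 12)
(fmt_tag, channels, sample_rate,
 byte_rate, block_align, bits) = struct.unpack_from("<HHIIHH", header, 20)

# Whatever subchunk comes after "fmt ".
next_id, next_size = struct.unpack_from("<4sI", header, 20 + fmt_size)

# ASSUMPTION: "data" follows "FLLR" directly (the dump stops before that point).
data_payload_offset = 20 + fmt_size + 8 + next_size + 8

print(fmt_tag, channels, sample_rate, bits)    # 1 1 8000 16
print(next_id, next_size)                      # b'FLLR' 4044
print(data_payload_offset)                     # 4096
```

The "fmt " subchunk itself looks well-formed (16-bit mono PCM), and the subchunk after it is "FLLR" with a 4044-byte payload. If "data" does follow it directly, the audio would begin at byte 4096, which fits the alignment guess above. The header also reports 8000 Hz rather than the 4096 passed via AVSampleRateKey.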

Adding arbitrary subchunks to WAVE files is generally considered spec-conformant because WAVE is built on the RIFF container format, and the common practice in RIFF file processing is to ignore chunks and subchunks which have an unrecognized identifier. The identifier "FLLR" is "non-standard" and so should be ignored by any software which encounters it.

There is a fair amount of software out there that treats the WAVE format much more rigidly than it ought to, and I suspect the library you're using may be one of those pieces of software. For example, I have seen software that assumes that the audio bytes always begin at offset 44 -- this is an incorrect assumption.

In fact, finding the audio bytes in a WAVE file must be done by finding the location and size of the "data" subchunk within the RIFF; this is the correct way to locate the audio bytes within a WAVE file.

Reading WAVE files properly must really begin as an exercise in locating and identifying RIFF subchunks. RIFF subchunks have an 8-byte header: 4 bytes for an identifier/name field which is traditionally filled with human-readable ASCII characters (e.g. "fmt "), and a 4-byte little-endian unsigned integer specifying the number of bytes in the subchunk's data payload -- the subchunk's data payload follows immediately after its 8-byte header.
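
The 8-byte header layout described above can be sketched in Python roughly as follows. The helper name read_chunk_header is made up for illustration, and the sample bytes are synthetic, not taken from a real recording.

```python
import io
import struct

def read_chunk_header(stream):
    """Read one 8-byte RIFF subchunk header.

    Returns (chunk_id, payload_size), or None if fewer than 8 bytes remain.
    """
    raw = stream.read(8)
    if len(raw) < 8:
        return None
    # "<4sI": 4 raw identifier bytes, then a little-endian unsigned 32-bit size.
    chunk_id, payload_size = struct.unpack("<4sI", raw)
    return chunk_id, payload_size

# Synthetic example: a "fmt " header announcing a 16-byte payload.
sample = io.BytesIO(b"fmt " + struct.pack("<I", 16) + bytes(16))
chunk_id, payload_size = read_chunk_header(sample)
payload = sample.read(payload_size)   # the payload starts right after the header
```

Note that the identifier is compared as raw bytes, so the trailing space in b"fmt " matters.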

The WAVE file format reserves certain subchunk identifiers (or "names") as being meaningful to the WAVE format. There are a minimum of two subchunks that must always appear in every WAVE file:


  1. "fmt " - the subchunk with this identifier has a payload which describes the basic information about the audio's format: sample rate, bit depth, etc.
  2. "data" - the subchunk with this identifier has the actual audio bytes in its payload

"fact" is the next most common subchunk identifier. It is usually found in WAVE files that use a compressed codec, such as μ-law. See this enthusiast webpage for more information about some of the various subchunk identifiers in use today in the wild, and information about their payload structure.

From a purely RIFF perspective, subchunks need not appear in any particular order in the file, or at any particular fixed offset. In practice however, almost all software expects the "fmt " subchunk to be the first subchunk. This is a concession to practicality: it is convenient to know early in the data stream what format of audio the WAVE contains -- this makes it easier to play a wave file from a network stream, for example. If the WAVE file uses a compressed format, such as μ-law, it is usually assumed that the "fact" subchunk will appear directly after "fmt ".

After the format-specifying chunks are out of the way, assumptions about the location, ordering, and naming of subchunks should be abandoned. At this point, the software should locate expected subchunks by name only (e.g. "data"). If subchunks are encountered that have unrecognized names (e.g. "FLLR"), those subchunks should simply be skipped over and ignored. Skipping a subchunk requires reading its length so that you can skip over the correct number of bytes.
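
A minimal Python sketch of that skip-by-length scan might look like this. The names find_chunk and make_chunk are hypothetical helpers, and the in-memory file is synthetic: it only imitates the "fmt " / "FLLR" / "data" layout discussed here. The sketch also honors the RIFF rule that an odd-sized payload is followed by one pad byte.

```python
import io
import struct

def find_chunk(stream, wanted):
    """Return (payload_offset, payload_size) of the first subchunk named
    `wanted`, skipping every other subchunk; return None if it is absent."""
    head = stream.read(12)
    if len(head) < 12 or head[:4] != b"RIFF" or head[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        raw = stream.read(8)
        if len(raw) < 8:
            return None                      # ran out of subchunks
        chunk_id, size = struct.unpack("<4sI", raw)
        if chunk_id == wanted:
            return stream.tell(), size
        # Unrecognized (or simply unwanted) subchunk: skip its payload,
        # plus the pad byte that follows an odd-sized payload.
        stream.seek(size + (size & 1), io.SEEK_CUR)

def make_chunk(chunk_id, payload):
    """Build one subchunk for the synthetic test file below."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

body = (
    b"WAVE"
    + make_chunk(b"fmt ", struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16))
    + make_chunk(b"FLLR", bytes(4044))       # filler, as Apple software writes
    + make_chunk(b"data", bytes(100))
)
wav = io.BytesIO(b"RIFF" + struct.pack("<I", len(body)) + body)

data_offset, data_size = find_chunk(wav, b"data")
print(data_offset, data_size)                # 4096 100
```

The same function should work on a real file opened in binary mode, e.g. open("testFile.wav", "rb"), though that has not been checked against an actual AVAudioRecorder recording here.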

What Apple has done with the "FLLR" subchunk is slightly unusual, and I'm not surprised that some software is tripped up by it. I suspect that the library you are using is simply unprepared to deal with the presence of the "FLLR" subchunk. I would consider this a defect in the library. The mistake the library authors have made is probably something like:


  1. They may be expecting the "data" subchunk to appear within the first N bytes of the beginning of the file, where N is something less than ~4kB. They may give up looking if they have to scan too far into the file. The Apple "FLLR" subchunk pushes the "data" subchunk to a position >~4kB into the file.

  2. They may be expecting the "data" subchunk to have a specific ordinal subchunk position or byte offset within the RIFF. Perhaps they expect "data" to appear immediately after "fmt ". This is an incorrect way to process a RIFF file, though. The ordinal position and/or offset position of the "data" subchunk should not be assumed.

As long as we're talking about correct WAVE file processing, I might as well remind everyone that the audio bytes (the "data" subchunk's payload) may not run exactly to the end of the file. It is allowable to insert subchunks after the data payload. Some programs use this to store a textual "comment" field at the end of the file. If you read blindly from the start of the data payload until the EOF, you may pull in some metadata subchunks as audio, which will sound like a "click" at the end of playback. You need to honor the length field of the "data" subchunk and stop reading audio once you've consumed the entire data payload, rather than stopping only when you hit EOF.
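
The difference between the two approaches can be sketched in Python as follows. The function name read_audio_bytes is hypothetical, and the in-memory file is synthetic: four 16-bit samples followed by a made-up trailing "LIST" metadata subchunk.

```python
import io
import struct

def read_audio_bytes(stream):
    """Return only the payload of the "data" subchunk, ignoring whatever
    subchunks come after it."""
    head = stream.read(12)
    if len(head) < 12 or head[:4] != b"RIFF" or head[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        raw = stream.read(8)
        if len(raw) < 8:
            raise ValueError('no "data" subchunk found')
        chunk_id, size = struct.unpack("<4sI", raw)
        if chunk_id == b"data":
            return stream.read(size)         # stop at the declared length, not at EOF
        stream.seek(size + (size & 1), io.SEEK_CUR)   # skip payload (+ pad byte)

# Synthetic file: "fmt ", then "data" with 4 samples, then trailing metadata.
audio = struct.pack("<4h", 100, -100, 200, -200)
trailer = b"LIST" + struct.pack("<I", 4) + b"INFO"
fmt = b"fmt " + struct.pack("<IHHIIHH", 16, 1, 1, 8000, 16000, 2, 16)
body = b"WAVE" + fmt + b"data" + struct.pack("<I", len(audio)) + audio + trailer
wav = io.BytesIO(b"RIFF" + struct.pack("<I", len(body)) + body)

samples = struct.unpack("<4h", read_audio_bytes(wav))

# The naive approach: everything from the start of the data payload to EOF.
naive = body[body.index(b"data") + 8:]
print(len(audio), len(naive))                # 8 20
```

Here the naive read returns 12 extra bytes, which a player would interpret as six bogus samples at the end of the recording.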
