What are chunks, samples and frames when using pyaudio

Views: 513  Posted: 2020/9/13 21:15:30  Tags: python, python-2.7, audio, sampling, pyaudio

Problem description

After going through the documentation of pyaudio and reading some other articles on the web, I am not sure whether my understanding is correct.

This is the code for audio recording found on pyaudio's site:

import pyaudio
import wave

CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"

p = pyaudio.PyAudio()

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("* recording")

frames = []

for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)

print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

If I add these lines, I am able to play back whatever I recorded:

play=pyaudio.PyAudio()
stream_play=play.open(format=FORMAT,
                      channels=CHANNELS,
                      rate=RATE,
                      output=True)
for data in frames: 
    stream_play.write(data)
stream_play.stop_stream()
stream_play.close()
play.terminate()

"RATE" is the number of samples collected per second.
"CHUNK" is the number of frames in the buffer.
Each frame has 2 samples, since "CHANNELS = 2".
The size of each sample is 2 bytes, as given by the function pyaudio.get_sample_size(pyaudio.paInt16).
Therefore, the size of each frame is 4 bytes.
In the "frames" list, the size of each element must be 1024*4 bytes; for example, frames[0] must be 4096 bytes. However, sys.getsizeof(frames[0]) returns 4133, while len(frames[0]) returns 4096.
The for loop executes int(RATE / CHUNK * RECORD_SECONDS) times, and I can't understand why. Here is the same question answered by "Ruben Sanchez", but I can't be sure if it's correct, as he says CHUNK=bytes. According to his explanation, it would have to be int(RATE / (CHUNK*2) * RECORD_SECONDS), as (CHUNK*2) is the number of samples read into the buffer on each iteration.
Finally, when I write print frames[0], it prints gibberish, because it tries to treat the string as ASCII-encoded text, which it is not; it is just a stream of bytes. So how do I print this stream of bytes in hexadecimal using the struct module? And if I later replace each hexadecimal value with values of my choice, will it still produce a playable sound?
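
The size and loop-count arithmetic in the points above can be checked without any audio hardware. The following is only a minimal sketch, not code from the pyaudio documentation. It assumes that RATE counts frames per second (one sample per channel in each frame), hard-codes SAMPLE_WIDTH = 2 as the value that get_sample_size is said to return for paInt16, and uses a zero-filled byte string (the hypothetical name fake_chunk) as a stand-in for what stream.read(CHUNK) would return.

```python
import sys

CHUNK = 1024          # frames requested per read
CHANNELS = 2          # samples per frame
RATE = 44100          # assumed to mean frames per second
RECORD_SECONDS = 5
SAMPLE_WIDTH = 2      # bytes per paInt16 sample (assumed, not queried from pyaudio)

# One frame holds one sample per channel.
bytes_per_frame = CHANNELS * SAMPLE_WIDTH
bytes_per_chunk = CHUNK * bytes_per_frame

# Each read consumes CHUNK frames, so the number of reads is
# (frames per second / frames per read) * seconds, rounded down.
iterations = int(RATE / CHUNK * RECORD_SECONDS)
frames_read = iterations * CHUNK
frames_wanted = RATE * RECORD_SECONDS

# Stand-in for the bytes one stream.read(CHUNK) call would return.
fake_chunk = b"\x00" * bytes_per_chunk
payload_len = len(fake_chunk)            # audio bytes only
object_size = sys.getsizeof(fake_chunk)  # audio bytes plus Python object overhead

print("bytes per frame: %d" % bytes_per_frame)
print("bytes per chunk: %d" % bytes_per_chunk)
print("loop iterations: %d" % iterations)
print("frames read: %d of %d" % (frames_read, frames_wanted))
print("len: %d, getsizeof: %d" % (payload_len, object_size))
```

Under these assumptions the loop count depends on frames rather than on bytes or individual samples, so no extra factor of 2 appears in it. The 215 reads cover 220160 of the 220500 frames in 5 seconds, so the recording comes out very slightly short. The extra bytes reported by sys.getsizeof belong to the Python object itself, not to the audio data, which is why len gives the expected 4096.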

Everything I wrote above is my own understanding, and much of it may be wrong.
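
For the last point, one possible way to look at a chunk in hexadecimal and to change its values is sketched below. This is an illustration under stated assumptions, not a confirmed answer: it assumes the bytes are little-endian signed 16-bit samples (hence the "<" and "h" in the struct format), that the two channels are interleaved as left, right, left, right, and it builds a tiny hypothetical chunk by hand instead of reading from a real stream.

```python
import struct

SAMPLE_WIDTH = 2  # bytes per paInt16 sample (assumed)

# Hypothetical stand-in for frames[0]: two stereo frames, i.e. four samples.
chunk = struct.pack("<4h", 1000, -1000, 32767, -32768)

# Raw bytes as hexadecimal, one pair of digits per byte.
hex_text = " ".join("{:02x}".format(b) for b in bytearray(chunk))
print(hex_text)

# Decode the bytes into signed 16-bit integers with struct.
count = len(chunk) // SAMPLE_WIDTH
fmt = "<{}h".format(count)
samples = struct.unpack(fmt, chunk)
print(samples)

# Assuming interleaved channels, even indices are left and odd are right.
left = samples[0::2]
right = samples[1::2]

# Replace the values with ones of our choice (here: half the amplitude)
# and pack them back into a byte string of the same length.
quieter = [s // 2 for s in samples]
modified = struct.pack(fmt, *quieter)
print(len(modified) == len(chunk))
```

If the new values stay within the signed 16-bit range and the byte count remains a multiple of the frame size, the repacked bytes should be accepted by stream_play.write in the same way as an original chunk, although how it sounds depends entirely on the values chosen.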
