Splitting an Ogg Opus File Stream


Question

I am trying to send an OGG_OPUS encoded stream to Google's Speech-to-Text streaming service. Because Google imposes a time limit on streaming requests, I have to route the audio stream to a new Speech-to-Text streaming session at a fixed interval.

From what I've read, the pages in an Ogg stream cannot be read independently, since the data in each page is calculated by taking the previous and next pages into account. If that is the case, can we cut the stream at a certain point and create a brand-new stream from the remaining data? Simply stopping at some point and sending the rest in a new stream doesn't work, because the initial Ogg header packets are not available in the second stream.

I know this issue can be solved with PCM data: since it is not encoded, a PCM stream can be split at any point and turned into a new stream. I cannot use PCM because of its high bitrate, and I have no need for lossless quality since I'm transferring a voice data stream.

Reference: https://tools.ietf.org/html/rfc7845#section-3

Recommended answer

OpusFileSplitter can split Opus audio files.

Ogg pages can be read independently as long as the file starts with the Beginning of Stream (BOS) header page and the comment page. You can split one Ogg file into multiple files by creating new files that each start with the Ogg header pages, followed by Ogg data/audio pages. For example, this Ogg Opus file:

*********************************************************
*          *              *              *              *
*  Header  *  Audio Data  *  Audio Data  *  Audio Data  *
*   Page   *    Page 1    *    Page 2    *    Page 3    *
*          *              *              *              *
*********************************************************

can be split into two files:

***************************
*          *              *
*  Header  *  Audio Data  *
*   Page   *    Page 1    *
*          *              *
***************************

******************************************
*          *              *              *
*  Header  *  Audio Data  *  Audio Data  *
*   Page   *    Page 2    *    Page 3    *
*          *              *              *
******************************************
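
The layout above can be sketched in a few lines of Python. This is a minimal illustration, not the OpusFileSplitter implementation: the names `iter_pages` and `split_ogg_opus` are hypothetical, and the sketch assumes a single logical stream whose header consists of the `OpusHead` page followed by the `OpusTags` page(s), as described in RFC 7845. It copies pages unchanged, so page sequence numbers and granule positions in the later files are not rewritten; whether a given decoder or service tolerates that is an assumption to verify.

```python
from typing import Iterator, List


def iter_pages(data: bytes) -> Iterator[bytes]:
    """Yield each complete Ogg page found back to back in data."""
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError(f"no Ogg capture pattern at offset {pos}")
        n_segments = data[pos + 26]
        table = data[pos + 27:pos + 27 + n_segments]
        end = pos + 27 + n_segments + sum(table)
        if len(table) < n_segments or end > len(data):
            break  # truncated trailing page, ignore it
        yield data[pos:end]
        pos = end


def split_ogg_opus(data: bytes, audio_pages_per_chunk: int) -> List[bytes]:
    """Return byte strings that each hold the header pages plus some audio pages."""
    if audio_pages_per_chunk < 1:
        raise ValueError("audio_pages_per_chunk must be at least 1")
    pages = list(iter_pages(data))
    if not pages or not pages[0][27 + pages[0][26]:].startswith(b"OpusHead"):
        raise ValueError("stream does not start with an OpusHead page")

    # The OpusTags packet may span several pages; it ends on the first page
    # whose last lacing value is below 255.
    header_end = 1
    while header_end < len(pages):
        page = pages[header_end]
        header_end += 1
        n_segments = page[26]
        last_lacing = page[26 + n_segments] if n_segments else 0
        if last_lacing != 255:
            break

    headers = b"".join(pages[:header_end])
    audio = pages[header_end:]
    return [
        headers + b"".join(audio[i:i + audio_pages_per_chunk])
        for i in range(0, len(audio), audio_pages_per_chunk)
    ]
```

Each returned byte string can be written to its own file or sent as the start of a new streaming session. Splitting by a fixed page count is only for illustration; a time-based split would look at the granule position of each page instead.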

You're correct that audio segments can be split and span multiple pages. I'm assuming that a few milliseconds could be lost if a page contains an incomplete audio segment, but that should not disrupt speech recognition. Unfortunately, my local tests only used Opus files generated by the opusenc utility, which did not split segments across pages, so I could not test that case; on the other hand, that behavior is convenient for splitting files.
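
One way to avoid cutting in the middle of an audio segment (a packet, in Ogg terms) is to check the page header before starting a new file at a given page. The Ogg page header stores a flags byte at offset 5, and bit 0x01 marks a page whose first packet is continued from the previous page. The helper below is a hypothetical sketch of that check, not part of OpusFileSplitter:

```python
CONTINUED_PACKET = 0x01  # header_type flag: first packet continues from the previous page


def is_clean_split_point(page: bytes) -> bool:
    """Return True if a new file can begin at this page without a partial packet."""
    if len(page) < 27 or page[:4] != b"OggS":
        raise ValueError("not an Ogg page")
    return not (page[5] & CONTINUED_PACKET)
```

If the function returns False for a candidate page, moving the split to the next page for which it returns True keeps every packet whole.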

OpusFileSplitter.scanPages() shows how to find the page boundaries.
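
For readers who want to see the idea without opening that project, here is a rough Python sketch of page scanning based on the Ogg page header layout (a 27-byte fixed header beginning with `OggS`, then a segment table whose values sum to the payload size). The names `OggPage` and `scan_pages` are illustrative and are not taken from OpusFileSplitter; the sketch assumes the input starts on a page boundary and does not verify checksums.

```python
import struct
from typing import List, NamedTuple


class OggPage(NamedTuple):
    offset: int            # byte offset of the page within the input
    length: int            # total page size: header + segment table + payload
    header_type: int       # flags: 0x01 continued, 0x02 BOS, 0x04 EOS
    granule_position: int
    serial: int            # bitstream serial number
    sequence: int          # page sequence number


def scan_pages(data: bytes) -> List[OggPage]:
    """Walk the buffer page by page and record where each page starts and ends."""
    pages = []
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError(f"no Ogg capture pattern at offset {pos}")
        # Fields after the capture pattern: version, flags, granule position,
        # serial number, sequence number, CRC, segment count (little-endian).
        _version, header_type, granule, serial, sequence, _crc, n_segments = (
            struct.unpack_from("<BBqIIIB", data, pos + 4)
        )
        table = data[pos + 27:pos + 27 + n_segments]
        length = 27 + n_segments + sum(table)
        if len(table) < n_segments or pos + length > len(data):
            break  # incomplete page at the end of the buffer
        pages.append(OggPage(pos, length, header_type, granule, serial, sequence))
        pos += length
    return pages
```

The `offset` and `length` of each entry give the exact byte range to copy when assembling a new file, and `granule_position` can be used to pick a split point by time rather than by page count.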
