How to convert the Float32Array format of native HTML5 recorded audio to proper bytes for the Google Speech-to-Text service?

Views: 576 Posted: 2019/6/8 21:59:26 Tags: javascript python websocket audio-streaming google-speech-api


Question

If you follow this tutorial: https://medium.com/ideas-at-igenius/delivering-a-smooth-cross-browser-speech-to-text-experience-b1e1f1f194a2 you end up creating a script processor and adding a listener to it:

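// bufferSize, in_channels and out_channels are defined elsewhere in the tutorial code
scriptProcessor = inputPoint.context.createScriptProcessor(bufferSize, in_channels, out_channels)
//...
// streamAudioData is invoked for every 'audioprocess' event
scriptProcessor.addEventListener('audioprocess', streamAudioData)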

Inside the callback, calling callback_param.inputBuffer.getChannelData(0) returns a JavaScript Float32Array which, judging by the data, contains floating-point samples in the range -1.0 to +1.0.

Streaming this data as-is to the backend, which in turn streams it to the Google Speech-To-Text service, returns nothing (as expected).

The Google Speech-To-Text service, at least in Python, expects streaming input as a byte string in WAV format containing audio at the sample rate that was specified (i.e. 16000 Hz). Note that streaming a file to it from the backend works fine.
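One thing worth checking here is whether the browser audio is really at 16000 Hz; a browser AudioContext commonly runs at the device rate (often 44100 or 48000 Hz), though that is an assumption about the setup and not something the question states. The sketch below shows a naive linear-interpolation resampler on the backend. The function name `resample_linear` is hypothetical, and it applies no anti-aliasing filter, so it only illustrates the idea.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a sequence of float samples by linear interpolation.

    Hypothetical helper for illustration; no low-pass filtering is applied,
    so downsampling this way can introduce aliasing.
    """
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    ratio = src_rate / dst_rate  # source samples consumed per output sample
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * ratio
        left = min(int(pos), last)    # sample at or before the position
        right = min(left + 1, last)   # next sample, clamped at the end
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

For example, `resample_linear(chunk, 48000, 16000)` would shrink a 4800-sample chunk to 1600 samples.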

This conversion has failed: Float32Array -> Int16Array -> byte string
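For reference, here is a minimal sketch of that conversion done on the backend in Python instead of in the browser. It assumes the websocket delivers the raw Float32Array buffer as little-endian 32-bit floats (the byte order of virtually all current hardware, but still an assumption) and that the target is 16-bit signed little-endian PCM, which is what LINEAR16 denotes. The function name `float32_bytes_to_linear16` is hypothetical.

```python
import struct


def float32_bytes_to_linear16(payload: bytes) -> bytes:
    """Convert raw little-endian float32 samples to 16-bit signed PCM bytes.

    Assumes samples are nominally in [-1.0, 1.0]; values outside that
    range are clamped before scaling.
    """
    count = len(payload) // 4  # ignore any trailing partial sample
    floats = struct.unpack("<%df" % count, payload[: count * 4])
    ints = []
    for sample in floats:
        clamped = max(-1.0, min(1.0, sample))
        ints.append(int(clamped * 32767))  # scale to the int16 range
    return struct.pack("<%dh" % count, *ints)
```

Each 4-byte float becomes a 2-byte integer, so the output is half the size of the input.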

Has anyone found the appropriate conversions for the above to work?

Alternatively, are you aware of a simpler, more robust path for: microphone in browser -> stream data via websocket to backend server -> stream data to the Google Speech-To-Text service -> get responses as expected?

Edit: adding the Python code for the RecognitionConfig of the Google Speech API

config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code=self.language_code)
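Since streaming a file reportedly works, one way to debug the live path is to wrap the converted PCM bytes in a WAV container and listen to them, or feed them through the file path that already works. The sketch below uses only the standard-library `wave` module; the function name `pcm16_to_wav_bytes` and its defaults (16000 Hz, mono, matching the config above) are assumptions for illustration.

```python
import io
import wave


def pcm16_to_wav_bytes(pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw 16-bit signed little-endian PCM in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)           # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

If the resulting file sounds too fast, too slow, or like noise, the sample rate or the sample format is likely not what the config declares.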
