1006 Error code in IBM Watson speech-to-text API


Question

I'm using Ratchet to connect to IBM Watson websockets. It always seems to work fine for smaller files (I've tested up to a 66-minute, 23 MB mp3 file), but it always fails for larger files (such as a 2-hour, 56 MB mp3).

Here are my logs:

[2019-03-17 21:43:23] local.DEBUG: \Ratchet\Client\connect bf4e38983775f6e53b392666138b5a3a50e9c9c8
[2019-03-17 21:43:24] local.DEBUG: startWatsonStream options = {"content-type":"audio/mpeg","timestamps":true,"speaker_labels":true,"smart_formatting":true,"inactivity_timeout":-1,"interim_results":false,"max_alternatives":1,"word_confidence":false,"action":"start"}  
[2019-03-17 21:43:24] local.DEBUG: Split audio into this many frames: 570222  
[2019-03-17 21:43:42] local.DEBUG: send action stop  
[2019-03-17 21:43:42] local.DEBUG: Received: {
   "state": "listening"
}  
[2019-03-17 21:43:42] local.DEBUG: Received first 'listening' message.  
[2019-03-17 22:56:31] local.DEBUG: Connection closed (1006 - Underlying connection closed)  

Notice the 1h13m between receiving the first 'listening' message and the connection closing with an error.

Watson says: "1006 indicates that the connection closed abnormally."

https://www.rfc-editor.org/rfc/rfc6455 says:

1006 is a reserved value and MUST NOT be set as a status code in a Close control frame by an endpoint. It is designated for use in applications expecting a status code to indicate that the connection was closed abnormally, e.g., without sending or receiving a Close control frame.

What part of my code can I adjust so that it can handle longer mp3 files without triggering a 1006 error?

\Ratchet\Client\connect($url, [], $headers)->then(function(\Ratchet\Client\WebSocket $conn) use($contentType, $audioFileContents, $callback) {
    $conn->on('message', function($msg) use ($conn, $callback) {
        $this->handleIncomingWebSocketMessage($msg, $conn, $callback);
    });
    $conn->on('close', function($code = null, $reason = null) {
        Log::debug("Connection closed ({$code} - {$reason})");
    });
    $this->startWatsonStream($conn, $contentType);
    $this->sendBinaryMessage($conn, $audioFileContents); 
    Log::debug('send action stop');
    $conn->send(json_encode(['action' => 'stop']));
}, function (\Exception $e) {
    Log::error("Could not connect: {$e->getMessage()} " . $e->getTraceAsString());
});

...

public function handleIncomingWebSocketMessage($msg, $conn, $callback) {
    Log::debug("Received: " . str_limit($msg, 100));
    $msgArray = json_decode($msg, true);
    $state = $msgArray['state'] ?? null;
    if ($state == 'listening') {
        if ($this->listening) {//then this is the 2nd time listening, which means audio processing has finished and has already been sent by server and received by this client.
            Log::debug("FINAL RESPONSE: " . str_limit($this->responseJson, 500));
            $conn->close(\Ratchet\RFC6455\Messaging\Frame::CLOSE_NORMAL, 'Finished.');
            $callback($this->responseJson);
        } else {
            $this->listening = true;
            Log::debug("Received first 'listening' message.");
        }
    } else {
        $this->responseJson = $msg;
    }
}

public function sendBinaryMessage($conn, $fileContents) {
    $chunkSizeInBytes = 100; //probably just needs to be <= 4 MB according to Watson's rules
    $chunks = str_split($fileContents, $chunkSizeInBytes);
    Log::debug('Split audio into this many frames: ' . count($chunks));
    $final = true;
    foreach ($chunks as $key => $chunk) {
        $frame = new \Ratchet\RFC6455\Messaging\Frame($chunk, $final, \Ratchet\RFC6455\Messaging\Frame::OP_BINARY);
        $conn->send($frame);
    }

}
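
For a sense of scale, the log above reports 570222 frames at the 100-byte chunk size. The following is a minimal sketch, not a confirmed fix for the 1006 close, of how the chunk size drives the frame count. `countFrames` is a hypothetical helper, the payload size is inferred from the logged frame count, and the 4 MB ceiling mentioned in the code comment above is taken from that comment rather than verified here.

```php
<?php
// Hypothetical helper: how many chunks str_split() yields for a non-empty
// payload of $totalBytes when split into pieces of $chunkSizeInBytes.
function countFrames(int $totalBytes, int $chunkSizeInBytes): int
{
    if ($chunkSizeInBytes < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1 byte.');
    }
    // The last chunk may be shorter than the rest, so round up.
    return intdiv($totalBytes + $chunkSizeInBytes - 1, $chunkSizeInBytes);
}

// Assumed size for illustration: 570222 frames of 100 bytes each.
$totalBytes = 57022200;

echo countFrames($totalBytes, 100), PHP_EOL;         // prints 570222
echo countFrames($totalBytes, 1024 * 1024), PHP_EOL; // prints 55
```

With 1 MiB chunks, which stay under the 4 MB figure from the comment, the same audio would go out in 55 frames instead of several hundred thousand.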

Recommended answer

As a general recommendation, file-based recognition, especially for files bigger than a few MB, should be done using the Watson /recognitions API (more details here: https://cloud.ibm.com/apidocs/speech-to-text), which is asynchronous. You do not need to keep a connection open for a few hours; that is not good practice, since you could run into a read timeout, lose network connectivity, etc. By doing it asynchronously, you POST the file and the connection ends; you can then GET the status every X minutes or be notified via a callback, whichever works better for you.

curl -X POST -u "apikey:{apikey}" --header "Content-Type: audio/flac" --data-binary @audio-file.flac "https://stream.watsonplatform.net/speech-to-text/api/v1/recognitions?callback_url=http://{user_callback_path}/job_results&user_token=job25&timestamps=true"
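
The polling half of that flow could look roughly like the sketch below. It assumes the job `id` returned by the POST above, a `GET /v1/recognitions/{id}` status endpoint, and the status values `waiting`, `processing`, `completed` and `failed`; all of these should be checked against the linked API reference. `fetchJob`, `isJobFinished` and `waitForJob` are hypothetical helper names, and `fetchJob` needs PHP's curl extension.

```php
<?php
// Base URL copied from the curl example above.
const STT_BASE = 'https://stream.watsonplatform.net/speech-to-text/api';

// Fetch the current state of one asynchronous recognition job.
function fetchJob(string $apiKey, string $jobId): array
{
    $ch = curl_init(STT_BASE . '/v1/recognitions/' . rawurlencode($jobId));
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERPWD        => 'apikey:' . $apiKey,
        CURLOPT_TIMEOUT        => 30,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Status request failed: ' . curl_error($ch));
    }
    $job = json_decode($body, true);
    if (!is_array($job)) {
        throw new RuntimeException('Unexpected response body.');
    }
    return $job;
}

// A job is finished once it has either completed or failed (assumed values).
function isJobFinished(array $job): bool
{
    return in_array($job['status'] ?? null, ['completed', 'failed'], true);
}

// Call $fetch every $intervalSeconds until the job finishes or attempts run out.
function waitForJob(callable $fetch, int $intervalSeconds = 60, int $maxAttempts = 240): array
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $job = $fetch();
        if (isJobFinished($job)) {
            return $job;
        }
        sleep($intervalSeconds);
    }
    throw new RuntimeException('Job did not finish in time.');
}
```

A call would then be something like `waitForJob(function () use ($apiKey, $jobId) { return fetchJob($apiKey, $jobId); })`. Each status check is a short HTTP request, so no connection has to stay open while the audio is processed.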

By the way, is your websockets client using ping-pong frames to keep the connection alive? I noticed that you do not request interim results ({"content-type":"audio/mpeg","timestamps":true,"speaker_labels":true,"smart_formatting":true,"inactivity_timeout":-1,"interim_results":false,"max_alternatives":1,"word_confidence":false,"action":"start"}); that is another way to keep a connection open, but it is less reliable. Please check the ping-pong frames.
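
If the client turns out not to send pings on its own, a periodic ping could be wired up roughly as follows. This is a sketch under assumptions: `startPingTimer` is a hypothetical helper, the 30-second interval is arbitrary, and whether pings alone prevent the 1006 close in this case is not confirmed. It relies on the `Frame` class from ratchet/rfc6455 (already used in the question) and on a ReactPHP event loop.

```php
<?php
use Ratchet\RFC6455\Messaging\Frame;

// Sends an empty ping frame every $intervalSeconds and stops when the
// connection closes. $loop is expected to be a React\EventLoop\LoopInterface
// and $conn a Ratchet\Client\WebSocket; the type hints are left off so the
// sketch can be tried with stand-in objects.
function startPingTimer($loop, $conn, float $intervalSeconds = 30.0)
{
    $timer = $loop->addPeriodicTimer($intervalSeconds, function () use ($conn) {
        // Empty payload, final frame, ping opcode.
        $conn->send(new Frame('', true, Frame::OP_PING));
    });

    // Stop pinging once the socket is gone.
    $conn->on('close', function () use ($loop, $timer) {
        $loop->cancelTimer($timer);
    });

    return $timer;
}
```

It would be called inside the `then` callback, with `$loop` being the same loop instance passed as the fourth argument to `\Ratchet\Client\connect`.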
