Javax Websocket closing due to Illegal UTF-8 Sequence


Problem description


I'm writing a Websocket client in Java, using javax.websocket API, and org.glassfish.tyrus as the implementation.

Everything usually works, but sometimes, when I'm receiving very large strings, the connection closes with a mysterious 'Illegal UTF-8 Sequence' as the close reason.

log.info("Ws closed cuz: " 
   + reason.getCloseCode() + " , " 
   + reason.getReasonPhrase() + " , " 
   + reason.toString());

Output:

INFO: Ws closed cuz: NOT_CONSISTENT , Illegal UTF-8 Sequence ,
CloseReason[1007,Illegal UTF-8 Sequence]

I'm guessing that either the string was too large, or the string contained any characters which aren't UTF-8 compatible.

Is there a way to get more information about the actual string / packet / frame that causes this issue? Or is there a way to tell Tyrus to ignore any encoding issues, pass me the raw string, and let me handle it?

If not, is there another Java WebSocket client that does only the bare-bones work of transmitting strings over the socket, performs no validation, and lets me handle the responses?

Appreciate any feedback.

Solution

The following is just a guess.

(1) On the server side, the large string is split into one text frame and one or more following continuation frames. Technically, the original large string is converted into a byte array and then the byte array is split into multiple sub byte arrays. The sub arrays are set to frames one by one (= Each frame contains one sub byte array).

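(2) There is no guarantee that each sub byte array is a valid UTF-8 sequence on its own, because a multi-byte character can be cut in two at a frame boundary. Nevertheless, a validity check may be performed on each frame separately, either on the server side or on the client side. If so, it's a bug of Tyrus.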
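
To make point (2) concrete, here is a minimal sketch in plain Java. It is not Tyrus code, and the class and method names (`SplitUtf8Demo`, `isValidUtf8`) are made up for illustration. It shows that a valid UTF-8 byte array can be split into two pieces that each fail a strict per-piece check:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Tyrus or nv-websocket-client.
final class SplitUtf8Demo {

    // Strict check: false if the bytes are malformed or end mid-character.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "a", EURO SIGN, "b" encodes to the bytes 61 E2 82 AC 62.
        byte[] whole = "a\u20ACb".getBytes(StandardCharsets.UTF_8);

        // Split in the middle of the three-byte euro sign.
        byte[] first = Arrays.copyOfRange(whole, 0, 2);             // 61 E2
        byte[] second = Arrays.copyOfRange(whole, 2, whole.length); // 82 AC 62

        System.out.println(isValidUtf8(whole));  // true
        System.out.println(isValidUtf8(first));  // false: truncated character
        System.out.println(isValidUtf8(second)); // false: starts with a continuation byte
    }
}
```

If a receiver validated each fragment like this instead of the reassembled message, it would reject a perfectly valid message, which matches the symptom described in the question.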

The WebSocketListener interface of nv-websocket-client has frame-level callback methods such as onFrame, onTextFrame, onContinuationFrame and others (note that onTextMessage and onTextFrame are different), so you can examine the byte array of each frame there.

WebSocket websocket = new WebSocketFactory()
    .createSocket("ws://...")
    .addListener(new WebSocketAdapter() {
        @Override
        public void onFrame(WebSocket ws, WebSocketFrame frame) {
            // If the frame is a text frame with FIN bit cleared, or
            // if the frame is a continuation frame.
            if ((frame.isTextFrame() && !frame.getFin()) ||
                frame.isContinuationFrame()) {
                // The payload of the frame. There is no guarantee
                // that this byte array is a valid UTF-8 sequence.
                byte[] payload = frame.getPayload();

                // Check whether the payload is a valid UTF-8 sequence
                // if you want to.
                checkPayload(payload);
            }
        }
    })
    .connect();

Why don't you use nv-websocket-client to examine what is happening in your WebSocket connection?
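
The `checkPayload` call above is left undefined. One possible way to fill it in, as a sketch only, is a small stateful validator that carries an incomplete trailing character over to the next frame, so that a character split across frames is not reported as an error. The class `FragmentedUtf8Validator` and its `feed` method are hypothetical names, not part of nv-websocket-client or Tyrus; only standard `java.nio.charset` classes are used.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: validates one fragmented text message frame by frame.
final class FragmentedUtf8Validator {

    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);

    // Bytes of a character that was cut off at the end of the previous frame.
    private byte[] carry = new byte[0];

    // Feeds one frame payload. 'fin' is true for the last frame of the message.
    // Returns false as soon as the message seen so far cannot be valid UTF-8.
    boolean feed(byte[] payload, boolean fin) {
        ByteBuffer in = ByteBuffer.allocate(carry.length + payload.length);
        in.put(carry).put(payload).flip();

        // UTF-8 never decodes to more chars than it has bytes.
        CharBuffer out = CharBuffer.allocate(in.remaining() + 1);

        // With fin == false an incomplete trailing character is left unread;
        // with fin == true it is reported as malformed input.
        CoderResult result = decoder.decode(in, out, fin);
        if (!result.isError() && fin) {
            result = decoder.flush(out);
        }

        if (result.isError() || fin) {
            // Message finished or rejected: start clean for the next message.
            decoder.reset();
            carry = new byte[0];
            return !result.isError();
        }

        // Keep the unread tail for the next frame.
        carry = new byte[in.remaining()];
        in.get(carry);
        return true;
    }
}
```

Inside `onFrame`, this could be called as `validator.feed(frame.getPayload(), frame.getFin())` for the text frame and each continuation frame. Logging the first frame for which it returns false would show whether the server really sends invalid UTF-8, or whether the data is only invalid when each fragment is checked in isolation.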
