Empty buffer, but IdTCPClient.IOHandler.InputBufferIsEmpty is False


Question

I have a problem with the code below, which uses TIdTCPClient to read the buffer from a Telnet server:

procedure TForm2.ReadTimerTimer(Sender: TObject);
var
  S: String;
begin
  if IdTCPClient.IOHandler.InputBufferIsEmpty then
  begin
    IdTCPClient.IOHandler.CheckForDataOnSource(10);
    if IdTCPClient.IOHandler.InputBufferIsEmpty then Exit;
  end;
  S := IdTCPClient.IOHandler.InputBufferAsString(TEncoding.UTF8);
  CheckText(S);
end;

This procedure runs every 1000 milliseconds, and CheckText is called whenever the buffer holds a value.

The code works, but sometimes it passes an empty string to CheckText.

What is going wrong?

Thanks

Recommended answer

Your code reads arbitrary blocks of data from the InputBuffer and expects each block to be a complete, valid string. It does this without ANY consideration for what kind of data you are receiving. That is a recipe for disaster on multiple levels.

You are connected to a Telnet server, but you are using TIdTCPClient directly instead of TIdTelnet, so you MUST manually decode any Telnet sequences that are received BEFORE you process the remaining string data. Look at the source code for TIdTelnet: a lot of decoding logic runs before the OnDataAvailable event is fired. All Telnet sequence data is handled internally, and the OnDataAvailable event then provides whatever non-Telnet data is left over after decoding.

Once Telnet decoding is taken care of, another problem to watch out for is that TEncoding.UTF8 only handles properly encoded, COMPLETE UTF-8 sequences. If it encounters a badly encoded sequence or, more importantly, an incomplete sequence, THE ENTIRE DECODE FAILS and it returns a blank string. This has already been reported as a bug (see QC #79042).

CheckForDataOnSource() stores whatever raw bytes are in the socket at that moment into the InputBuffer. InputBufferAsString() extracts whatever raw bytes are in the InputBuffer at that moment and attempts to decode them using the specified encoding. It is quite likely that the raw bytes in the InputBuffer when you call InputBufferAsString() do not always end on a COMPLETE UTF-8 sequence. Sometimes the last sequence in the InputBuffer is still waiting for bytes to arrive on the socket, and those bytes will not be read until the next call to CheckForDataOnSource(). That would explain why your CheckText() function receives blank strings when using TEncoding.UTF8.
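To make the split-sequence case concrete, here is a minimal console sketch that does not depend on Indy. CompleteUtf8Length is a hypothetical helper (it is not part of Indy or the RTL). It inspects only the tail of a byte chunk and reports how many leading bytes end on a complete UTF-8 sequence. It assumes the rest of the chunk is well formed and does not validate it.

```pascal
program Utf8SplitDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: number of leading bytes in Buf that end on a
// complete UTF-8 sequence. Only the tail is inspected; malformed input
// is passed through unchanged for the decoder to deal with.
function CompleteUtf8Length(const Buf: TBytes): Integer;
var
  I, Need: Integer;
  Lead: Byte;
begin
  Result := Length(Buf);
  I := Result - 1;
  // Step back over trailing continuation bytes (10xxxxxx), at most three.
  while (I >= 0) and (I > Result - 4) and ((Buf[I] and $C0) = $80) do
    Dec(I);
  if I < 0 then
    Exit; // empty, or nothing but continuation bytes
  Lead := Buf[I];
  // Work out how many bytes the last sequence needs from its lead byte.
  if (Lead and $E0) = $C0 then
    Need := 2
  else if (Lead and $F0) = $E0 then
    Need := 3
  else if (Lead and $F8) = $F0 then
    Need := 4
  else
    Need := 1; // ASCII, or a byte that cannot start a sequence
  // If the last sequence is still short of bytes, stop just before it.
  if Result - I < Need then
    Result := I;
end;

var
  Chunk: TBytes;
begin
  // 'A' followed by the first two bytes of U+20AC (E2 82 AC):
  // the read ended in the middle of the character.
  SetLength(Chunk, 3);
  Chunk[0] := $41; Chunk[1] := $E2; Chunk[2] := $82;
  Writeln(CompleteUtf8Length(Chunk)); // prints 1

  // The missing byte has arrived, so the whole chunk is decodable.
  SetLength(Chunk, 4);
  Chunk[3] := $AC;
  Writeln(CompleteUtf8Length(Chunk)); // prints 4
end.
```

In the first case only the leading 'A' is safe to decode. The two trailing bytes have to wait for the next read, which is exactly the situation InputBufferAsString() cannot handle.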

You should use IndyUTF8Encoding() instead (Indy implements its own UTF-8 encoder/decoder to avoid the decoding bug in TEncoding.UTF8). At the very least, you will no longer get blank strings. However, you can still lose data when a UTF-8 sequence spans multiple CheckForDataOnSource() calls, because incomplete UTF-8 sequences are converted to ? characters. For that reason alone, you should not be using InputBufferAsString() in this situation (even if TEncoding.UTF8 did work properly). To handle this properly, you should either:

1) Scan through the InputBuffer manually, calculate how many bytes constitute COMPLETE UTF-8 sequences only, and then pass that count to InputBuffer.Extract() or TIdIOHandler.ReadString(). Any leftover bytes remain in the InputBuffer for the next time. For that to work, you have to get rid of the first InputBufferIsEmpty() call and call CheckForDataOnSource() unconditionally, so that you always check for more bytes even if you already have some.

2) Use TIdIOHandler.ReadChar() instead and get rid of the calls to InputBufferIsEmpty() and CheckForDataOnSource() altogether. The downside is that you will lose data if a UTF-8 sequence decodes into a UTF-16 surrogate pair. ReadChar() can decode surrogates, but it cannot return the second character in the pair (I have started working on new ReadChar() overloads for a future release of Indy that return String instead of Char, so that full surrogate pairs can be returned).
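Option 1 could look roughly like the timer handler below. This is a sketch under assumptions, not a drop-in fix. It assumes IdBuffer and IdGlobal are in the unit's uses clause and that TIdBuffer.Size and TIdBuffer.PeekByte() are available in your Indy version. It also ignores Telnet sequences entirely, which still have to be decoded first as described above. The tail scan does not validate the rest of the buffer. IndyUTF8Encoding is used because the answer names it; newer Indy releases may expose the same encoder under a different name, such as IndyTextEncoding_UTF8.

```pascal
// Assumes: uses ..., IdGlobal, IdBuffer;
procedure TForm2.ReadTimerTimer(Sender: TObject);
var
  Buf: TIdBuffer;
  Count, I, Need: Integer;
  Lead: Byte;
  S: String;
begin
  // Always poll the socket, even when bytes are already buffered, so a
  // sequence left incomplete last time can be finished.
  IdTCPClient.IOHandler.CheckForDataOnSource(10);

  Buf := IdTCPClient.IOHandler.InputBuffer;
  Count := Buf.Size;
  if Count = 0 then Exit;

  // Step back over trailing continuation bytes (10xxxxxx), at most three.
  I := Count - 1;
  while (I >= 0) and (I > Count - 4) and ((Buf.PeekByte(I) and $C0) = $80) do
    Dec(I);

  if I >= 0 then
  begin
    Lead := Buf.PeekByte(I);
    if (Lead and $E0) = $C0 then Need := 2
    else if (Lead and $F0) = $E0 then Need := 3
    else if (Lead and $F8) = $F0 then Need := 4
    else Need := 1; // ASCII, or a byte that cannot start a sequence
    // Leave an unfinished trailing sequence in the buffer for next time.
    if Count - I < Need then Count := I;
  end;
  if Count = 0 then Exit;

  // Extract and decode only the bytes that end on a complete sequence.
  S := IdTCPClient.IOHandler.ReadString(Count, IndyUTF8Encoding);
  CheckText(S);
end;
```

Peeking at the tail instead of extracting everything means the unfinished bytes never leave the InputBuffer, so nothing has to be stitched back together on the next timer tick.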
