Empty buffer, but IdTCPClient.IOHandler.InputBufferIsEmpty is false
Question
I have a problem with the code below, which uses IdTCPClient to read the buffer from a Telnet server:
procedure TForm2.ReadTimerTimer(Sender: TObject);
var
  S: String;
begin
  if IdTCPClient.IOHandler.InputBufferIsEmpty then
  begin
    IdTCPClient.IOHandler.CheckForDataOnSource(10);
    if IdTCPClient.IOHandler.InputBufferIsEmpty then Exit;
  end;
  S := IdTCPClient.IOHandler.InputBufferAsString(TEncoding.UTF8);
  CheckText(S);
end;
This procedure runs every 1000 milliseconds, and CheckText is called whenever the buffer contains data.
The code works, but sometimes it passes an empty string to CheckText.
What is the problem?
Thanks.
Recommended answer
Your code is attempting to read arbitrary blocks of data from the InputBuffer and expects them to be complete and valid strings. It is doing this without ANY consideration for what kind of data you are receiving. That is a recipe for disaster on multiple levels.
You are connected to a Telnet server, but you are using TIdTCPClient directly instead of using TIdTelnet, so you MUST manually decode any Telnet sequences that are received BEFORE you can process any remaining string data. Look at the source code for TIdTelnet. A lot of decoding logic takes place before the OnDataAvailable event is fired. All Telnet sequence data is handled internally, and the OnDataAvailable event then provides whatever non-Telnet data is left over after decoding.
Once you have Telnet decoding taken care of, another problem you have to watch out for is that TEncoding.UTF8 only handles properly encoded COMPLETE UTF-8 sequences. If it encounters a badly encoded sequence or, more importantly, an incomplete sequence, THE ENTIRE DECODE FAILS and it returns a blank string. This has already been reported as a bug (see QC #79042).
CheckForDataOnSource() stores whatever raw bytes are in the socket at that moment into the InputBuffer. InputBufferAsString() extracts whatever raw bytes are in the InputBuffer at that moment and attempts to decode them using the specified encoding. It is quite likely that the raw bytes in the InputBuffer when you call InputBufferAsString() do not always contain COMPLETE UTF-8 sequences. Chances are that sometimes the last sequence in the InputBuffer is still waiting for bytes to arrive in the socket, and they will not be read until the next call to CheckForDataOnSource(). That would explain why your CheckText() function receives blank strings when using TEncoding.UTF8.
You should use IndyUTF8Encoding() instead (Indy implements its own UTF-8 encoder/decoder to avoid the decoding bug in TEncoding.UTF8). At the very least, you will not get blank strings anymore. However, you can still lose data when a UTF-8 sequence spans multiple CheckForDataOnSource() calls (incomplete UTF-8 sequences will be converted to ? characters). For that reason alone, you should not be using InputBufferAsString() in this situation (even if TEncoding.UTF8 did work properly). To handle this properly, you should either:
1) Scan through the InputBuffer manually, calculating how many bytes constitute COMPLETE UTF-8 sequences only, and then pass that count to InputBuffer.Extract() or TIdIOHandler.ReadString(). Any leftover bytes will remain in the InputBuffer for the next time. For that to work, you will have to get rid of the first InputBufferIsEmpty() call and just call CheckForDataOnSource() unconditionally, so that you are always checking for more bytes even if you already have some.
2) Use TIdIOHandler.ReadChar() instead and get rid of the calls to InputBufferIsEmpty() and CheckForDataOnSource() altogether. The downside is that you will lose data if a UTF-8 sequence decodes into a UTF-16 surrogate pair. ReadChar() can decode surrogates, but it cannot return the second character in the pair (I have started working on new ReadChar() overloads for a future release of Indy that return String instead of Char, so that full surrogate pairs can be returned).
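The byte counting described in option 1 can be sketched as a small standalone function. This is only an illustration, not code from Indy: the name CompleteUtf8PrefixLength is hypothetical, and the function only inspects the tail of the data to decide whether the last sequence is still incomplete. It does not validate the rest of the bytes, which is left to the decoder.

```pascal
program Utf8PrefixDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: returns how many leading bytes of Data can be decoded
// without cutting the final UTF-8 sequence in half. A trailing sequence that
// is still missing bytes is excluded from the count.
function CompleteUtf8PrefixLength(const Data: TBytes): Integer;
var
  Len, I, Needed: Integer;
  B: Byte;
begin
  Len := Length(Data);
  Result := Len;
  // Step back over at most 3 trailing continuation bytes (10xxxxxx).
  I := Len - 1;
  while (I >= 0) and (Len - I <= 3) and ((Data[I] and $C0) = $80) do
    Dec(I);
  if I < 0 then
    Exit; // empty input, or nothing but continuation bytes
  B := Data[I];
  // Work out how long the final sequence should be from its lead byte.
  if (B and $80) = 0 then
    Needed := 1
  else if (B and $E0) = $C0 then
    Needed := 2
  else if (B and $F0) = $E0 then
    Needed := 3
  else if (B and $F8) = $F0 then
    Needed := 4
  else
    Exit; // not a valid lead byte: leave the data for the decoder to handle
  // If the final sequence is short, stop just before its lead byte.
  if Len - I < Needed then
    Result := I;
end;

begin
  // 'A' followed by the first two bytes of the 3-byte Euro sign (E2 82 AC)
  Writeln(CompleteUtf8PrefixLength(TBytes.Create($41, $E2, $82)));      // 1
  // The same data once the last byte has arrived
  Writeln(CompleteUtf8PrefixLength(TBytes.Create($41, $E2, $82, $AC))); // 4
end.
```

In the timer handler, the idea would be to call CheckForDataOnSource() unconditionally, look at the bytes currently in the InputBuffer, and pass the returned count to TIdIOHandler.ReadString() together with IndyUTF8Encoding(). How you peek at the buffered bytes depends on your Indy version, so treat that wiring as an assumption to check against your copy of the source.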