Reading a text file as bytes (byte by byte) using Delphi 2010


Question

I would like to read a UTF-8 text file byte by byte and get the ASCII (numeric) value of each byte in the file. Can this be done? If so, what is the best method?

My goal is then to replace the 2-byte combinations that I find with a single byte (these are set conditions that I have prepared).

For example, if I find a 197 followed by a 158 (decimal representations), I will replace the pair with a single byte 17.

I don't want to use the standard Delphi I/O operations:

AssignFile
Reset
Rewrite
ReadLn
WriteLn
CloseFile

Is there a better method? Can this be done using TStream (Reader & Writer)?

Here is an example test I am using. I know there is a character (code point 350, encoded as two bytes) starting in column 84. When viewed in a hex editor, the character consists of the bytes 197 + 158, so I am trying to find the 197 using my Delphi code and can't seem to find it.

FS1 := TFileStream.Create(ParamStr1, fmOpenRead);
try
  FS1.Seek(0, soBeginning);
  FS1.Position := FS1.Position + 84;
  FS1.Read(B, SizeOf(B));
  if Ord(B) = 197 then ShowMessage('True') else ShowMessage('False');
finally
  FS1.Free;
end;


Recommended answer

My understanding is that you want to convert a text file from UTF-8 to ASCII. That's quite simple:

StringList.LoadFromFile(UTF8FileName, TEncoding.UTF8);
StringList.SaveToFile(ASCIIFileName, TEncoding.ASCII);
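
For completeness, here is a minimal sketch of those two calls wrapped in a small console program. The program name, the helper ConvertUtf8ToAscii, and the command-line handling are assumptions made for illustration; only the LoadFromFile and SaveToFile calls come from the snippet above.

```pascal
program Utf8ToAscii;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: loads InName as UTF-8 and rewrites it as ASCII.
procedure ConvertUtf8ToAscii(const InName, OutName: string);
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Decode the input bytes as UTF-8 into Delphi's native Unicode strings.
    Lines.LoadFromFile(InName, TEncoding.UTF8);
    // Re-encode as 7-bit ASCII; non-ASCII characters are not preserved.
    Lines.SaveToFile(OutName, TEncoding.ASCII);
  finally
    Lines.Free;
  end;
end;

begin
  if ParamCount = 2 then
    ConvertUtf8ToAscii(ParamStr(1), ParamStr(2))
  else
    Writeln('Usage: Utf8ToAscii <utf8-input> <ascii-output>');
end.
```

Note that TStringList works line by line, so line endings in the output may be normalized to the platform default rather than copied byte for byte.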

The runtime library comes with all sorts of functionality to convert between different text encodings. Surely you don't want to attempt to replicate this functionality yourself?

I trust you realise that this conversion is liable to lose data. Characters with an ordinal greater than 127 cannot be represented in ASCII. In fact, every code point that requires more than one octet in UTF-8 cannot be represented in ASCII.
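
If the fixed byte-pair mapping described in the question is really what is wanted, rather than a standard encoding conversion, the raw bytes can be processed directly with TFileStream. The following is only a sketch under stated assumptions: the function names ReplacePair, LoadBytes and SaveBytes are hypothetical, the whole file is assumed to fit in memory, and the 197, 158 to 17 mapping is the one example given in the question.

```pascal
program ReplaceBytePairs;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Returns a copy of Src in which every occurrence of the byte pair
// (First, Second) is replaced by the single byte Replacement.
function ReplacePair(const Src: TBytes; First, Second, Replacement: Byte): TBytes;
var
  I, OutLen: Integer;
begin
  SetLength(Result, Length(Src)); // the output is never longer than the input
  OutLen := 0;
  I := 0;
  while I < Length(Src) do
  begin
    if (Src[I] = First) and (I + 1 < Length(Src)) and (Src[I + 1] = Second) then
    begin
      Result[OutLen] := Replacement;
      Inc(I, 2); // skip both bytes of the matched pair
    end
    else
    begin
      Result[OutLen] := Src[I];
      Inc(I);
    end;
    Inc(OutLen);
  end;
  SetLength(Result, OutLen); // trim to the bytes actually written
end;

// Reads a whole file into a byte array (assumes it fits in memory).
function LoadBytes(const FileName: string): TBytes;
var
  FS: TFileStream;
begin
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(Result, Integer(FS.Size));
    if Length(Result) > 0 then
      FS.ReadBuffer(Result[0], Length(Result));
  finally
    FS.Free;
  end;
end;

// Writes a byte array to a new file, overwriting any existing one.
procedure SaveBytes(const FileName: string; const Data: TBytes);
var
  FS: TFileStream;
begin
  FS := TFileStream.Create(FileName, fmCreate);
  try
    if Length(Data) > 0 then
      FS.WriteBuffer(Data[0], Length(Data));
  finally
    FS.Free;
  end;
end;

begin
  if ParamCount = 2 then
    // 197, 158 -> 17 is the example mapping from the question.
    SaveBytes(ParamStr(2), ReplacePair(LoadBytes(ParamStr(1)), 197, 158, 17))
  else
    Writeln('Usage: ReplaceBytePairs <input-file> <output-file>');
end.
```

Working on a Byte array avoids any dependence on the size of Char, and further pairs could be handled by calling ReplacePair once per mapping. The result is a custom single-byte encoding, not ASCII, so whatever reads the output file has to know the same mapping.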
