Detecting 'text' file type (ANSI vs UTF-8)


Question

I wrote an application (a psychological testing exam) in Delphi (7) which creates a standard text file, i.e. the file is of type ANSI.

Someone has ported the program to run on the Internet, probably using Java, and the resulting text file is of type UTF-8.

The program which reads these results files will have to read both the files created by Delphi and the files created via the Internet.

Whilst I can convert the UTF-8 text to ANSI (using the cunningly named function UTF8ToANSI), how can I tell in advance which kind of file I have?

Seeing as I 'own' the file format, I suppose the easiest way to deal with this would be to place a marker within the file at a known position which will tell me the source of the program (Delphi/Internet), but this seems to be cheating.

Thanks in advance.

Solution

If the UTF-8 file begins with the UTF-8 byte-order mark (BOM), this is easy:

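function UTF8FileBOM(const FileName: string): boolean;
var
  txt: file;
  bytes: array[0..2] of byte;
  amt: integer;
  oldFileMode: byte;
begin
  // FileMode is a global variable, so remember it and restore it afterwards.
  oldFileMode := FileMode;
  FileMode := fmOpenRead;  // read-only access, so write-protected files work too
  try
    AssignFile(txt, FileName);
    Reset(txt, 1);  // untyped file with a record size of one byte
    try
      BlockRead(txt, bytes, 3, amt);
      // The UTF-8 BOM is the byte sequence EF BB BF; a shorter file cannot have one.
      result := (amt = 3) and (bytes[0] = $EF) and (bytes[1] = $BB) and (bytes[2] = $BF);
    finally
      CloseFile(txt);
    end;
  finally
    FileMode := oldFileMode;
  end;
end;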
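
With a BOM check available, the reading program can load either kind of file and convert only when needed. The following is a minimal sketch, not the original poster's code: it assumes Delphi 7 (where `string` is `AnsiString`), the helper names `DecodeResults` and `LoadResultsAsAnsi` are hypothetical, and it applies the same three-byte test to the in-memory buffer so the file is opened only once.

```pascal
// Needs Classes (TFileStream) and SysUtils (fmOpenRead, fmShareDenyWrite)
// in the uses clause. Assumes Delphi 7, where string = AnsiString.

// Converts raw file bytes to ANSI, treating them as UTF-8 only when
// they start with the BOM (EF BB BF). Hypothetical helper name.
function DecodeResults(const Raw: AnsiString): AnsiString;
begin
  if (Length(Raw) >= 3) and (Raw[1] = #$EF) and (Raw[2] = #$BB) and (Raw[3] = #$BF) then
    Result := Utf8ToAnsi(Copy(Raw, 4, MaxInt))  // drop the 3-byte BOM, then convert
  else
    Result := Raw;  // no BOM: assume a Delphi-written ANSI file
end;

// Reads the whole file into memory and decodes it. Hypothetical helper name.
function LoadResultsAsAnsi(const FileName: string): AnsiString;
var
  fs: TFileStream;
  raw: AnsiString;
begin
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(raw, Integer(fs.Size));
    if Length(raw) > 0 then
      fs.ReadBuffer(raw[1], Length(raw));
  finally
    fs.Free;
  end;
  Result := DecodeResults(raw);
end;
```

Keep in mind that converting to ANSI can only preserve characters that exist in the system code page.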

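Otherwise, it is much more difficult: without a BOM nothing in the file declares its encoding, so it has to be inferred from the bytes themselves.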
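
One common heuristic for the BOM-less case is to test whether the bytes form well-formed UTF-8 multi-byte sequences, since ANSI text with accented characters rarely does so by accident. The sketch below is an illustration under assumptions, not part of the original answer: the function name `LooksLikeUTF8` is hypothetical, and the check is simplified (it does not reject every overlong or surrogate encoding). It returns False for pure ASCII, which reads the same in both encodings and needs no conversion.

```pascal
program DetectUtf8Demo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Returns True if S contains at least one multi-byte sequence and every
// byte fits the basic UTF-8 pattern. Hypothetical helper, simplified check.
function LooksLikeUTF8(const S: AnsiString): Boolean;
var
  i, len, extra: Integer;
  b: Byte;
  sawMultiByte: Boolean;
begin
  Result := False;
  sawMultiByte := False;
  len := Length(S);
  i := 1;
  while i <= len do
  begin
    b := Ord(S[i]);
    if b < $80 then
      extra := 0  // plain ASCII byte
    else if (b >= $C2) and (b <= $DF) then
      extra := 1  // lead byte of a 2-byte sequence
    else if (b >= $E0) and (b <= $EF) then
      extra := 2  // lead byte of a 3-byte sequence
    else if (b >= $F0) and (b <= $F4) then
      extra := 3  // lead byte of a 4-byte sequence
    else
      Exit;       // stray continuation byte or invalid lead byte
    if i + extra > len then
      Exit;       // sequence is cut off at the end of the data
    Inc(i);
    while extra > 0 do
    begin
      if (Ord(S[i]) and $C0) <> $80 then
        Exit;     // continuation bytes must have the form 10xxxxxx
      Inc(i);
      Dec(extra);
      sawMultiByte := True;
    end;
  end;
  Result := sawMultiByte;
end;

begin
  WriteLn(LooksLikeUTF8(#$C3#$A9));  // the UTF-8 encoding of "e acute"
  WriteLn(LooksLikeUTF8(#$E9));      // the Windows-1252 encoding of "e acute"
  WriteLn(LooksLikeUTF8('abc'));     // pure ASCII, no conversion needed
end.
```

This remains a guess rather than a proof: some ANSI byte pairs are also valid UTF-8, so the explicit marker suggested in the question is the more reliable option when the file format is under your control.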
