The best solution to process huge text file data in Delphi 7


Problem description

I have a text file like this:

"01","AAA","AAAAA" 
"02","BBB","BBBBB","BBBBBBBB" 
"03","CCC" 
"04","DDD","DDDDD"


I want to load this text file's data into a temp table in a Sybase database, so I need to build a program that reads the text file line by line until EOF. If the file is small, reading line by line is fast. But if the file is very large (possibly more than 500 MB), reading line by line is too slow. I think the line-by-line method is not suitable for huge text files, so I need to find another way to load the data into the database. Any suggestions? Example code:

var
  myFile : TextFile;
  text   : string;

begin
  // Open the Test.txt file for reading
  AssignFile(myFile, 'Test.txt');
  Reset(myFile);

  // Read the file line by line and insert each row
  while not Eof(myFile) do
  begin
    ReadLn(myFile, text);
    TempTable.Append;
    TempTable.FieldByName('Field1').AsString := Copy(text, 2, 2);
    TempTable.FieldByName('Field2').AsString := Copy(text, 7, 3);
    TempTable.FieldByName('Field3').AsString := Copy(text, 13, 5);
    TempTable.FieldByName('Field4').AsString := Copy(text, 21, 8);
    TempTable.Post;
  end;

  // Close the file
  CloseFile(myFile);
end;


Recommended answer

Some general tips:


  • Ensure your TempTable is in memory, or use a fast database engine - take a look at SQLite3 or other options (like Firebird Embedded, NexusDB or ElevateDB) as possible database alternatives;
  • If you do not use a TTable but a true database, make sure you nest the inserts within a Transaction;
  • For a true database, check whether you can use an Array DML feature, which is much faster for inserting the large amount of data you describe into a remote database (like Sybase) - such Array DML is handled, for instance, by FireDAC, AFAIK;
  • The FieldByName('...') method is known to be very slow: use local TField variables instead;
  • When using a TextFile, assign a bigger temporary buffer;
  • If you are using a newer Unicode version of Delphi (2009+), TextFile is not the best option.
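The transaction tip above can be sketched as follows. This is only a sketch: `Database1` is a hypothetical BDE `TDatabase` component connected to the target database, and the loop body is the one from the question.

```delphi
// Sketch only: wrap the whole insert loop in a single transaction,
// so the database commits once instead of once per row.
Database1.StartTransaction;
try
  while not Eof(myFile) do
  begin
    ReadLn(myFile, text);
    TempTable.Append;
    // ... set the fields as in the question ...
    TempTable.Post;
  end;
  Database1.Commit;   // one commit for all rows
except
  Database1.Rollback; // undo everything on failure
  raise;
end;
```

For very large files you may also commit in batches (say, every 10,000 rows) to keep the transaction log from growing too large.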

So your code could be:

var
  myFile : TextFile;
  myFileBuffer: array[word] of byte;
  text   : string;
  Field1, Field2, Field3, Field4: TField;
begin

  // Set Field* local variables for speed within the main loop
  Field1 := TempTable.FieldByName('Field1');
  Field2 := TempTable.FieldByName('Field2');
  Field3 := TempTable.FieldByName('Field3');
  Field4 := TempTable.FieldByName('Field4');

  // Open the Test.txt file for reading
  AssignFile(myFile, 'Test.txt');
  SetTextBuf(myFile, myFileBuffer); // use a 64 KB read buffer
  Reset(myFile);

  // Read the file contents line by line
  while not Eof(myFile) do
  begin
    ReadLn(myFile, text);
    TempTable.Append;
    Field1.AsInteger := StrToInt(Copy(text, 2, 2));
    Field2.AsString := Copy(text, 7, 3);
    Field3.AsString := Copy(text, 13, 5);
    Field4.AsString := Copy(text, 21, 8);
    TempTable.Post;
  end;

  // Close the file
  CloseFile(myFile);
end;
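For the Unicode versions of Delphi (2009+) mentioned in the last tip, a buffered `TStreamReader` is usually a better fit than `TextFile`. A minimal sketch, assuming the input file is UTF-8 encoded (adjust the `TEncoding` to match your data):

```delphi
var
  Reader: TStreamReader; // Classes unit, Delphi 2009+
  Line: string;
begin
  // 64 KB read buffer; True = detect a BOM if one is present
  Reader := TStreamReader.Create('Test.txt', TEncoding.UTF8, True, 64 * 1024);
  try
    while not Reader.EndOfStream do
    begin
      Line := Reader.ReadLine;
      // parse Line and post it to the table as shown above
    end;
  finally
    Reader.Free;
  end;
end;
```

This does not apply to Delphi 7 itself, but is worth knowing if the code is ever migrated.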


You can achieve very high speed with embedded engines, with almost no size limit other than your storage. See for instance how fast we can add content to a SQLite3 database in our ORM: about 130,000 / 150,000 rows per second into a database file, including all ORM marshalling. I also found that SQLite3 generates much smaller database files than the alternatives. If you want fast retrieval of any field, do not forget to define INDEXes in your database, if possible after the insertion of the row data (for better speed). For SQLite3, there is already an ID/RowID integer primary key available, which I suppose maps to your first data field. This ID/RowID integer primary key is already indexed by SQLite3. By the way, our ORM now supports FireDAC / AnyDAC and its advanced Array DML feature.

