Fast Search to see if a String Exists in Large Files with Delphi


Question

I have a FindFile routine in my program that lists files, but if the "Containing Text" field is filled in, it should list only files containing that text.

If the "Containing Text" field is filled in, I search each file found for the text. My current method of doing that is:

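  var
    FileContents: TStringList;

  begin
    FileContents := TStringList.Create;  // the list must be created before use
    try
      FileContents.LoadFromFile(Filepath);  // reads the whole file into memory
      Found := Pos(TextToFind, FileContents.Text) > 0;
    finally
      FileContents.Free;
    end;
  end;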

The above code is simple, and it generally works okay. But it has two problems:

  1. It fails for very large files (e.g. 300 MB).

  2. I feel it could be faster. It isn't bad, but why wait 10 minutes searching through 1000 files if there might be a simple way to speed it up a bit?

I need this to work in Delphi 2009 and to search text files that may or may not be Unicode. It only needs to work for text files.

So how can I speed this search up and also make it work for very large files?

Bonus: I would also like to allow an "ignore case" option. That's a tougher one to make efficient. Any ideas?

Solution:

Well, mghie pointed out my earlier question, "How Can I Efficiently Read The First Few Lines of Many Files in Delphi", and as I answered, it was different and didn't provide the solution.

But he got me thinking that I had done this before, and I had. I built a block-reading routine for large files that breaks them into 32 MB blocks. I use it to read my program's input file, which can be huge. The routine works well and is fast. So step one is to do the same for the files I am searching.

So now the question was how to search efficiently within those blocks. Well, I did have a previous question on that topic, "Is There An Efficient Whole Word Search Function in Delphi?", and RRUZ pointed out the SearchBuf routine to me.

That solves the "bonus" as well, because SearchBuf has options that include whole-word search (soWholeWord, the answer to that question) and case-sensitive matching (soMatchCase, which can be left out to ignore case; the answer to the bonus).
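As a rough illustration of the SearchBuf idea, here is a minimal sketch. The helper name BlockContainsText and its IgnoreCase/WholeWord parameters are hypothetical stand-ins for the dialog's options, and the block is assumed to have already been decoded into a Delphi string; only SearchBuf and the soDown, soMatchCase and soWholeWord options come from StrUtils.

```pascal
program SearchBufDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

// Hypothetical helper: returns True if TextToFind occurs in Block.
// IgnoreCase and WholeWord stand in for the search dialog's check boxes.
function BlockContainsText(const Block, TextToFind: string;
  IgnoreCase, WholeWord: Boolean): Boolean;
var
  Options: TStringSearchOptions;
begin
  Result := False;
  if (Block = '') or (TextToFind = '') then
    Exit;
  Options := [soDown];                // search forward through the buffer
  if not IgnoreCase then
    Include(Options, soMatchCase);    // leaving soMatchCase out ignores case
  if WholeWord then
    Include(Options, soWholeWord);
  // SelStart = 0 and SelLength = 0 start the search at the first character.
  // SearchBuf returns nil when the text is not found.
  Result := SearchBuf(PChar(Block), Length(Block), 0, 0,
    TextToFind, Options) <> nil;
end;

begin
  // Same text searched with and without the ignore-case option.
  Writeln(BlockContainsText('The quick brown fox', 'QUICK', True, False));
  Writeln(BlockContainsText('The quick brown fox', 'QUICK', False, False));
end.
```

Because the case handling is just a set member, the same call serves both the plain search and the "ignore case" bonus.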

So I'm off and running. Thanks once again, SO community.

Recommended answer

This is a problem connected with your previous question, "How Can I Efficiently Read The First Few Lines of Many Files in Delphi", and the same answers apply. If you read the files in blocks rather than completely, large files won't pose a problem. There's also a big speed-up to be had for files containing the text, because you should cancel the search upon the first match. Currently you read the whole file even when the text to be found is in the first few lines.
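A minimal sketch of block-wise reading with an early exit could look like the following. The names FileContainsBytes and BlockSize are hypothetical, the 1 MB block size is an arbitrary assumption (the asker mentions 32 MB blocks), and the search is a plain case-sensitive byte comparison. Carrying the last Length(Pattern) - 1 bytes into the next block is an addition not spelled out in the answer; without it, a match straddling a block boundary would be missed.

```pascal
program ChunkedSearchDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  BlockSize = 1024 * 1024;  // assumed block size, chosen arbitrarily

// Hypothetical helper: True if the byte sequence Pattern occurs in the file.
function FileContainsBytes(const FileName: string;
  const Pattern: TBytes): Boolean;
var
  Stream: TFileStream;
  Buffer: TBytes;
  PatLen, Keep, BytesRead, Total, I: Integer;
begin
  Result := False;
  PatLen := Length(Pattern);
  if PatLen = 0 then
    Exit;
  // Room for one block plus the bytes carried over from the previous one.
  SetLength(Buffer, BlockSize + PatLen - 1);
  Keep := 0;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      BytesRead := Stream.Read(Buffer[Keep], BlockSize);
      Total := Keep + BytesRead;
      for I := 0 to Total - PatLen do
        if (Buffer[I] = Pattern[0]) and
           CompareMem(@Buffer[I], @Pattern[0], PatLen) then
        begin
          Result := True;  // stop at the first match; finally still runs
          Exit;
        end;
      // Carry the tail forward so a match spanning two blocks is still seen.
      Keep := PatLen - 1;
      if Keep > Total then
        Keep := Total;
      if Keep > 0 then
        Move(Buffer[Total - Keep], Buffer[0], Keep);
    until BytesRead = 0;
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount < 2 then
    Writeln('Usage: ChunkedSearchDemo <file> <text>')
  else
    // Assumes a UTF-8 or ANSI file; a UTF-16 file would need the pattern
    // encoded with TEncoding.Unicode instead.
    Writeln(FileContainsBytes(ParamStr(1),
      TEncoding.UTF8.GetBytes(ParamStr(2))));
end.
```

Memory use stays bounded by the block size no matter how large the file is, and a file with the text near its start is abandoned after the first block.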
