Searching for a string in an input stream

Question

I have a large binary file (many gigabytes, so loading it into memory is not an option) that I want to search for all occurrences of the string "icpf".

I tried using std::search for this, but just got bitten by the fact that std::search only works for forward iterators, not input iterators.

Does the standard library provide a fast alternative for this? Or do I need to hand-code the search (either reading in chunks at a time then std::search on those, or ignore everything until an 'i' and then manually check the next three characters)?

Recommended answer

Does the standard library provide a fast alternative for this?

Although the standard C++ library offers ways to search text streams, it does not offer comparable algorithms for binary streams.

Or do I need to hand-code the search (either reading in chunks at a time then std::search on those, or ignore everything until an 'i' and then manually check the next three characters)?

Coding the "skip and search" approach can be tricky, because it is easy to write a solution that skips matches. For example, if you are looking for "icpf" in a file containing "icpicpf", a simple program that processes one character at a time and starts over from scratch on a mismatch will fail to find the "icpf" suffix after discarding the "icpi" prefix.

If you are going to code this yourself, consider implementing the Knuth–Morris–Pratt algorithm. Many implementations are available online, and it works correctly on streams because it examines one character at a time and never goes back in the input.