Efficient String Lookup?
Problem description
I have a number of strings, containing wildcards (e.g. 'abc#e#' where #
is anything), which I want to match with a test string (e.g. 'abcdef').
What would be the best way for me to store my strings so lookup is as
fast as possible?
Chris S. wrote:
I have a number of strings, containing wildcards (e.g. 'abc#e#' where #
is anything), which I want to match with a test string (e.g. 'abcdef').
What would be the best way for me to store my strings so lookup is as
fast as possible?
If it were me, I would store them as compiled regular expressions.
See the re module documentation and use re.compile().
If you want a better solution, it might help if you supply a little more
information about your problem and why this solution is unsuitable
(maybe it is :]).
--
Michael Hoffman
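A minimal sketch of the compiled-regex suggestion above, not a definitive implementation. It assumes that each # stands for exactly one arbitrary character, which the question does not state explicitly. The names compile_wildcard and matching_patterns are hypothetical helpers introduced only for illustration.

```python
import re

def compile_wildcard(pattern, wildcard="#"):
    """Compile a wildcard pattern into a regex (hypothetical helper).

    Assumption: each wildcard stands for exactly one arbitrary character.
    """
    # Escape the literal pieces so regex metacharacters are taken literally
    parts = (re.escape(piece) for piece in pattern.split(wildcard))
    # "." matches one character; DOTALL lets it match a newline as well
    return re.compile(".".join(parts), re.DOTALL)

# Compile once, store, and reuse the compiled objects for every lookup
patterns = ["abc#e#", "a#c", "xyz"]
compiled = {p: compile_wildcard(p) for p in patterns}

def matching_patterns(text):
    """Return the stored patterns that match the whole of `text`."""
    return [p for p, rx in compiled.items() if rx.fullmatch(text)]
```

Each lookup still tries every stored pattern in turn, so this is linear in the number of patterns; the saving comes from not recompiling them.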
Chris S. wrote:
I have a number of strings, containing wildcards (e.g. 'abc#e#' where #
is anything), which I want to match with a test string (e.g. 'abcdef').
What would be the best way for me to store my strings so lookup is as
fast as possible?
As a compiled regular expression, I guess. You don't give much info here,
so maybe there is a better way, but to me it looks like a classic regexp
thing. Maybe if your wildcards are equivalent to .*, then using successive
string searches like this helps you:
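    pattern = 'abc#e#'.split('#')  # literal pieces between the wildcards
    s = 'abcdef'
    found = True
    pos = 0
    for p in pattern:
        # look for each piece at or after the end of the previous match
        h = s.find(p, pos)
        if h != -1:
            pos = h + len(p)
        else:
            found = False
            break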
That might be faster if the string.find operation uses something other than
simple brute-force linear searching, but I don't know enough about the
internals of Python's string implementation to give a definite answer
here.
But to be honest: I don't think regexps are easy to beat, unless your
use case is modeled in a way that lends itself to other approaches.
--
Regards,
Diez B. Roggisch
I have a number of strings, containing wildcards (e.g. 'abc#e#' where #
is anything), which I want to match with a test string (e.g. 'abcdef').
What would be the best way for me to store my strings so lookup is as
fast as possible?
Start with a Trie, and virtually merge branches as necessary.
- Josiah
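One possible reading of the trie suggestion above, sketched under stated assumptions rather than as the poster's own method. It assumes each # matches exactly one character, and it interprets "virtually merging branches" as following the literal branch and the wildcard branch in parallel during lookup. TrieNode and WildcardTrie are hypothetical names.

```python
WILDCARD = "#"  # assumption: matches exactly one arbitrary character

class TrieNode:
    def __init__(self):
        self.children = {}  # character (or WILDCARD) -> TrieNode
        self.patterns = []  # patterns that end at this node

class WildcardTrie:
    def __init__(self, patterns=()):
        self.root = TrieNode()
        for p in patterns:
            self.add(p)

    def add(self, pattern):
        """Insert a pattern, one node per character."""
        node = self.root
        for ch in pattern:
            node = node.children.setdefault(ch, TrieNode())
        node.patterns.append(pattern)

    def lookup(self, text):
        """Return all stored patterns that match the whole of `text`."""
        frontier = [self.root]
        for ch in text:
            # Follow the literal branch and the wildcard branch together
            keys = (ch,) if ch == WILDCARD else (ch, WILDCARD)
            next_frontier = []
            for node in frontier:
                for key in keys:
                    child = node.children.get(key)
                    if child is not None:
                        next_frontier.append(child)
            if not next_frontier:
                return []  # no stored pattern can match any more
            frontier = next_frontier
        return [p for node in frontier for p in node.patterns]

trie = WildcardTrie(["abc#e#", "abcd##", "a#c", "xyz"])
print(sorted(trie.lookup("abcdef")))
```

With this layout, patterns that share a prefix are examined once per lookup instead of once per pattern, which is where the potential gain over testing each pattern separately would come from.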