Python - find()函数的代价 [英] Python - Cost of find() function
问题描述
Python有效的方法来检查是否非常大的字符串包含一个子字符串
但是,我没有看到使用find()那里。使用find()函数是否昂贵?什么是时间复杂性?
我看过维基页面,但没有找到find()那里。
https://wiki.python.org/moin/TimeComplexity
快速浏览源代码,看起来像 str.find
调用 stringlib_find_slice
from 这里最终会调用 fastsearch
。实际的算法解释这里 - 与python伪代码(我从阅读评论)。看起来像在最坏的情况下执行O(N * M)(与天真的方法一样),但可以做O(N / M)在某些情况下(其中N和M分别是字符串和子字符串的长度)和O(N)在常见的情况下是这样的:。
1 (不要引用我的话 - 我只是剔除文档)
I am looking for an effective way to check if a short string is in a long string. I saw some suggestions on this thread: Python efficient way to check if very large string contains a substring
However, I didn't see the use of find() there. Is it expensive to use find() function? What is the time complexity?
I've looked at the Wiki page but didn't find find() there. https://wiki.python.org/moin/TimeComplexity
Quickly perusing the source, it appears that str.find
calls stringlib_find_slice
from here which eventually calls fastsearch
. The actual algorithm is explained here -- with python pseudo-code (which I gleaned from reading the comments).
It looks like the implementation is in worst case O(N*M) (The same as a naive approach), but can do O(N/M) in some cases (where N and M are the lengths of the string and substring respectively), and O(N) in frequent cases1.
1(don't quote me on it -- I only skimmed the document)
这篇关于Python - find()函数的代价的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!