迭代字符串的时间复杂性实际上是O(n ^ 2)还是O(n)? [英] Is the time-complexity of iterative string append actually O(n^2), or O(n)?

查看:89
本文介绍了迭代字符串的时间复杂性实际上是O(n ^ 2)还是O(n)?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在处理CTCI之外的问题。

I am working on a problem out of CTCI.

第1章的第三个问题是您是否使用了诸如

The third problem of chapter 1 has you take a string such as

'John Smith先生'

并要求您用<$替换中间空格c $ c>%20 :

'Mr%20John%20Smith'

作者使用Python提供此解决方案,称其为O(n):

The author offers this solution in Python, calling it O(n):

def urlify(string, length):
    '''function replaces single spaces with %20 and removes trailing spaces'''
    counter = 0
    output = ''
    for char in string:
        counter += 1
        if counter > length:
            return output
        elif char == ' ':
            output = output + '%20'
        elif char != ' ':
            output = output + char
    return output

我的问题:

我理解这是从左到右扫描实际字符串的O(n)。但是Python中的字符串不是一成不变的吗?如果我有一个字符串,并且使用 + 运算符向其中添加了另一个字符串,它是否分配了必要的空间,复制了原始空间,然后复制了追加内容

I understand that this is O(n) in terms of scanning through the actual string from left to right. But aren't strings in Python immutable? If I have a string and I add another string to it with the + operator, doesn't it allocate the necessary space, copy over the original, and then copy over the appending string?

如果我有每个长度为1的 n 个字符串的集合,则需要:

If I have a collection of n strings each of length 1, then that takes:

1 + 2 + 3 + 4 + 5 + ... + n = n(n + 1)/ 2

O(n ^ 2)时间,是吗?还是我在Python处理附加方式方面弄错了?

or O(n^2) time, yes? Or am I mistaken in how Python handles appending?

或者,如果您愿意教我如何钓鱼:我将如何为自己找到答案?我尝试向Google寻求官方消息一直没有成功。我发现 https://wiki.python.org/moin/TimeComplexity 但这不是

Alternatively, if you'd be willing to teach me how to fish: How would I go about finding this out for myself? I've been unsuccessful in my attempts to Google an official source. I found https://wiki.python.org/moin/TimeComplexity but this doesn't have anything on strings.

推荐答案

在CPython中,Python的标准实现中有一个实现细节,通常使它成为O(n ),在字节码评估循环要求<$ c的代码中实现$ c> + 或 + = 带有两个字符串操作数。如果Python检测到left参数没有其他引用,它将调用 realloc 来尝试通过调整字符串的大小来避免复制。这不是您应该依靠的东西,因为它是实现细节,并且因为如果 realloc 最终需要频繁移动字符串,则性能会降低到O(n ^ 2)

In CPython, the standard implementation of Python, there's an implementation detail that makes this usually O(n), implemented in the code the bytecode evaluation loop calls for + or += with two string operands. If Python detects that the left argument has no other references, it calls realloc to attempt to avoid a copy by resizing the string in place. This is not something you should ever rely on, because it's an implementation detail and because if realloc ends up needing to move the string frequently, performance degrades to O(n^2) anyway.

在没有怪异的实现细节的情况下,由于涉及的二次复制量,该算法为O(n ^ 2)。像这样的代码仅在具有可变字符串的语言(如C ++)中才有意义,即使在C ++中,您也想使用 + =

Without the weird implementation detail, the algorithm is O(n^2) due to the quadratic amount of copying involved. Code like this would only make sense in a language with mutable strings, like C++, and even in C++ you'd want to use +=.

这篇关于迭代字符串的时间复杂性实际上是O(n ^ 2)还是O(n)?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆