Python list.pop(i) time complexity?


Problem Description

I looked it up online and learned that list.pop() has O(1) time complexity but list.pop(i) has O(n) time complexity. While working on LeetCode problems, I see many people use pop(i) inside a for loop, claim it is O(n) time complexity, and in fact their code runs faster than mine, which uses only one loop but many lines inside that loop. Why does this happen, and should I use pop(i) instead of many lines to avoid it?

Example: Leetcode 26. Remove Duplicates from Sorted Array

My code: (faster than 75%)

class Solution(object):
    def removeDuplicates(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        left, right = 0, 0
        count = 1
        while right < len(nums)-1:
            if nums[right] == nums[right+1]:
                right += 1
            else:
                nums[left+1] = nums[right+1]
                left += 1
                right += 1
                count += 1
        return count

And other people's code, faster than 90% (the author does not claim O(n), but why is this O(n²) solution faster than my O(n) one?):

https://leetcode.com/problems/remove-duplicates-from-sorted-array/discuss/477370/python-3%3A-straight-forward-6-lines-solution-90-faster-100-less-memory

My optimized code (faster than 89%)

class Solution(object):
    def removeDuplicates(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        left, right = 0, 0
        while right < len(nums)-1:
            if nums[right] != nums[right+1]:
                nums[left+1] = nums[right+1]
                left += 1
            right += 1
        return left + 1

Solution

Your algorithm genuinely does take O(n) time and the "pop in reverse order" algorithm genuinely does take O(n²) time. However, LeetCode isn't reporting that your time complexity is better than 89% of submissions; it is reporting that your actual running time is better than 89% of all submissions. The actual running time depends on what inputs the algorithm is tested with: not just their sizes, but also the number of duplicates.

It also depends how the running times across multiple test cases are averaged; if most of the test cases are for small inputs where the quadratic solution is faster, then the quadratic solution may come out ahead overall even though its time complexity is higher. @Heap Overflow also points out in the comments that the overhead time of LeetCode's judging system is proportionally large and quite variable compared to the time it takes for the algorithms to run, so the discrepancy could simply be due to random variation in that overhead.
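
To see where the O(n) cost of list.pop(i) comes from: pop() only removes the last element, while pop(i) has to shift every element after index i one slot to the left. The following is a small timing sketch, not part of the original answer, that drains a list from each end (drain_from_end and drain_from_front are just illustrative helper names):

from timeit import timeit

def drain_from_end(lst):
    # pop() removes the last element: O(1) per call, O(n) to empty the list
    while lst:
        lst.pop()

def drain_from_front(lst):
    # pop(0) shifts all remaining elements left: O(n) per call, O(n^2) total
    while lst:
        lst.pop(0)

for n in (1000, 5000, 10000):
    t_end = timeit(lambda: drain_from_end(list(range(n))), number=10)
    t_front = timeit(lambda: drain_from_front(list(range(n))), number=10)
    print(n, round(1000*t_end/10, 3), round(1000*t_front/10, 3), sep='\t')

On CPython the pop() column should grow roughly linearly with n while the pop(0) column grows roughly quadratically, which is the same effect that eventually makes the pop-based solution lose on large inputs.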

To shed some light on this, I measured running times using timeit. Plotting the results, the curves have exactly the shapes you'd expect given the time complexities, and the crossover point is somewhere between n = 8000 and n = 9000 on my machine. This is based on sorted lists where each distinct element appears on average twice. The code I used to generate the times is given below.

Timing code:

def linear_solution(nums):
    # Two pointers: `left` marks the end of the deduplicated prefix,
    # `right` scans ahead; each element is visited once, so this is O(n).
    left, right = 0, 0
    while right < len(nums)-1:
        if nums[right] != nums[right+1]:
            nums[left+1] = nums[right+1]
            left += 1
        right += 1
    return left + 1

def quadratic_solution(nums):
    # Walk the list in reverse and pop duplicates in place; each pop(i)
    # shifts every element after index i, so the total work is O(n^2).
    prev_obj = []
    for i in range(len(nums)-1, -1, -1):
        if prev_obj == nums[i]:
            nums.pop(i)
        prev_obj = nums[i]
    return len(nums)

from random import randint
from timeit import timeit

def gen_list(n):
    max_n = n // 2
    return sorted(randint(0, max_n) for i in range(n))

# I used a step size of 1000 up to 15000, then a step size of 5000 up to 50000
step = 1000
max_n = 15000
reps = 100

print('n', 'linear time (ms)', 'quadratic time (ms)', sep='\t')
for n in range(step, max_n+1, step):
    # generate input lists
    lsts1 = [ gen_list(n) for i in range(reps) ]
    # copy the lists by value, since the algorithms will mutate them
    lsts2 = [ list(g) for g in lsts1 ]
    # use iterators to supply the input lists one-by-one to timeit
    iter1 = iter(lsts1)
    iter2 = iter(lsts2)
    t1 = timeit(lambda: linear_solution(next(iter1)), number=reps)
    t2 = timeit(lambda: quadratic_solution(next(iter2)), number=reps)
    # timeit reports the total time in seconds across all reps
    print(n, 1000*t1/reps, 1000*t2/reps, sep='\t')
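
For anyone who wants to reproduce the plot described above, here is a minimal sketch, assuming matplotlib is installed and the timing code above has already been run in the same session; it repeats the measurements but collects them into lists instead of printing:

import matplotlib.pyplot as plt

ns, linear_ms, quadratic_ms = [], [], []
for n in range(step, max_n+1, step):
    lsts1 = [ gen_list(n) for i in range(reps) ]
    lsts2 = [ list(g) for g in lsts1 ]
    iter1, iter2 = iter(lsts1), iter(lsts2)
    t1 = timeit(lambda: linear_solution(next(iter1)), number=reps)
    t2 = timeit(lambda: quadratic_solution(next(iter2)), number=reps)
    ns.append(n)
    linear_ms.append(1000 * t1 / reps)
    quadratic_ms.append(1000 * t2 / reps)

plt.plot(ns, linear_ms, label='linear solution')
plt.plot(ns, quadratic_ms, label='quadratic solution')
plt.xlabel('n (input length)')
plt.ylabel('time per call (ms)')
plt.legend()
plt.show()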

The conclusion is that your algorithm is indeed faster than the quadratic solution for large enough inputs, but the inputs LeetCode is using to measure running times are not "large enough" to overcome the variation in the judging overhead, and the fact that the average includes times measured on smaller inputs where the quadratic algorithm is faster.
