将项目插入不区分大小写的Python排序列表中 [英] Insert item into case-insensitive sorted list in Python

查看:159
本文介绍了将项目插入不区分大小写的Python排序列表中的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个已经按不区分大小写顺序排序的字符串列表.我想在列表中插入一个新字符串.一种方法是添加项目,然后对列表进行排序,如下所示:

I have a list of strings that is already sorted in case-insensitive order. I would like to insert a new string into the list. One way to do this is to append the item and then sort the list, like so:

myList.append('Something')
myList.sort(key=lambda s: s.lower())

但是我想知道是否有一种方法可以将项目插入正确的位置,而无需再次对整个项目进行排序.

But I was wondering if there is a way to just insert the item in the correct position without sorting the whole thing again.

我发现了以下问题:将项目插入Python .它指向Python的 bisect 模块.但是该模块似乎不支持不区分大小写.

I found this question: Insert an item into a sorted list in Python. It points towards Python's bisect module. But that module doesn't look like it can support case-insensitivity.

编辑:我测试了此处列出的几个答案.

Edit: I tested out several of the answers listed here.

  • 将项目追加到最后并对整个列表进行排序(按照原始问题的建议)是最慢的.
  • Moinuddin Quadri的答案比对整个列表进行排序要快,但由于对列表中的每个项目都运行lower(),因此答案仍然很慢.
  • Stefan Pochmann的答案比对整个列表进行排序要快一个数量级.
  • Jared Goguen的回答对于重复插入来说是最快的.但是,它第一次在每个元素上运行lower().
  • Appending the item to the end and sorting the whole list (as proposed in the original question) was the slowest.
  • Moinuddin Quadri's answer was faster than sorting the whole list, but it was still pretty slow, due to running lower() on every item on the list.
  • Stefan Pochmann's answer was an order of magnitude faster than sorting the whole list.
  • Jared Goguen's answer was the fastest for repeated insertions. The first time, though, it runs lower() on every element.

这是一个接听电话,接受一个答案.最后,我接受了Stefan Pochmann的回答,因为它是一次性插入的最佳选择,并且访问结果列表不需要访问成员变量.但是,用例会有所不同,因此请务必检查所有答案.

It was a close call to accept an answer. In the end, I went with Stefan Pochmann's answer because it was the best for a one-time insertion, and accessing the resulting list does not require accessing a member variable. However, use cases vary, so be sure to examine all the answers.

推荐答案

这是再次练习二进制搜索的好机会(或者只是复制并粘贴并修改bisect.insort,这正是我所做的):

It's a good opportunity to practice binary search again (or just copy&paste&modify bisect.insort, which is what I did):

def insort_case_insensitive(a, x):
    key = x.lower()
    lo, hi = 0, len(myList)
    while lo < hi:
        mid = (lo + hi) // 2
        if key < a[mid].lower():
            hi = mid
        else:
            lo = mid + 1
    a.insert(lo, x)

演示:

myList = ['a', 'b', 'c', 'd', 'e']
for x in 'A', 'B', 'C', 'D', 'E':
    insort_case_insensitive(myList, x)
print myList

打印:

['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D', 'e', 'E']

它是O(n),就像append + sort一样,只是因为结尾处是a.insert(lo, x).这是非常简单的,并且是用C语言完成的,因此它非常快.二进制搜索当然只需要O(log n)步,所以这也非常快. append + sort方法将在所有元素上调用.lower()并将它们进行比较,这两者都要慢得多.由于在所有元素上调用.lower(),@ MoinuddinQuadri的第一个解决方案也要慢得多.

It's O(n), just like append+sort would be, but only because of the a.insert(lo, x) at the end. Which is dead-simple and done in C, so it's super fast. The binary search of course only takes O(log n) steps, so that's super fast as well. The append+sort way would call .lower() on all elements and compare them, both of which is much slower. @MoinuddinQuadri's first solution is also much slower because of calling .lower() on all elements.

请参阅我的其他答案以进行基准比较.

See my other answer for a benchmarking comparison.

这篇关于将项目插入不区分大小写的Python排序列表中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆