将项目插入不区分大小写的Python排序列表中 [英] Insert item into case-insensitive sorted list in Python
问题描述
我有一个已经按不区分大小写顺序排序的字符串列表.我想在列表中插入一个新字符串.一种方法是添加项目,然后对列表进行排序,如下所示:
I have a list of strings that is already sorted in case-insensitive order. I would like to insert a new string into the list. One way to do this is to append the item and then sort the list, like so:
myList.append('Something')
myList.sort(key=lambda s: s.lower())
但是我想知道是否有一种方法可以将项目插入正确的位置,而无需再次对整个项目进行排序.
But I was wondering if there is a way to just insert the item in the correct position without sorting the whole thing again.
我发现了以下问题:将项目插入Python .它指向Python的 bisect 模块.但是该模块似乎不支持不区分大小写.
I found this question: Insert an item into a sorted list in Python. It points towards Python's bisect module. But that module doesn't look like it can support case-insensitivity.
编辑:我测试了此处列出的几个答案.
Edit: I tested out several of the answers listed here.
- 将项目追加到最后并对整个列表进行排序(按照原始问题的建议)是最慢的.
- Moinuddin Quadri的答案比对整个列表进行排序要快,但由于对列表中的每个项目都运行
lower()
,因此答案仍然很慢. - Stefan Pochmann的答案比对整个列表进行排序要快一个数量级.
- Jared Goguen的回答对于重复插入来说是最快的.但是,它第一次在每个元素上运行
lower()
.
- Appending the item to the end and sorting the whole list (as proposed in the original question) was the slowest.
- Moinuddin Quadri's answer was faster than sorting the whole list, but it was still pretty slow, due to running
lower()
on every item on the list. - Stefan Pochmann's answer was an order of magnitude faster than sorting the whole list.
- Jared Goguen's answer was the fastest for repeated insertions. The first time, though, it runs
lower()
on every element.
这是一个接听电话,接受一个答案.最后,我接受了Stefan Pochmann的回答,因为它是一次性插入的最佳选择,并且访问结果列表不需要访问成员变量.但是,用例会有所不同,因此请务必检查所有答案.
It was a close call to accept an answer. In the end, I went with Stefan Pochmann's answer because it was the best for a one-time insertion, and accessing the resulting list does not require accessing a member variable. However, use cases vary, so be sure to examine all the answers.
推荐答案
这是再次练习二进制搜索的好机会(或者只是复制并粘贴并修改bisect.insort
,这正是我所做的):
It's a good opportunity to practice binary search again (or just copy&paste&modify bisect.insort
, which is what I did):
def insort_case_insensitive(a, x):
key = x.lower()
lo, hi = 0, len(myList)
while lo < hi:
mid = (lo + hi) // 2
if key < a[mid].lower():
hi = mid
else:
lo = mid + 1
a.insert(lo, x)
演示:
myList = ['a', 'b', 'c', 'd', 'e']
for x in 'A', 'B', 'C', 'D', 'E':
insort_case_insensitive(myList, x)
print myList
打印:
['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D', 'e', 'E']
它是O(n),就像append + sort一样,只是因为结尾处是a.insert(lo, x)
.这是非常简单的,并且是用C语言完成的,因此它非常快.二进制搜索当然只需要O(log n)步,所以这也非常快. append + sort方法将在所有元素上调用.lower()
并将它们进行比较,这两者都要慢得多.由于在所有元素上调用.lower()
,@ MoinuddinQuadri的第一个解决方案也要慢得多.
It's O(n), just like append+sort would be, but only because of the a.insert(lo, x)
at the end. Which is dead-simple and done in C, so it's super fast. The binary search of course only takes O(log n) steps, so that's super fast as well. The append+sort way would call .lower()
on all elements and compare them, both of which is much slower. @MoinuddinQuadri's first solution is also much slower because of calling .lower()
on all elements.
请参阅我的其他答案以进行基准比较.
See my other answer for a benchmarking comparison.
这篇关于将项目插入不区分大小写的Python排序列表中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!