Counter.most_common(n): how to override arbitrary ordering
Question
Can I accomplish a rank/sort using Counter.most_common() functionality, thus avoiding this line: d = sorted(d.items(), key=lambda x: (-x[1], x[0]), reverse=False)?
Challenge: You are given a string. The string contains only lowercase English alphabet characters. Your task is to find the top three most common characters in the string.
Output Format: Print the three most common characters along with their occurrence count, each on a separate line. Sort the output in descending order of occurrence count. If the occurrence count is the same, sort the characters in ascending order.
In completing this I used dict, Counter, and sorted in order to satisfy the requirement "if the occurrence count is the same, sort the characters in ascending order". Python's built-in sorted function ensures ordering by count, then alphabetically. I'm curious whether there is a way to override the default arbitrary sort/order logic of Counter.most_common(), as it seems to disregard the lexicographical order of the results when picking the top 3.
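import sys
from collections import Counter

string = sys.stdin.readline().strip()
# most_common(3) picks the top three first; ties at the cutoff are resolved in its own order.
d = dict(Counter(string).most_common(3))
# Re-sort the three survivors: count descending, then letter ascending.
d = sorted(d.items(), key=lambda x: (-x[1], x[0]), reverse=False)
for letter, count in d[:3]:
    print(letter, count)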
Recommended answer
Yes, the documentation explicitly says that the (tiebreaker) order of Counter.most_common() for equal counts is arbitrary.
- UPDATE: PM2Ring told me that Counter inherits dict's ordering. Insertion ordering only happens in 3.6+ and is only guaranteed from 3.7, so it's possible the doc is lagging.
- In CPython 3.6+ ties fall back on original insertion order (see bottom), but don't rely on that implementation because, per the spec, it's not defined behavior. Best to do your own sort, as you say, if you want totally deterministic behavior.
- I show at the bottom how you can monkey-patch Counter.most_common with your own sort function like the one you show, but that's frowned upon. (Code you write might accidentally rely on the patch and hence break when it isn't applied.)
- You could subclass Counter to MyCounter so you can override its most_common. Painful and not really portable.
- Really, the best approach is just to write code and tests that don't rely on the arbitrary tiebreaker order from most_common().
- I agree that most_common() should not have been hardwired, and we should be able to pass a comparison key or sort function into __init__().
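The subclassing option from the list above could look roughly like the following. This is only a minimal sketch: MyCounter and its tie-break rule (count descending, then key ascending) are illustrative choices rather than anything the standard library provides, and it assumes the keys can be compared with each other.

```python
from collections import Counter


class MyCounter(Counter):
    """Hypothetical Counter subclass with a deterministic tie-break."""

    def most_common(self, n=None):
        # Count descending, then key ascending; keys must be mutually orderable.
        ranked = sorted(self.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked if n is None else ranked[:n]


top_three = MyCounter("ccbaabdd").most_common(3)
print(top_three)
```

Only code that explicitly constructs MyCounter gets the new ordering, which keeps the change local compared with patching Counter globally.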
Monkey-patching Counter.most_common():
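import collections

def patched_most_common(self, n=None):
    # Count descending, then key ascending; n=None returns every item.
    ranked = sorted(self.items(), key=lambda x: (-x[1], x[0]))
    return ranked if n is None else ranked[:n]

collections.Counter.most_common = patched_most_common
collections.Counter('ccbaab')
Counter({'a': 2, 'b': 2, 'c': 2})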
Demonstrating that in CPython 3.7, the arbitrary order is the order of insertion (first insertion of each character):
Counter('abccba').most_common()
[('a', 2), ('b', 2), ('c', 2)]
Counter('ccbaab').most_common()
[('c', 2), ('b', 2), ('a', 2)]
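For contrast, here is a minimal sketch of the "do your own sort" approach, which gives the same result whatever the insertion order. The helper name top_n and the sample strings are made up for illustration. It sorts all items before slicing, so the tie-break also decides which characters make the top three.

```python
from collections import Counter


def top_n(text, n=3):
    """Hypothetical helper: n most frequent characters, ties broken alphabetically."""
    counts = Counter(text)
    # Sort every item first, then slice, so the tie-break decides who makes the cut.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


for letter, count in top_n("aabbbccde"):
    print(letter, count)
```

Because this never calls most_common(), it does not depend on how any particular Python version orders equal counts.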