How to sort a NumPy array by frequency?


Question

I am trying to sort a NumPy array by the frequency of its elements. For example, given the array [3,4,5,1,2,4,1,1,2,4], the output should be another NumPy array containing each distinct element once, ordered from most common to least common. Here the result would be [4,1,2,3,5]. If two elements occur the same number of times, the one that appears first in the input is placed first in the output. I have tried to do this, but I can't seem to get a working answer. Here is my code so far:

temp1 = problems[j]
indexes = np.unique(temp1, return_index = True)[1]
temp2 = temp1[np.sort(indexes)]
temp3 = np.unique(temp1, return_counts = True)[1]
temp4 = np.argsort(temp3)[::-1] + 1

where problems[j] is a NumPy array like [3,4,5,1,2,4,1,1,2,4]. So far temp4 returns [4,1,2,5,3], which is incorrect: the code does not handle the case where two elements have the same number of occurrences.
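To make the tie problem concrete, the intermediate arrays can be inspected. The following is only an illustrative sketch: `arr` is a stand-in for problems[j], and `kind="stable"` is an assumption added here so that the tie order is deterministic (the original code uses the default sort).

```python
import numpy as np

# Stand-in for problems[j] from the question
arr = np.array([3, 4, 5, 1, 2, 4, 1, 1, 2, 4])

# np.unique returns the distinct values in ascending order, with their counts
values, counts = np.unique(arr, return_counts=True)
# values is [1 2 3 4 5], counts is [3 2 1 3 1]

# Ascending stable sort by count, then reversed to get descending counts.
# Reversing also reverses the order within each group of equal counts,
# so ties end up ordered by descending value, not by first appearance.
order = np.argsort(counts, kind="stable")[::-1]

# Index into values instead of adding 1, which only works when the
# distinct values happen to be exactly 1..n
result = values[order]
print(result)  # [4 1 2 5 3]
```

The elements 3 and 5 both occur once, and 5 comes out ahead of 3 even though 3 appears first in the input, which is the behavior described above.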

Recommended answer

A non-NumPy solution, which still works with NumPy arrays, is to use an OrderedCounter followed by sorted with a custom key function:

from collections import OrderedDict, Counter

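class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order in which elements are first seen."""
    pass

L = [3,4,5,1,2,4,1,1,2,4]

c = OrderedCounter(L)
keys = list(c)  # distinct elements in order of first appearance

# Sort by descending count, breaking ties by position of first appearance
res = sorted(c, key=lambda x: (-c[x], keys.index(x)))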

print(res)

[4, 1, 2, 3, 5]
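For readers who would rather stay within NumPy, the same ordering can probably be obtained with np.unique and np.lexsort. This is a minimal sketch and not part of the original answer; the function name `sort_by_frequency` is a hypothetical helper introduced here for illustration.

```python
import numpy as np

def sort_by_frequency(arr):
    """Return the distinct elements of arr, most frequent first.

    Ties are broken by the position of each element's first appearance.
    (Hypothetical helper, not from the original answer.)
    """
    # values: sorted distinct elements
    # first_idx: index of the first occurrence of each value in arr
    # counts: number of occurrences of each value
    values, first_idx, counts = np.unique(
        arr, return_index=True, return_counts=True
    )
    # np.lexsort treats the LAST key as the primary sort key:
    # primary key is -counts (descending frequency),
    # secondary key is first_idx (earlier first appearance wins ties).
    order = np.lexsort((first_idx, -counts))
    return values[order]

a = np.array([3, 4, 5, 1, 2, 4, 1, 1, 2, 4])
print(sort_by_frequency(a))  # [4 1 2 3 5]
```

Unlike the `keys.index(x)` lookup in the answer above, which scans a list on every key evaluation, this version resolves ties through the `first_idx` array, which may matter for inputs with many distinct values.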
