Find most frequent row or mode of a matrix of vectors - Python / NumPy

Problem description

I have a NumPy array of shape (?, n) that represents a collection of n-dimensional vectors, one per row.

I want to find the most frequent row.

So far, the best way seems to be to iterate over all the rows and keep a count of each, but it seems absurd that neither NumPy nor SciPy would have something built in for this task.
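For reference, the counting loop described above might look like the following. This is only a minimal sketch: the name `mode_rows_loop` is hypothetical, and it assumes a 2-D array whose rows can be converted to hashable tuples.

```python
from collections import Counter

import numpy as np


def mode_rows_loop(a):
    """Return the most frequent row of a 2-D array using a plain counting loop."""
    # Rows are converted to tuples so they can be used as dictionary keys.
    counts = Counter(tuple(row) for row in a.tolist())
    # most_common(1) yields a list with one (row, count) pair.
    most_common_row, _ = counts.most_common(1)[0]
    return np.array(most_common_row, dtype=a.dtype)


a = np.array([[1, 2], [3, 4], [1, 2], [5, 6]])
print(mode_rows_loop(a))  # [1 2]
```

This runs in Python rather than in compiled NumPy code, which is why a vectorized alternative is attractive for large arrays.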

Recommended answer

Here's an approach using NumPy views, which should be pretty efficient -

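import numpy as np

def mode_rows(a):
    # Ensure a C-contiguous layout so each row occupies one block of memory.
    a = np.ascontiguousarray(a)
    # View every row as a single opaque (void) element spanning the whole row.
    void_dt = np.dtype((np.void, a.dtype.itemsize * int(np.prod(a.shape[1:]))))
    # Unique rows, with the index of each first occurrence and its count.
    _, ids, count = np.unique(a.view(void_dt).ravel(),
                              return_index=True, return_counts=True)
    # Index into `a` of the row that occurs most often.
    largest_count_id = ids[count.argmax()]
    most_frequent_row = a[largest_count_id]
    return most_frequent_row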

Sample run -

In [45]: # Build a random array in which three rows (2, 4, 8) and two rows (1, 7)
    ...: # are duplicates. The most frequent row is therefore the one at index 2.
    ...: a = np.random.randint(0,9,(9,5))
    ...: a[4] = a[8]
    ...: a[2] = a[4]
    ...: 
    ...: a[1] = a[7]
    ...: 

In [46]: a
Out[46]: 
array([[8, 8, 7, 0, 7],
       [7, 8, 2, 6, 1],
       [2, 2, 5, 7, 6],
       [6, 5, 8, 8, 5],
       [2, 2, 5, 7, 6],
       [5, 7, 3, 6, 3],
       [2, 8, 7, 2, 0],
       [7, 8, 2, 6, 1],
       [2, 2, 5, 7, 6]])

In [47]: mode_rows(a)
Out[47]: array([2, 2, 5, 7, 6])
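
As a possible alternative, NumPy 1.13 and later accept an `axis` argument in `np.unique`, which can count unique rows directly without the void view. The sketch below is not part of the original answer; the name `mode_rows_unique` is hypothetical, and it reuses the array from the sample run above.

```python
import numpy as np


def mode_rows_unique(a):
    """Return the most frequent row of a 2-D array via np.unique(axis=0)."""
    a = np.asarray(a)
    # rows: the unique rows in sorted order; counts: occurrences of each.
    rows, counts = np.unique(a, axis=0, return_counts=True)
    return rows[counts.argmax()]


a = np.array([[8, 8, 7, 0, 7],
              [7, 8, 2, 6, 1],
              [2, 2, 5, 7, 6],
              [6, 5, 8, 8, 5],
              [2, 2, 5, 7, 6],
              [5, 7, 3, 6, 3],
              [2, 8, 7, 2, 0],
              [7, 8, 2, 6, 1],
              [2, 2, 5, 7, 6]])
print(mode_rows_unique(a))  # [2 2 5 7 6]
```

When several rows tie for the highest count, this version returns whichever comes first in the sorted order of unique rows, which is not necessarily the row that appears first in the input.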
