Matrix search operation using numpy and pandas

Question

I am trying to search for each value of one matrix in a second matrix and replace it with the corresponding value from that second matrix.

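import numpy as np

# Matrix whose values are looked up
ds1 = np.array([[ 4, 13,  6,  9],
                [ 7, 12,  5,  7],
                [ 7,  0,  4, 22],
                [ 9,  8, 12,  0]])
# Lookup table: column 0 is the key, column 1 is the replacement value
ds2 = np.array([[ 4,  1],
                [ 5,  3],
                [ 6,  1],
                [ 7,  2],
                [ 8,  2],
                [ 9,  3],
                [12,  1],
                [13,  2],
                [22,  3]])
# Expected result
output = np.array([[1, 2, 1, 3],
                   [2, 1, 3, 2],
                   [2, 0, 1, 3],
                   [3, 2, 1, 0]])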

Here is the code:

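out = ds1.copy()
# For each flattened ds1 element that matches a key, C holds the matching row of ds2
_,C = np.where(ds1.ravel()[:,None] == ds2[:,0])
newvals = ds2[C,1]
# Mask of the flattened ds1 elements that appear among the ds2 keys
valid = np.in1d(ds1.ravel(),ds2[:,0])
out.ravel()[valid] = newvals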

output is the result of replacing each value in ds1 that matches a key in ds2 (first column) with the value paired with that key (second column). Values with no matching key, such as 0, are left unchanged. I did the same thing with my actual matrix values:

ds1 = pd.read_table('https://gist.githubusercontent.com/karimkhanp/9527bad750fbe75e072c/raw/ds1', sep=' ', header=None)
ds2 = pd.read_table('https://gist.githubusercontent.com/karimkhanp/1692f1f76718c35e939f/raw/6f6b348ab0879b702e1c3c5e362e9d2062e9e9bc/ds2', header=None, sep=' ')

With these DataFrames I get:

   _,C = np.where(ds1.ravel()[:,None] == ds2[:,0])
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 1947, in __getattr__
    (type(self).__name__, name))
AttributeError: 'DataFrame' object has no attribute 'ravel'

I also tried converting them to numpy arrays first:

ds1 = np.array(ds1)
ds2 = np.array(ds2)
_,C = np.where(ds1.values.ravel()[:,None] == ds2.values[:,0])

which gives:

AttributeError                            Traceback (most recent call last)
<ipython-input-39-6a80d7cd7f81> in <module>()
----> 1 _,C = np.where(ds1.values.ravel()[:,None] == ds2.values[:,0])

AttributeError: 'numpy.ndarray' object has no attribute 'values'

Any suggestions or help would be much appreciated.

Recommended answer

values is an attribute of a pandas DataFrame, not of a numpy ndarray. So in your second method, don't convert ds1 and ds2 to numpy arrays. Just remove these two lines:

ds1 = np.array(ds1)
ds2 = np.array(ds2)

After that,

_,C = np.where(ds1.values.ravel()[:,None] == ds2.values[:,0])

should work.
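The distinction can be checked on the small sample from the question. The following is a minimal sketch, not the asker's actual script: it builds the DataFrames inline instead of downloading the gist files, and it uses np.isin where the question uses np.in1d (an assumption about the installed NumPy version; np.isin is the spelling recommended in newer releases).

```python
import numpy as np
import pandas as pd

# Small sample from the question, held as DataFrames (as pd.read_table would return)
ds1 = pd.DataFrame([[4, 13, 6, 9],
                    [7, 12, 5, 7],
                    [7, 0, 4, 22],
                    [9, 8, 12, 0]])
ds2 = pd.DataFrame([[4, 1], [5, 3], [6, 1], [7, 2], [8, 2],
                    [9, 3], [12, 1], [13, 2], [22, 3]])

# .values exists on the DataFrame and returns an ndarray;
# .ravel exists on the ndarray but not on the DataFrame
a1 = ds1.values
a2 = ds2.values
has_ravel_on_frame = hasattr(ds1, "ravel")     # False
has_values_on_array = hasattr(a1, "values")    # False

# Same lookup as in the question, applied to the underlying arrays
flat = a1.ravel()
_, C = np.where(flat[:, None] == a2[:, 0])     # matching ds2 row per matched element
valid = np.isin(flat, a2[:, 0])                # which elements have a key in ds2
out = a1.copy()                                # C-contiguous copy, so ravel() is a view
out.ravel()[valid] = a2[C, 1]
print(out)
```

This relies on the keys in the first column of ds2 being unique, so that each matched element contributes exactly one entry to C.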

----------------- This is a test on my machine -------------------

My script is:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys
import pandas as pd
import numpy as np

ds1 = pd.read_table('https://gist.githubusercontent.com/karimkhanp/9527bad750fbe75e072c/raw/ds1', sep=' ', header=None)
ds2 = pd.read_table('https://gist.githubusercontent.com/karimkhanp/1692f1f76718c35e939f/raw/6f6b348ab0879b702e1c3c5e362e9d2062e9e9bc/ds2', header=None, sep=' ')

print ds1.shape, ds2.shape
_,C = np.where(ds1.values.ravel()[:,None] == ds2.values[:,0])
print C

and the output is:

(1000, 1001) (4000, 2)
[  10   35   60 ..., 3869 3938 3987]

My environment is Cygwin and Python 2.7.9.
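One practical note on the shapes printed above: the broadcast comparison builds a boolean array with ds1.size x len(ds2) elements, which for a 1000 x 1001 matrix and 4000 keys is roughly 4 x 10^9 booleans (about 4 GB). If that is too much, a different technique can give the same result with far less memory. The sketch below is not part of the original answer; it uses np.searchsorted on the sorted keys, the helper name lookup_replace is hypothetical, and it assumes a non-empty lookup table with unique keys.

```python
import numpy as np

def lookup_replace(values, table):
    """Replace entries of `values` found in table[:, 0] with table[:, 1].

    Hypothetical helper; assumes `table` is non-empty and its keys are unique.
    Entries without a matching key are left unchanged.
    """
    values = np.asarray(values)
    table = np.asarray(table)

    # Sort the lookup table by key so binary search can be used
    order = np.argsort(table[:, 0])
    keys = table[order, 0]
    mapped = table[order, 1]

    flat = values.ravel()
    # Candidate position of each element among the sorted keys
    pos = np.searchsorted(keys, flat)
    pos = np.clip(pos, 0, len(keys) - 1)   # keep indices in range
    found = keys[pos] == flat              # True only for exact key matches

    out = flat.copy()
    out[found] = mapped[pos[found]]
    return out.reshape(values.shape)

ds1 = np.array([[4, 13, 6, 9],
                [7, 12, 5, 7],
                [7, 0, 4, 22],
                [9, 8, 12, 0]])
ds2 = np.array([[4, 1], [5, 3], [6, 1], [7, 2], [8, 2],
                [9, 3], [12, 1], [13, 2], [22, 3]])
result = lookup_replace(ds1, ds2)
print(result)
```

With DataFrames loaded by pd.read_table, the call would be lookup_replace(ds1.values, ds2.values). The working memory here grows with ds1.size plus len(ds2) rather than with their product.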
