Select pandas DataFrame rows based on two columns' values


Problem description

I wish to select some specific rows based on the values of two columns. For example:

import numpy as np
import pandas as pd

d = {'user': [1., 2., 3., 4.], 'item': [5., 6., 7., 8.], 'f1': [9., 16., 17., 18.], 'f2': [4, 5, 6, 5], 'f3': [4, 5, 5, 8]}
df = pd.DataFrame(d)
print(df)

Out:
   f1  f2  f3  item  user
0   9   4   4     5     1
1  16   5   5     6     2
2  17   6   5     7     3
3  18   5   8     8     4

I want to select the rows based on the values of 'user' and 'item'. Given a 2D NumPy array that stores the [user, item] value pairs:

samples = np.array([[1,5],[3,7],[3,7],[2,6]]) 
Out: 
array([[1, 5],
       [3, 7],
       [3, 7],
       [2, 6]])

The expected output is then:

Out:
   f1  f2  f3  item  user
0   9   4   4     5     1
2  17   6   5     7     3
2  17   6   5     7     3
1  16   5   5     6     2

Then, my final objective is to get a 2D NumPy array that stores all the column values except item and user, which is:

Out: 
array([[9, 4, 4],
       [17, 6, 5],
       [17, 6, 5],
       [16, 5, 5]])

As we can see, these are the values of columns f1, f2, f3.

How can I do this?

Recommended answer

If you make samples a DataFrame with columns user and item, then you can obtain the desired values with an inner join. By default, pd.merge joins on all columns that samples and df have in common, which in this case are user and item. Hence,

result = pd.merge(samples, df, how='inner')

yields

   user  item  f1  f2  f3
0     1     5   9   4   4
1     3     7  17   6   5
2     3     7  17   6   5
3     2     6  16   5   5
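
The shared-column default described above can also be spelled out. The following is a minimal sketch, assuming the same df and samples as in the question; the `on=['user', 'item']` argument and the variable name `features` are additions for illustration and not part of the original answer. Naming the keys explicitly guards against an accidental extra join key if the two frames later gain another column in common.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'user': [1., 2., 3., 4.],
    'item': [5., 6., 7., 8.],
    'f1': [9., 16., 17., 18.],
    'f2': [4, 5, 6, 5],
    'f3': [4, 5, 5, 8],
})
samples = pd.DataFrame(
    np.array([[1, 5], [3, 7], [3, 7], [2, 6]]),
    columns=['user', 'item'],
)

# Name the join keys explicitly instead of relying on the shared-column default.
result = pd.merge(samples, df, on=['user', 'item'], how='inner')

# Keep only the feature columns and convert them to a 2D NumPy array.
# `features` is an illustrative name, not one used in the original answer.
features = result[['f1', 'f2', 'f3']].to_numpy()
print(features)
```

Because f1 holds floats, the resulting array has a float dtype even though f2 and f3 were built from integers.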


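Putting it all together:

import numpy as np
import pandas as pd

d = {'user': [1., 2., 3., 4.], 'item': [5., 6., 7., 8.], 'f1': [9., 16., 17., 18.], 'f2': [4, 5, 6, 5], 'f3': [4, 5, 5, 8]}
df = pd.DataFrame(d)
samples = np.array([[1, 5], [3, 7], [3, 7], [2, 6]])
# Give the pairs the same column names as df so that merge can match on them.
samples = pd.DataFrame(samples, columns=['user', 'item'])

# Inner join on the shared columns user and item.
result = pd.merge(samples, df, how='inner')
# Keep only the feature columns and convert them to a 2D NumPy array.
result = result[['f1', 'f2', 'f3']]
result = result.values
print(result)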

yields

[[  9.   4.   4.]
 [ 17.   6.   5.]
 [ 17.   6.   5.]
 [ 16.   5.   5.]]
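
One property of the inner join is worth keeping in mind: a [user, item] pair that does not occur in df is dropped silently, so the result can have fewer rows than samples. The sketch below shows one way to detect such pairs using a left join with pd.merge's `indicator=True` option. The pair [9, 9] and the names `checked` and `missing` are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'user': [1., 2., 3., 4.],
    'item': [5., 6., 7., 8.],
    'f1': [9., 16., 17., 18.],
    'f2': [4, 5, 6, 5],
    'f3': [4, 5, 5, 8],
})
# Hypothetical sample list: the pair [9, 9] does not occur in df.
samples = pd.DataFrame(
    np.array([[1, 5], [9, 9], [2, 6]]),
    columns=['user', 'item'],
)

# A left join keeps every row of samples. indicator=True adds a `_merge`
# column that is 'both' for matched pairs and 'left_only' for unmatched ones.
checked = pd.merge(samples, df, on=['user', 'item'], how='left', indicator=True)

# Pairs from samples that have no matching row in df.
missing = checked.loc[checked['_merge'] == 'left_only', ['user', 'item']]
print(missing)
```

If `missing` is empty, the inner join returns exactly one row per sample pair, provided each [user, item] pair is unique in df.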
