Selecting only required keys from a dictionary using a DataFrame


Problem description

I have a DataFrame of products and their status, as shown below.

DataFrame:

products    status
11  sale
22  sale
33  notsale
44  notsale
55  notsale
66  removed
77  removed
88  notsale
99  sale
222 sale
333 removed
444 removed
555 notsale

I also have user data stored as a dictionary that maps each user to the list of products they are interested in.

{1: [11,22,33,555,33], 2:[33,66,77,88,99],3:[11,88,99,222,333,555],4:[333,33,444,44],5:[333,444,22,33,44,55,66]}

What I need to do is remove the products whose status is removed, as well as any duplicates, from each user's list of interests in the dictionary above.

Expected output:

{1: [11, 22, 33, 555], 2: [33, 88, 99], 3: [11, 88, 99, 222, 555], 4: [33, 44], 5: [22, 33, 44, 55]}

Recommended answer

First use boolean indexing to select the products whose status is removed. Then, in a dict comprehension, convert each user's list to a set to drop duplicates and subtract the removed products a:

# Products whose status is 'removed'
a = df.loc[df['status'] == 'removed', 'products'].tolist()
print(a)
# [66, 77, 333, 444]

d = {1: [11,22,33,555,33], 2:[33,66,77,88,99], 
     3:[11,88,99,222,333,555], 4:[333,33,444,44],5:[333,444,22,33,44,55,66]}

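# The set difference drops duplicates and removed products in one step
d1 = {k: list(set(v) - set(a)) for k, v in d.items()}
print(d1)
# Sets are unordered, so element order within each list is not guaranteed:
# {1: [33, 11, 22, 555], 2: [88, 33, 99],
#  3: [11, 555, 99, 222, 88], 4: [33, 44], 5: [33, 44, 22, 55]}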
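
The set-based result above contains the right products, but not in the order shown in the expected output. If the original order of each list matters, one possible variation is sketched below. It is not part of the original answer: the names `removed` and `d_ordered` are made up for illustration, and it assumes Python 3.7+, where `dict.fromkeys` keeps the first occurrence of each key in insertion order.

```python
import pandas as pd

# Rebuild the question's data so the sketch runs on its own.
df = pd.DataFrame({
    "products": [11, 22, 33, 44, 55, 66, 77, 88, 99, 222, 333, 444, 555],
    "status": ["sale", "sale", "notsale", "notsale", "notsale", "removed",
               "removed", "notsale", "sale", "sale", "removed", "removed",
               "notsale"],
})
d = {1: [11, 22, 33, 555, 33], 2: [33, 66, 77, 88, 99],
     3: [11, 88, 99, 222, 333, 555], 4: [333, 33, 444, 44],
     5: [333, 444, 22, 33, 44, 55, 66]}

# Products to drop, held in a set for fast membership tests
# (hypothetical name, not from the original answer).
removed = set(df.loc[df["status"] == "removed", "products"].tolist())

# dict.fromkeys(v) de-duplicates while keeping first-seen order;
# the list comprehension then filters out the removed products.
d_ordered = {k: [p for p in dict.fromkeys(v) if p not in removed]
             for k, v in d.items()}
print(d_ordered)
```

With the question's data this should reproduce the expected output, including the order of each list.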

To filter on multiple status values, use isin:

a = df.loc[df['status'].isin(['removed', 'notsale']), 'products'].tolist()
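
As a rough end-to-end illustration of the `isin` variant, the sketch below rebuilds the question's data and drops every product whose status is either removed or notsale. The names `excluded` and `d2` are illustrative only, and sorting each result is an added choice to make the output deterministic, not something the original answer does.

```python
import pandas as pd

# Rebuild the question's data so the sketch runs on its own.
df = pd.DataFrame({
    "products": [11, 22, 33, 44, 55, 66, 77, 88, 99, 222, 333, 444, 555],
    "status": ["sale", "sale", "notsale", "notsale", "notsale", "removed",
               "removed", "notsale", "sale", "sale", "removed", "removed",
               "notsale"],
})
d = {1: [11, 22, 33, 555, 33], 2: [33, 66, 77, 88, 99],
     3: [11, 88, 99, 222, 333, 555], 4: [333, 33, 444, 44],
     5: [333, 444, 22, 33, 44, 55, 66]}

# Products whose status matches any of the listed values.
excluded = df.loc[df["status"].isin(["removed", "notsale"]), "products"].tolist()

# Set difference removes duplicates and excluded products;
# sorted() is an added step to give each list a stable order.
d2 = {k: sorted(set(v) - set(excluded)) for k, v in d.items()}
print(d2)
# {1: [11, 22], 2: [99], 3: [11, 99, 222], 4: [], 5: [22]}
```

Note that user 4 ends up with an empty list, because none of that user's products has the status sale.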
