Filtering a list of dictionaries based on multiple values


Question

I have a list of dictionaries that I would like to filter based on multiple criteria. A shortened version of the list looks like so:

orders = [{"name": "v", "price": 123, "location": "Mars"}, 
          {"name": "x", "price": 223, "location": "Mars"}, 
          {"name": "x", "price": 124, "location": "Mars"}, 
          {"name": "y", "price": 456, "location": "Mars"}, 
          {"name": "z", "price": 123, "location": "Mars"}, 
          {"name": "z", "price": 5623, "location": "Mars"}]

I am looking to end up with a list that contains, for each distinct "name" value, the dictionary with the lowest price. For example, the above would become:

minimums = [{"name": "v", "price": 123, "location": "Mars"},
            {"name": "x", "price": 124, "location": "Mars"},
            {"name": "y", "price": 456, "location": "Mars"},
            {"name": "z", "price": 123, "location": "Mars"}]

I have accomplished this with an abomination of nested if-statements and for-loops; however, I was hoping there was a more "Pythonic" way of achieving this.

Either reusing the same list or creating a new one is fine.

Thank you for your help.

Thank you for the answers. I tried timing each of them with the following code:

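# Imports needed by the timing code below
import time
from itertools import groupby

import pandas as pd

print("Number of dictionaries in orders: " + str(len(orders)))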

t0 = time.time()
sorted_orders = sorted(orders, key=lambda i: i["name"])
t1 = time.time()
sorting_time = (t1 - t0)

t0 = time.time()
listcomp_wikiben = [x for x in orders if all(x["price"] <= y["price"] for y  in orders if x["name"] == y["name"])]
t1 = time.time()
print("listcomp_wikiben: " + str(t1 - t0))

t0 = time.time()
itertools_MrGeek = [min(g[1], key=lambda x: x['price']) for g in groupby(sorted_orders, lambda o: o['name'])]
t1 = time.time()
print("itertools_MrGeek: " + str(t1 - t0 + sorting_time))

t0 = time.time()
itertools_Cory = [min(g, key=lambda j: j["price"]) for k,g in groupby(sorted_orders, key=lambda i: i["name"])]
t1 = time.time()
print("itertools_CoryKramer: " + str(t1 - t0 + sorting_time))

t0 = time.time()
pandas_Trenton = pd.DataFrame(orders)
pandas_Trenton.groupby(['name'])['price'].min()
t1 = time.time()
print("pandas_Trenton_M: " + str(t1 - t0))

The results were:

Number of dictionaries in orders: 20867
listcomp_wikiben:     39.78123s
itertools_MrGeek:      0.01562s
itertools_CoryKramer:  0.01565s
pandas_Trenton_M:      0.29685s
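
Single time.time() measurements like the ones above can be noisy. As a rough sketch (not the code that produced the numbers above), the standard-library timeit module can repeat a measurement and report the fastest run. The helper names make_orders, lowest_by_groupby and best_time, as well as the synthetic data, are assumptions made for illustration only.

```python
import random
import timeit
from itertools import groupby


def make_orders(n, n_names=100, seed=0):
    """Build n synthetic orders (hypothetical data, not the original list)."""
    rng = random.Random(seed)
    return [
        {"name": "item%d" % rng.randrange(n_names),
         "price": rng.randint(1, 10000),
         "location": "Mars"}
        for _ in range(n)
    ]


def lowest_by_groupby(orders):
    """Sort by name, then keep the cheapest order in each group."""
    by_name = sorted(orders, key=lambda o: o["name"])
    return [min(group, key=lambda o: o["price"])
            for _, group in groupby(by_name, key=lambda o: o["name"])]


def best_time(func, *args, repeat=5):
    """Return the fastest of `repeat` single runs, in seconds."""
    return min(timeit.repeat(lambda: func(*args), number=1, repeat=repeat))


if __name__ == "__main__":
    sample = make_orders(20000)
    print("groupby: %.5fs" % best_time(lowest_by_groupby, sample))
```

Taking the minimum over several runs reduces the influence of unrelated system activity, and including the sort inside the timed function keeps the comparison with the non-sorting approaches fair.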

Recommended answer

If you first sort your list by "name", you can use itertools.groupby to group them, then use min with a lambda to find the minimum "price" in each group.

>>> from itertools import groupby
>>> sorted_orders = sorted(orders, key=lambda i: i["name"])
>>> [min(g, key=lambda j: j["price"]) for k,g in groupby(sorted_orders , key=lambda i: i["name"])]
[{'name': 'v', 'price': 123, 'location': 'Mars'},
 {'name': 'x', 'price': 124, 'location': 'Mars'},
 {'name': 'y', 'price': 456, 'location': 'Mars'},
 {'name': 'z', 'price': 123, 'location': 'Mars'}]
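
If sorting is not wanted, a plain dictionary keyed by "name" can track the cheapest order seen so far in a single pass. This is a sketch of an alternative, not part of the answer above; the function name lowest_price_per_name is hypothetical, and the result order relies on dictionaries preserving insertion order (Python 3.7+).

```python
def lowest_price_per_name(orders):
    """Return the cheapest order for each name, in first-seen name order."""
    cheapest = {}
    for order in orders:
        current = cheapest.get(order["name"])
        # Keep the first order seen for a name unless a strictly cheaper one appears.
        if current is None or order["price"] < current["price"]:
            cheapest[order["name"]] = order
    return list(cheapest.values())


orders = [{"name": "v", "price": 123, "location": "Mars"},
          {"name": "x", "price": 223, "location": "Mars"},
          {"name": "x", "price": 124, "location": "Mars"},
          {"name": "y", "price": 456, "location": "Mars"},
          {"name": "z", "price": 123, "location": "Mars"},
          {"name": "z", "price": 5623, "location": "Mars"}]

minimums = lowest_price_per_name(orders)
```

Like min in the groupby version, this keeps the first of several equally cheap orders for a name. It returns names in the order they first appear in the input rather than in sorted order.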
