Filter a list of dictionaries based on two keys

Question

import csv

with open('test.csv', newline='') as f:
    list_of_dicts = [dict(row) for row in csv.DictReader(f, skipinitialspace=True)]

Hello, I have a CSV file that I read into a list of dictionaries. I want to deduplicate the list on 'ASIN': when several rows share the same ASIN, keep only the one with the lowest 'Merchant_1_Price'. Not every row has a duplicate, so rows with a unique ASIN should be kept as they are. The result should go into a new list. Here is a sample of the list:

{'Product Name': 'NFL Buffalo Bills Bedding Set, Twin', 'Amazon Price': '84.99', 'ASIN': 'B004B3M5UU', 'Merchant_1': 'Homedepot', 'Merchant_1_Price': '72.65', 'Merchant_1_Stock': 'False', 'Merchant_1_Link': 'https://www.homedepot.com/p/Jaguars-2-PIECE-Draft-Multi-Twin-Comforter-Set-1NFL862000014RET/303181069', 'Amazon Image': '=IMAGE("{temp}",4,100,100)', 'Merchant_1_Image': '=IMAGE("{temp}",4,100,100)'}
{'Product Name': 'NFL Buffalo Bills Bedding Set, Twin', 'Amazon Price': '84.99', 'ASIN': 'B004B3M5UU', 'Merchant_1': 'Overstock', 'Merchant_1_Price': '61.64', 'Merchant_1_Stock': 'False', 'Merchant_1_Link': 'https://www.overstock.com/Bedding-Bath/The-Northwest-Company-NFL-Buffalo-Bills-Draft-Twin-2-piece-Comforter-Set/13330480/product.html', 'Amazon Image': '=IMAGE("{temp}",4,100,100)', 'Merchant_1_Image': '=IMAGE("{temp}",4,100,100)'}
{'Product Name': 'EGO Power+ HT2400 24-Inch 56-Volt Lithium-ion Cordless Hedge Trimmer - Battery and Charger Not Included', 'Amazon Price': '129.0', 'ASIN': 'B00N0A4S1O', 'Merchant_1': 'Homedepot', 'Merchant_1_Price': '129.00', 'Merchant_1_Stock': 'True', 'Merchant_1_Link': 'https://www.homedepot.com/p/EGO-24-in-56-Volt-Lithium-Ion-Cordless-Hedge-Trimmer-Battery-and-Charger-Not-Included-HT2400/205163108', 'Amazon Image': '=IMAGE("{temp}",4,100,100)', 'Merchant_1_Image': '=IMAGE("{temp}",4,100,100)'}

I tried several variations of two nested for loops, but I can't seem to find the correct logic.

Thanks for your help.

Recommended answer

The easiest way to deduplicate your list of dicts is to build a dictionary keyed by the field that should be unique, which in this case is 'ASIN'. When you find a duplicate, keep whichever item has the lower 'Merchant_1_Price':

by_asin = {}  # maps each ASIN to the cheapest item seen so far
for item in list_of_dicts:
    asin = item['ASIN']
    if (
        asin not in by_asin or
        # prices are strings read from the CSV, so compare them as floats
        float(item['Merchant_1_Price']) < float(by_asin[asin]['Merchant_1_Price'])
    ):
        by_asin[asin] = item

deduplicated_list_of_dicts = list(by_asin.values())

In the loop, we first extract the ASIN from the current item, since we use it several times. Then we check two things: whether that ASIN is not yet in the by_asin dictionary, or, if it is already there, whether the new item's price is lower than the price of the item stored for it. In either case we put the new item into by_asin, replacing the previous value if there was one. The prices are converted with float() before comparing because the CSV reader returns every field as a string.
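To see the approach end to end, here is a minimal sketch that wraps the same loop in a function and runs it on trimmed-down versions of the sample rows from the question. The function name dedupe_by_lowest_price and its key and price_field parameters are illustrative choices, not part of the original answer, and the sketch assumes every row has a price that float() can parse.

```python
def dedupe_by_lowest_price(rows, key="ASIN", price_field="Merchant_1_Price"):
    """Keep one row per `key` value: the row with the lowest numeric price.

    Hypothetical helper for illustration; assumes every row contains
    `key` and a `price_field` value that float() can parse.
    """
    best = {}
    for row in rows:
        k = row[key]
        # Prices arrive from csv.DictReader as strings, so compare as floats.
        if k not in best or float(row[price_field]) < float(best[k][price_field]):
            best[k] = row
    return list(best.values())


# Trimmed-down versions of the sample rows from the question.
sample = [
    {"ASIN": "B004B3M5UU", "Merchant_1": "Homedepot", "Merchant_1_Price": "72.65"},
    {"ASIN": "B004B3M5UU", "Merchant_1": "Overstock", "Merchant_1_Price": "61.64"},
    {"ASIN": "B00N0A4S1O", "Merchant_1": "Homedepot", "Merchant_1_Price": "129.00"},
]

result = dedupe_by_lowest_price(sample)
for row in result:
    print(row["ASIN"], row["Merchant_1"], row["Merchant_1_Price"])
# B004B3M5UU Overstock 61.64
# B00N0A4S1O Homedepot 129.00
```

The float() conversion matters here: compared as plain strings, '129.00' sorts before '72.65' because '1' comes before '7', so a string comparison would pick the wrong row whenever the prices have different numbers of digits.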
