How to identify "keys" of a tuple/list of 3-item tuples?

Question

So, given a table of revenue values:

A key point to note (and the core of my question) is that the brand name will almost always, but not always, contain the corresponding product name. In the case of the last Banana entry, it doesn't.

I will extract a dict of Brand<->Revenue pairs, first accounting for those brands that have multiple entries and summing in those cases, using the approach described here. So:

revenuePerBrandDict = {}
brandRevenueTuples = []
# Pair each brand cell with the revenue cell from the same row,
# ignoring the first (zeroth) and last row of both columns
for brand, revenue in zip(ourTab.columns[1][1:-1], ourTab.columns[3][1:-1]):
    brandRevenueTuples.append((brand.value, round(revenue.value, 2)))
# Sum the revenue of brands that have multiple entries
for key, value in brandRevenueTuples:
    revenuePerBrandDict[key] = revenuePerBrandDict.get(key, 0) + value

I will then cross-reference the keys and values in this dict with each of the expense dicts (dict of banana expenses, dict of kiwi expenses, etc.), and subtract expenses from revenue, item by item. These dicts will be extracted from the banana table, kiwi table, etc., which look like this:

If the brand name always contained the product name in the revenue table, then to compile the right set of revenue values for comparison with the Banana expenses dict, for example, I would just extract all the brands whose name contains 'Banana' and, for keys that match those in the Banana expenses dict, perform the subtraction on their values.

But it doesn't, so I need another way of knowing that, in the Revenue dict, 'OtherBrand' is a Banana. (In the Banana dict, I already know it is a Banana, because it came from the Banana table.) Instead of extracting a dict of Brand<->Revenue pairs, I could extract a list or tuple of (Product, Brand, Revenue) tuples, which gives us the additional information provided by the Product column. But since a tuple doesn't have the concept of a key, how do I iterate across this new collection, extracting revenue per tuple in the desired way (i.e. recognising that OtherBrand is a Banana, etc.)?

Recommended answer

You can use the fruits as keys and group the brands:

from collections import defaultdict
import csv

with open("in.csv") as f:
    r = csv.reader(f)
    next(r) # skip header
    # fruits will be keys, values will be dicts
    # with brands as keys and running revenue totals as values
    d = defaultdict(lambda: defaultdict(int))
    for fruit, brand, rev in r:
        d[fruit][brand] += float(rev)

Output using your input:

from pprint import pprint as pp

pp(dict(d))
{'Apple': defaultdict(<type 'int'>, {'CrunchApple': 1.7}),
 'Banana': defaultdict(<type 'int'>, {'BananaBrand': 4.0, 'OtherBrand': 3.2}),
 'Kiwi': defaultdict(<type 'int'>, {'NZKiwi': 1.2}),
 'Pear': defaultdict(<type 'int'>, {'PearShaped': 6.2})}
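
If you already have the rows in memory as (Product, Brand, Revenue) tuples, as described in the question, the same grouping works without a CSV file. The following is only a minimal sketch: the `rows` list and the `group_revenue` helper are hypothetical names, and the sample values simply mirror the totals printed above rather than the original table.

```python
from collections import defaultdict

# Hypothetical sample rows of (Product, Brand, Revenue);
# the values mirror the totals shown in the output above.
rows = [
    ("Apple", "CrunchApple", 1.7),
    ("Banana", "BananaBrand", 4.0),
    ("Banana", "OtherBrand", 3.2),
    ("Kiwi", "NZKiwi", 1.2),
    ("Pear", "PearShaped", 6.2),
]


def group_revenue(rows):
    """Return {product: {brand: total revenue}} built from 3-item tuples."""
    grouped = defaultdict(lambda: defaultdict(float))
    # Tuple unpacking gives each position a name, so the product
    # can serve as the outer key and the brand as the inner key.
    for product, brand, revenue in rows:
        grouped[product][brand] += revenue
    # Convert to plain dicts so missing keys raise KeyError later on
    return {product: dict(brands) for product, brands in grouped.items()}


revenue_by_product = group_revenue(rows)
print(revenue_by_product["Banana"])
```

Because the product is the outer key, 'OtherBrand' ends up under 'Banana' even though its name does not contain the product name.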

You can then subtract the expenses using the keys.
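
As a rough sketch of that subtraction, assuming the expenses are also available as one dict per product keyed by brand (the expense figures, the `expenses_by_product` layout and the `profit_per_brand` helper below are all made up for illustration):

```python
# Hypothetical inputs: revenue grouped by product, plus one
# expenses dict per product (e.g. built from the Banana table).
revenue_by_product = {
    "Banana": {"BananaBrand": 4.0, "OtherBrand": 3.2},
    "Kiwi": {"NZKiwi": 1.2},
}
expenses_by_product = {
    "Banana": {"BananaBrand": 1.5, "OtherBrand": 1.0},
    "Kiwi": {"NZKiwi": 0.5},
}


def profit_per_brand(revenue_by_product, expenses_by_product):
    """Subtract expenses from revenue, brand by brand, within each product."""
    profit = {}
    for product, brands in revenue_by_product.items():
        # A product with no expenses table is treated as having no expenses
        expenses = expenses_by_product.get(product, {})
        profit[product] = {
            brand: round(revenue - expenses.get(brand, 0), 2)
            for brand, revenue in brands.items()
        }
    return profit


profit = profit_per_brand(revenue_by_product, expenses_by_product)
print(profit["Banana"])
```

Treating a missing brand or product as zero expenses is a design choice in this sketch; you may prefer to raise an error instead if every revenue entry is expected to have a matching expense.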

Using pandas, life is even easier: you can group by and sum:

import pandas as pd

df = pd.read_csv("in.csv")

print(df.groupby(["A", "B"]).sum())  # pass a list: a tuple is treated as a single key

Output:

A      B               
Apple  CrunchApple  1.7
Banana BananaBrand  4.0
       OtherBrand   3.2
Kiwi   NZKiwi       1.2
Pear   PearShaped   6.2

Or get a group by fruit and brand:

groups = df.groupby(["A","B"])

print(groups.get_group(('Banana', 'OtherBrand')))

print(groups.get_group(('Banana', 'BananaBrand')))
