Pandas: find all combinations of rows under a budget

Problem description

I am trying to find every combination of rows in a DataFrame whose total price stays within a budget. Say I have a dataframe like this:

import pandas as pd

data = [['Bread', 9, 'Food'], ['Shoes', 20, 'Clothes'], ['Shirt', 15, 'Clothes'], ['Milk', 5, 'Drink'], ['Cereal', 8, 'Food'], ['Chips', 10, 'Food'], ['Beer', 15, 'Drink'], ['Popcorn', 3, 'Food'], ['Ice Cream', 6, 'Food'], ['Soda', 4, 'Drink']]
df = pd.DataFrame(data, columns = ['Item', 'Price', 'Type'])
df

Data:

Item       Price  Type
Bread      9      Food
Shoes      20     Clothes
Shirt      15     Clothes
Milk       5      Drink
Cereal     8      Food
Chips      10     Food
Beer       15     Drink
Popcorn    3      Food
Ice Cream  6      Food
Soda       4      Drink

I want to find every combination that I could purchase within a specific budget, say $35 for this example, while getting only one item of each type. I'd like the result to be a new dataframe with one row per valid combination and each item in its own column.

I tried to do this with itertools.product, which can combine and add columns, but what I really need is to combine and sum one specific column based on the values in another column. I'm a bit stumped now.

Thanks for your help!

Recommended answer

Here is one way, using itertools and pd.concat:

from itertools import chain, combinations

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

df_groups = pd.concat([df.reindex(l).assign(grp=n) for n, l in 
                       enumerate(powerset(df.index)) 
                       if (df.loc[l, 'Price'].sum() <= 35)])

This outputs a single dataframe containing every group of products that meets the $35 condition:

          Item  Price     Type  grp
0       Bread      9     Food    1
1       Shoes     20  Clothes    2
2       Shirt     15  Clothes    3
3        Milk      5    Drink    4
4      Cereal      8     Food    5
..        ...    ...      ...  ...
3        Milk      5    Drink  752
4      Cereal      8     Food  752
7     Popcorn      3     Food  752
8   Ice Cream      6     Food  752
9        Soda      4    Drink  752

How many combinations meet the $35 budget?

df_groups['grp'].nunique()

Output:

258

Details:

A couple of tricks are used here. First, powerset is applied to the dataframe's index to generate every possible group of rows (items). Next, enumerate gives each group a number, and assign writes that number into a new grp column on the group's rows. Only groups whose Price sum is at most 35 are kept, and pd.concat stacks them into one dataframe.
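The mechanism described above can be sketched on a much smaller frame. This is only an illustration: the three-row frame `small`, the budget of 8, and the names `parts`, `small_groups`, and `totals` are assumptions made for this example and are not part of the original answer. The subsets are converted to lists before indexing, and the empty subset is skipped explicitly.

```python
from itertools import chain, combinations

import pandas as pd


def powerset(iterable):
    """Yield every subset of `iterable` as a tuple, smallest subsets first."""
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))


# Hypothetical three-row frame, smaller than the one in the question.
small = pd.DataFrame({"Item": ["Milk", "Popcorn", "Soda"], "Price": [5, 3, 4]})
budget = 8  # hypothetical budget for this toy example

parts = []
for n, labels in enumerate(powerset(small.index)):
    if not labels:  # skip the empty subset (n == 0)
        continue
    rows = small.loc[list(labels)]  # select this subset's rows by index label
    if rows["Price"].sum() <= budget:
        parts.append(rows.assign(grp=n))  # tag every row with the subset number

small_groups = pd.concat(parts)

# Total price per surviving group; each total should be within the budget.
totals = small_groups.groupby("grp")["Price"].sum()
print(totals)
```

Because the group number comes from enumerate over all subsets, the surviving grp values have gaps wherever a subset was over budget, which is why nunique rather than the largest grp value is used to count groups.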

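To restrict each group to at most one item per type, add a check that every Type value in the group occurs exactly once:

df_groups = pd.concat([df.reindex(l).assign(grp=n) for n, l in 
                       enumerate(powerset(df.index)) 
                       if ((df.loc[l, 'Price'].sum() <= 35) & 
                           (df.loc[l, 'Type'].value_counts()==1).all())])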

How many groups?

df_groups['grp'].nunique()

Output:

62

To get exactly one item of each type:

df_groups = pd.concat([df.reindex(l).assign(grp=n) for n, l in 
                       enumerate(powerset(df.index)) 
                       if ((df.loc[l, 'Price'].sum() <= 35) & 
                           (df.loc[l, 'Type'].value_counts()==1).all()&
                           (len(df.loc[l, 'Type']) == 3))])

How many groups?

df_groups['grp'].nunique()

Output:

21
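The question also asks for one row per valid combination with each item in its own column, which the long-format df_groups does not provide directly. A minimal sketch of one way to get that shape for the exactly-one-of-each-type case follows. It swaps the powerset approach for itertools.product over the items of each type, so it is an alternative technique rather than the answer's method. The names `budget`, `by_type`, `rows`, `wide`, and the `Total` column are assumptions introduced for this example.

```python
from itertools import product

import pandas as pd

data = [['Bread', 9, 'Food'], ['Shoes', 20, 'Clothes'], ['Shirt', 15, 'Clothes'],
        ['Milk', 5, 'Drink'], ['Cereal', 8, 'Food'], ['Chips', 10, 'Food'],
        ['Beer', 15, 'Drink'], ['Popcorn', 3, 'Food'], ['Ice Cream', 6, 'Food'],
        ['Soda', 4, 'Drink']]
df = pd.DataFrame(data, columns=['Item', 'Price', 'Type'])
budget = 35

# One list of (item, price) pairs per type, keyed by the type name.
by_type = {t: list(zip(g['Item'], g['Price'])) for t, g in df.groupby('Type')}

rows = []
# product picks exactly one (item, price) pair from each type's list.
for combo in product(*by_type.values()):
    total = sum(price for _, price in combo)
    if total <= budget:
        row = {t: item for t, (item, _) in zip(by_type, combo)}
        row['Total'] = total
        rows.append(row)

# One row per valid combination, one column per type, plus the total price.
wide = pd.DataFrame(rows)
print(len(wide))  # 21, the same count as the exactly-one-of-each-type filter
print(wide.head())
```

Since product only ever builds combinations with one item per type, it examines 2 * 3 * 5 = 30 candidates here instead of all 1024 subsets of the index.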
