将Pandas DataFrame.groupby()字典到具有多列的值的字典中 [英] Pandas DataFrame.groupby() to dictionary with multiple columns for value

查看:707
本文介绍了将Pandas DataFrame.groupby()字典到具有多列的值的字典中的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

type(Table)
pandas.core.frame.DataFrame

Table
======= ======= =======
Column1 Column2 Column3
0       23      1
1       5       2
1       2       3
1       19      5
2       56      1
2       22      2
3       2       4
3       14      5
4       59      1
5       44      1
5       1       2
5       87      3

对于熟悉熊猫的任何人,我将如何使用.groupby()方法构建多值字典?

For anyone familliar with pandas how would I build a multivalue dictionary with the .groupby() method?

我希望输出类似于以下格式:

I would like an output to resemble this format:

{
    0: [(23,1)]
    1: [(5,  2), (2, 3), (19, 5)]
    # etc...
    }

其中,Col1值表示为键,而相应的Col2Col3是打包为每个Col1键的数组的元组.

where Col1 values are represented as keys and the corresponding Col2 and Col3 are tuples packed into an array for each Col1 key.

我的语法仅用于将一列合并到.groupby():

My syntax works for pooling only one column into the .groupby():

Table.groupby('Column1')['Column2'].apply(list).to_dict()
# Result as expected
{
    0: [23], 
    1: [5, 2, 19], 
    2: [56, 22], 
    3: [2, 14], 
    4: [59], 
    5: [44, 1, 87]
}

但是,为索引指定多个值会导致返回该值的列名:

However specifying multiple values for the indices results in returning column names for the value :

Table.groupby('Column1')[('Column2', 'Column3')].apply(list).to_dict()
# Result has column namespace as array value
{
    0: ['Column2', 'Column3'],
    1: ['Column2', 'Column3'],
    2: ['Column2', 'Column3'],
    3: ['Column2', 'Column3'],
    4: ['Column2', 'Column3'],
    5: ['Column2', 'Column3']
 }

如何返回值数组中的元组列表?

How would I return a list of tuples in the value array?

推荐答案

自定义您在apply中使用的函数,以便它返回每个组的列表列表:

Customize the function you use in apply so it returns a list of lists for each group:

df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: g.values.tolist()).to_dict()
# {0: [[23, 1]], 
#  1: [[5, 2], [2, 3], [19, 5]], 
#  2: [[56, 1], [22, 2]], 
#  3: [[2, 4], [14, 5]], 
#  4: [[59, 1]], 
#  5: [[44, 1], [1, 2], [87, 3]]}

如果您需要显式的元组列表,请使用list(map(tuple, ...))进行转换:

If you need a list of tuples explicitly, use list(map(tuple, ...)) to convert:

df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()
# {0: [(23, 1)], 
#  1: [(5, 2), (2, 3), (19, 5)], 
#  2: [(56, 1), (22, 2)], 
#  3: [(2, 4), (14, 5)], 
#  4: [(59, 1)], 
#  5: [(44, 1), (1, 2), (87, 3)]}

这篇关于将Pandas DataFrame.groupby()字典到具有多列的值的字典中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆