Pandas DataFrame.groupby() to dictionary with multiple columns for value
Problem description
type(Table)
pandas.core.frame.DataFrame
Table
======= ======= =======
Column1 Column2 Column3
0 23 1
1 5 2
1 2 3
1 19 5
2 56 1
2 22 2
3 2 4
3 14 5
4 59 1
5 44 1
5 1 2
5 87 3
For anyone familiar with pandas, how would I build a multivalue dictionary with the .groupby() method?
I would like the output to resemble this format:
{
0: [(23,1)]
1: [(5, 2), (2, 3), (19, 5)]
# etc...
}
where the Column1 values are the keys, and the corresponding Column2 and Column3 values are packed as tuples into a list for each Column1 key.
My syntax works only when a single column is selected after the .groupby():
Table.groupby('Column1')['Column2'].apply(list).to_dict()
# Result as expected
{
0: [23],
1: [5, 2, 19],
2: [56, 22],
3: [2, 14],
4: [59],
5: [44, 1, 87]
}
However, selecting multiple columns results in the column names being returned as the values:
Table.groupby('Column1')[('Column2', 'Column3')].apply(list).to_dict()
# Result has column namespace as array value
{
0: ['Column2', 'Column3'],
1: ['Column2', 'Column3'],
2: ['Column2', 'Column3'],
3: ['Column2', 'Column3'],
4: ['Column2', 'Column3'],
5: ['Column2', 'Column3']
}
How would I return a list of tuples as the value for each key?
Recommended answer
Customize the function you pass to apply so it returns a list of lists for each group (df below is the DataFrame called Table in the question):
df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: g.values.tolist()).to_dict()
# {0: [[23, 1]],
# 1: [[5, 2], [2, 3], [19, 5]],
# 2: [[56, 1], [22, 2]],
# 3: [[2, 4], [14, 5]],
# 4: [[59, 1]],
# 5: [[44, 1], [1, 2], [87, 3]]}
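The reason apply(list) returned column names in the question is that iterating over a DataFrame yields its column labels rather than its rows, whereas .values.tolist() walks the rows. A minimal sketch of that difference, using a small illustrative frame (a subset of the question's table; the names sub, labels, and rows are only for this example):

```python
import pandas as pd

# Small frame with the same column names as the question (first three rows only).
df = pd.DataFrame({
    "Column1": [0, 1, 1],
    "Column2": [23, 5, 2],
    "Column3": [1, 2, 3],
})

sub = df[["Column2", "Column3"]]

# Iterating a DataFrame yields its column labels, not its rows,
# so list() on a two-column group gives back the two column names.
labels = list(sub)

# .values is a 2-D NumPy array; tolist() produces one inner list per row.
rows = sub.values.tolist()

print(labels)
print(rows)
```

With a single selected column each group is a Series, and iterating a Series yields its values, which is why the one-column version in the question behaved as expected.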
If you need a list of tuples explicitly, use list(map(tuple, ...)) to convert:
df.groupby('Column1')[['Column2', 'Column3']].apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()
# {0: [(23, 1)],
# 1: [(5, 2), (2, 3), (19, 5)],
# 2: [(56, 1), (22, 2)],
# 3: [(2, 4), (14, 5)],
# 4: [(59, 1)],
# 5: [(44, 1), (1, 2), (87, 3)]}
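For completeness, here is a self-contained sketch that rebuilds the table from the question and produces the same dictionary of tuples. It is an alternative to the apply call above, not the answer's own method: it iterates over the groups and uses DataFrame.itertuples(index=False, name=None), which yields plain tuples per row. The variable name result is chosen for this example.

```python
import pandas as pd

# Rebuild the table from the question.
Table = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column2": [23, 5, 2, 19, 56, 22, 2, 14, 59, 44, 1, 87],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Iterate over (key, sub-frame) pairs. Selecting the two value columns
# inside the loop keeps the grouping column out of each tuple.
result = {
    key: list(group[["Column2", "Column3"]].itertuples(index=False, name=None))
    for key, group in Table.groupby("Column1")
}

print(result)
```

Grouping by the string "Column1" rather than a one-element list keeps the dictionary keys as scalars instead of one-element tuples.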