How to combine ';'.join and lambda x: x.tolist() inside a groupby.agg() function?


Problem description

Update below!

I am trying to merge and sort a list of IDs by their associated unique Name_ID, with the IDs separated by semicolons. For example:

Name_ID Adress_ID            Name_ID Adress_ID
Name1   5875383              Name1   5875383; 5901847
Name1   5901847              Name2   5285200
Name2   5285200      to      Name3   2342345; 6463736
Name3   2342345
Name3   6463736

This is my current code:

from pathlib import Path

import pandas as pd

origin_file_path = Path("Folder/table.xlsx")
dest_file_path = Path("Folder/table_sorted.xlsx")

table = pd.read_excel(origin_file_path)
df1 = pd.DataFrame(table)

# Collect all Adress_ID values of each Name_ID into a Python list
df1 = df1.groupby('Name_ID').agg(lambda x: x.tolist())

df1.to_excel(dest_file_path, sheet_name="Adress_IDs")

But it exports it to the Excel file like this:

Name_ID Adress_ID
Name1   [5875383, 5901847]

Can someone tell me the best way to get rid of the list format and separate the values with semicolons instead of commas?

Update:

The user Jezrael linked me to this thread, but I can't seem to combine ';'.join with lambda x: x.tolist().

df1 = df1.groupby('Kartenname').agg(';'.join, lambda x: x.tolist())

Produces TypeError: join() takes exactly one argument (2 given)

df1 = df1.groupby('Kartenname').agg(lambda x: x.tolist(), ';'.join)

Produces TypeError: <lambda>() takes 1 positional argument but 2 were given.

I also tried other combinations, but none of them even execute properly. Getting rid of the lambda function isn't an option, because then it just pastes Name_ID Adress_ID a thousand times instead of the correct name and IDs.

Recommended answer

You can pass the agg function a list of tuples, each pairing a new column name with an aggregate function:

df['Adress_ID'] = df['Adress_ID'].astype(str)
df1 = df.groupby('Name_ID')['Adress_ID'].agg([('a', ';'.join),
                                              ('b',  lambda x: x.tolist())])

print (df1)
                       a                   b
Name_ID                                     
Name1    5875383;5901847  [5875383, 5901847]
Name2            5285200           [5285200]
Name3    2342345;6463736  [2342345, 6463736]
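
As a possible alternative to the list of tuples, the same result can be sketched with keyword-style named aggregation, assuming pandas 0.25 or newer. The sample frame below is rebuilt by hand from the data in the question; the column names a and b are the same placeholders used above.

```python
import pandas as pd

# Sample data copied from the question (built by hand for illustration)
df = pd.DataFrame({
    "Name_ID": ["Name1", "Name1", "Name2", "Name3", "Name3"],
    "Adress_ID": [5875383, 5901847, 5285200, 2342345, 6463736],
})

# str.join only accepts strings, so cast the IDs first
df["Adress_ID"] = df["Adress_ID"].astype(str)

# Named aggregation: keyword = output column name, value = aggregate function
df1 = df.groupby("Name_ID")["Adress_ID"].agg(
    a=";".join,
    b=lambda x: x.tolist(),
)

print(df1)
```

Each keyword becomes one output column, so the two functions are applied side by side rather than chained.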

If you pass only the aggregate functions in a list (no tuples), you get default column names:

df2 = df.groupby('Name_ID')['Adress_ID'].agg([ ';'.join,lambda x: x.tolist()])

print (df2)
                    join          <lambda_0>
Name_ID                                     
Name1    5875383;5901847  [5875383, 5901847]
Name2            5285200           [5285200]
Name3    2342345;6463736  [2342345, 6463736]
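
The two attempts in the question fail because agg forwards any extra positional arguments to the first function instead of chaining the functions, so ';'.join ends up being called with two arguments. If only the semicolon-separated column from the question is needed, a single function that casts and joins in one step may be enough. The following is a minimal sketch under that assumption; the "; " separator follows the desired output in the question, and the sample frame is rebuilt by hand.

```python
import pandas as pd

# Sample data copied from the question (built by hand for illustration)
df = pd.DataFrame({
    "Name_ID": ["Name1", "Name1", "Name2", "Name3", "Name3"],
    "Adress_ID": [5875383, 5901847, 5285200, 2342345, 6463736],
})

# One function per group: cast the IDs to str, then join them with "; "
result = (
    df.groupby("Name_ID")["Adress_ID"]
      .agg(lambda ids: "; ".join(ids.astype(str)))
      .reset_index()
)

print(result)

# Writing to Excel would then work as in the question, for example:
# result.to_excel("Folder/table_sorted.xlsx", sheet_name="Adress_IDs", index=False)
```

Casting inside the lambda leaves the original Adress_ID column as integers, which can be convenient if the frame is used for anything else afterwards.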
