Printing all rows in each level of MultiIndex pd.DataFrame in one row

Problem description

I have a dataframe that became a MultiIndex dataframe after a groupby() and aggregation.

In[1]:

import pandas as pd

mydata = [['Team1', 'Player1', 'idTrip13', 133], ['Team2', 'Player333', 'idTrip10', 18373],
          ['Team3', 'Player22', 'idTrip12', 17338899], ['Team2', 'Player293', 'idTrip02', 17656],
          ['Team3', 'Player20', 'idTrip11', 1883], ['Team1', 'Player1', 'idTrip19', 19393]]

df = pd.DataFrame(mydata, columns = ['team', 'player', 'trips', 'time'])
df
Out[1]:
     team    player       trips      time
0   Team1   Player1     idTrip13    133
1   Team2   Player333   idTrip10    18373
2   Team3   Player22    idTrip12    17338899
3   Team2   Player293   idTrip02    17656
4   Team3   Player20    idTrip11    1883
5   Team1   Player1     idTrip19    19393

For each player on a team, find the total number of trips and the total time spent traveling. This returns a MultiIndex dataframe.

# 'trips' is listed first so the column order matches the output below
player_total = df.groupby(by = ['team', 'player']).agg({'trips' : 'count', 'time' : 'sum'})

player_total
Out[4]:
                 trips  time
team    player      
Team1   Player1     2   19526
Team2   Player293   1   17656
        Player333   1   18373
Team3   Player20    1   1883
        Player22    1   17338899
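The column order of the aggregated frame matters later, because the printing code relies on it. As a minimal sketch, assuming pandas 0.25 or newer (where named aggregation is available), the same aggregation can be written so that the output columns follow the keyword order explicitly. This is an alternative spelling, not the code used in the question.

```python
import pandas as pd

# Same sample data as in the question.
mydata = [
    ['Team1', 'Player1', 'idTrip13', 133],
    ['Team2', 'Player333', 'idTrip10', 18373],
    ['Team3', 'Player22', 'idTrip12', 17338899],
    ['Team2', 'Player293', 'idTrip02', 17656],
    ['Team3', 'Player20', 'idTrip11', 1883],
    ['Team1', 'Player1', 'idTrip19', 19393],
]
df = pd.DataFrame(mydata, columns=['team', 'player', 'trips', 'time'])

# Named aggregation: each keyword is an output column name, mapped to
# (input column, aggregation). Output columns follow the keyword order.
player_total = df.groupby(['team', 'player']).agg(
    trips=('trips', 'count'),
    time=('time', 'sum'),
)
print(player_total)
```

The result has the same (team, player) MultiIndex as before, with `trips` followed by `time`.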

Desired output: I want to print the output so that all players on a team are on the same line.

Team1   Player1 : 2 trips : 19526;
Team2   Player293 : 1 : 17656; Player333 : 1 : 18373;
Team3   Player22 : 1 trip : 17338899; Player20 : 1 trip : 1883

This question was flagged as too broad, so I took the liberty of splitting the pandas dataframe creation and aggregation from the output printing.

Recommended answer


  1. Iterate over the 0th level (team) using groupby():

for team, df2 in player_total.groupby(level = 0):

For example, on the second iteration it yields the dataframe for Team2:

                trips   time
team  player              
Team2 Player293     1  17656
      Player333     1  18373
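To see what this loop yields on every iteration, here is a small self-contained sketch. It rebuilds `player_total` by hand from the values shown in Out[4] (an assumption made only so that the snippet runs on its own), then collects the groups into a dictionary named `groups`, which is a name introduced here for illustration.

```python
import pandas as pd

# Rebuild the aggregated frame from the values shown in the question.
index = pd.MultiIndex.from_tuples(
    [('Team1', 'Player1'), ('Team2', 'Player293'), ('Team2', 'Player333'),
     ('Team3', 'Player20'), ('Team3', 'Player22')],
    names=['team', 'player'],
)
player_total = pd.DataFrame(
    {'trips': [2, 1, 1, 1, 1],
     'time': [19526, 17656, 18373, 1883, 17338899]},
    index=index,
)

# groupby(level=0) yields (team, sub-frame) pairs in sorted key order.
# Each sub-frame still carries both index levels (team and player).
groups = {team: df2 for team, df2 in player_total.groupby(level=0)}

for team, df2 in groups.items():
    print(team, list(df2.index.get_level_values('player')))
```

Because each sub-frame keeps the team level in its index, the next step drops that level before printing.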


  • Use reset_index() to drop the team index level and turn the player index level into a regular column of the dataframe.

    >>> team_df = df2.reset_index(level = 0, drop = True).reset_index()
    >>> team_df
          player  trips   time
    0  Player293     1  17656
    1  Player333     1  18373
    


  • Convert that dataframe into a list of lists so we can iterate through each player.

    >>> team_df.values.tolist()
    [['Player293', 1, 17656], ['Player333', 1, 18373]]
    


  • When printing, map the integers to strings and use the end parameter of the print function to print a semicolon instead of a newline after each player.

    >>> for player in team_df.values.tolist():
    ...     print(': '.join(map(str, player)), end = '; ')
    ...
    Player293: 1: 17656; Player333: 1: 18373; 
    


  • Full solution:

    from __future__ import print_function
    
    #iterate through each team
    for team, df2 in player_total.groupby(level = 0):
        print(team, end = '\t')
        #drop the 0th level (team) and move the first level (player) into a regular column
        team_df = df2.reset_index(level = 0, drop = True).reset_index()
        #iterate through each player on the team and print player, trip, and time
        for player in team_df.values.tolist():
            print(': '.join(map(str, player)), end = '; ')
        #After printing all players insert a new line
        print()
    

    Output:

    Team1   Player1: 2: 19526; 
    Team2   Player293: 1: 17656; Player333: 1: 18373; 
    Team3   Player20: 1: 1883; Player22: 1: 17338899; 
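As a variation on the solution above, the lines can be built as strings first and printed afterwards, which avoids the trailing semicolon and makes the result easy to check. This is only a sketch under stated assumptions: it targets Python 3, it rebuilds `player_total` from the values in the question so that it runs on its own, and it uses `droplevel` (available in pandas 0.24 and newer) rather than the `reset_index` calls from the answer. The names `lines` and `per_team` are introduced here for illustration.

```python
import pandas as pd

# Rebuild the aggregated frame from the values shown in the question.
index = pd.MultiIndex.from_tuples(
    [('Team1', 'Player1'), ('Team2', 'Player293'), ('Team2', 'Player333'),
     ('Team3', 'Player20'), ('Team3', 'Player22')],
    names=['team', 'player'],
)
player_total = pd.DataFrame(
    {'trips': [2, 1, 1, 1, 1],
     'time': [19526, 17656, 18373, 1883, 17338899]},
    index=index,
)

lines = []
for team, df2 in player_total.groupby(level=0):
    # Drop the team level so that the player name is the only index.
    per_team = df2.droplevel(0)
    # itertuples() yields (index, trips, time) for each player row.
    parts = [
        '{}: {}: {}'.format(player, trips, time)
        for player, trips, time in per_team.itertuples()
    ]
    # One line per team: team name, a tab, then all players joined by '; '.
    lines.append(team + '\t' + '; '.join(parts))

print('\n'.join(lines))
```

Joining with `'; '` places the separator only between players, so each line ends with the last player's time instead of a dangling semicolon.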
    
