Pandas merge and groupby


Problem description

I have two pandas DataFrames that look like the ones below.

Data Frame 1:  

Section Chainage    Frame  
R125R002    10.133  1  
R125R002    10.138  2  
R125R002    10.143  3  
R125R002    10.148  4  
R125R002    10.153  5  

Data Frame 2:

Section Chainage    1   2   3   4   5   6   7   8   
R125R002    10.133  0   0   1   0   0   0   0   0     
R125R002    10.134  0   0   1   0   0   0   0   0     
R125R002    10.135  0   0   1   0   0   0   0   0     
R125R002    10.136  0   0   1   0   0   0   0   0     
R125R002    10.137  0   0   1   0   0   0   0   0     
R125R002    10.138  0   0   1   0   0   0   0   0     
R125R002    10.139  0   0   1   0   0   0   0   0     
R125R002    10.14   0   0   1   0   0   0   0   0     
R125R002    10.141  0   0   1   0   0   0   0   0     
R125R002    10.142  0   0   1   0   0   0   0   0     
R125R002    10.143  0   0   1   0   0   0   0   0     
R125R002    10.144  0   0   1   0   0   0   0   0     
R125R002    10.145  0   0   1   0   0   0   0   0     
R125R002    10.146  0   0   1   0   0   0   0   0     
R125R002    10.147  0   0   1   0   0   0   0   0     
R125R002    10.148  0   0   1   0   0   0   0   0     
R125R002    10.149  0   0   1   0   0   0   0   0     
R125R002    10.15   0   0   1   0   0   0   0   0     
R125R002    10.151  0   0   1   0   0   0   0   0     
R125R002    10.152  0   0   1   0   0   0   0   0     
R125R002    10.153  0   0   1   0   0   0   0   0   

Required output DataFrame:

Section Chainage Frame  1   2   3   4   5   6   7   8   
R125R002    10.133  1   0   0   1   0   0   0   0   0     
R125R002    10.138  2   0   0   1   0   0   0   0   0     
R125R002    10.143  3   0   0   1   0   0   0   0   0     
R125R002    10.148  4   0   0   1   0   0   0   0   0     
R125R002    10.153  5   0   0   1   0   0   0   0   0     

DataFrame 2 is sampled at 1 m intervals, while DataFrame 1 is sampled at 5 m intervals. I would like to merge DataFrame 2 into DataFrame 1 and apply a group-by. The aggregation for column 1 is sum, for column 2 max, and for columns 3 to 8 mean.

In SQL, I would join the two frames on Section, apply a BETWEEN condition on the chainage, and then add a GROUP BY.
Is there any way to achieve this in pandas?

Recommended answer

You can first aggregate every 5 rows, with the aggregation function for each column defined in a dictionary:

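import numpy as np
# Row position // 5 labels each block of 5 consecutive rows as one group;
# columns '3' to '7' can be added to d with 'mean' in the same way.
d = {'Section':'first','Chainage':'first','1':'sum','2':'max', '8':'mean'}
df22 = df2.groupby([np.arange(len(df2.index)) // 5], as_index=False).agg(d)
print(df22)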
    Section  Chainage  1  2  8
0  R125R002    10.133  0  0  0
1  R125R002    10.138  0  0  0
2  R125R002    10.143  0  0  0
3  R125R002    10.148  0  0  0
4  R125R002    10.153  0  0  0

Details:

print (np.arange(len(df2.index)) // 5)
[0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4]
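
The dictionary in the answer lists only columns 1, 2 and 8. The following is a minimal sketch of the same positional grouping with every column from the question covered. It assumes the column labels are the strings '1' to '8' and that DataFrame 2 is sorted with exactly five rows per 5 m step. The frame is rebuilt from the sample data only so that the snippet runs on its own, and the names `agg` and `groups` are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the question's DataFrame 2: 21 rows at 1 m spacing.
# Dividing integers by 1000 gives the same floats as the literals 10.133, ...
df2 = pd.DataFrame({
    "Section": "R125R002",
    "Chainage": np.arange(10133, 10154) / 1000,
})
for col in "12345678":
    df2[col] = 1 if col == "3" else 0  # sample data: only column 3 holds 1

# One aggregation per column: sum for 1, max for 2, mean for 3 to 8.
agg = {"Section": "first", "Chainage": "first", "1": "sum", "2": "max"}
agg.update({str(i): "mean" for i in range(3, 9)})

# Row position // 5 puts every 5 consecutive rows into one group.
groups = np.arange(len(df2)) // 5
df22 = df2.groupby(groups).agg(agg)
print(df22)
```

The mean columns come back as floats, and the last group holds only the single row at chainage 10.153.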

Then merge the aggregated result with df1:

df = df1.merge(df22, on=['Section','Chainage'])
print (df)
    Section  Chainage  Frame  1  2  8
0  R125R002    10.133      1  0  0  0
1  R125R002    10.138      2  0  0  0
2  R125R002    10.143      3  0  0  0
3  R125R002    10.148      4  0  0  0
4  R125R002    10.153      5  0  0  0
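
The positional grouping above relies on DataFrame 2 having exactly five rows per frame. When the spacing is irregular, something closer to the SQL BETWEEN join from the question may be possible with `pd.merge_asof`. This is a hedged sketch, not the answer's method. It assumes that each 1 m row belongs to the nearest frame at or before its chainage (`direction="backward"`), that both frames are sorted by Chainage, and that the column labels are strings. The names `tagged` and `stats` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the question so the snippet is self-contained.
df1 = pd.DataFrame({
    "Section": "R125R002",
    "Chainage": np.arange(10133, 10154, 5) / 1000,
    "Frame": range(1, 6),
})
df2 = pd.DataFrame({
    "Section": "R125R002",
    "Chainage": np.arange(10133, 10154) / 1000,
})
for col in "12345678":
    df2[col] = 1 if col == "3" else 0

# Tag every 1 m row with the last frame whose chainage is <= the row's chainage.
tagged = pd.merge_asof(
    df2, df1[["Section", "Chainage", "Frame"]],
    on="Chainage", by="Section", direction="backward",
)

# Aggregate per frame: sum for 1, max for 2, mean for 3 to 8.
agg = {"1": "sum", "2": "max", **{str(i): "mean" for i in range(3, 9)}}
stats = tagged.groupby(["Section", "Frame"], as_index=False).agg(agg)

# Attach the per-frame statistics to DataFrame 1.
result = df1.merge(stats, on=["Section", "Frame"])
print(result)
```

Because rows are assigned by chainage rather than by position, this variant does not depend on how many 1 m rows fall between two frames.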
