Pandas merge and groupby
Problem description
I have two pandas DataFrames that look like this.
Data Frame 1:
Section Chainage Frame
R125R002 10.133 1
R125R002 10.138 2
R125R002 10.143 3
R125R002 10.148 4
R125R002 10.153 5
Data Frame 2:
Section Chainage 1 2 3 4 5 6 7 8
R125R002 10.133 0 0 1 0 0 0 0 0
R125R002 10.134 0 0 1 0 0 0 0 0
R125R002 10.135 0 0 1 0 0 0 0 0
R125R002 10.136 0 0 1 0 0 0 0 0
R125R002 10.137 0 0 1 0 0 0 0 0
R125R002 10.138 0 0 1 0 0 0 0 0
R125R002 10.139 0 0 1 0 0 0 0 0
R125R002 10.14 0 0 1 0 0 0 0 0
R125R002 10.141 0 0 1 0 0 0 0 0
R125R002 10.142 0 0 1 0 0 0 0 0
R125R002 10.143 0 0 1 0 0 0 0 0
R125R002 10.144 0 0 1 0 0 0 0 0
R125R002 10.145 0 0 1 0 0 0 0 0
R125R002 10.146 0 0 1 0 0 0 0 0
R125R002 10.147 0 0 1 0 0 0 0 0
R125R002 10.148 0 0 1 0 0 0 0 0
R125R002 10.149 0 0 1 0 0 0 0 0
R125R002 10.15 0 0 1 0 0 0 0 0
R125R002 10.151 0 0 1 0 0 0 0 0
R125R002 10.152 0 0 1 0 0 0 0 0
R125R002 10.153 0 0 1 0 0 0 0 0
Required output DataFrame:
Section Chainage Frame 1 2 3 4 5 6 7 8
R125R002 10.133 1 0 0 1 0 0 0 0 0
R125R002 10.138 2 0 0 1 0 0 0 0 0
R125R002 10.143 3 0 0 1 0 0 0 0 0
R125R002 10.148 4 0 0 1 0 0 0 0 0
R125R002 10.153 5 0 0 1 0 0 0 0 0
Data Frame 2 is sampled at 1 m intervals, while Data Frame 1 is sampled at 5 m intervals. I would like to merge Data Frame 2 into Data Frame 1 and apply a group-by: sum for column 1, max for column 2, and mean for columns 3 to 8.
In SQL, I would join the two frames on Section, apply a BETWEEN condition on Chainage, and then add a GROUP BY.
Is there a way to achieve this in pandas?
Recommended answer
You can first aggregate every 5 rows, with the aggregation function for each column defined in a dictionary:
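import numpy as np
# aggregation function per column; keys must match the column labels of df2
d = {'Section':'first','Chainage':'first','1':'sum','2':'max', '8':'mean'}
# integer-divide the row positions by 5 so each run of 5 rows shares one group key
df22 = df2.groupby([np.arange(len(df2.index)) // 5], as_index=False).agg(d)
print (df22)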
Section Chainage 1 2 8
0 R125R002 10.133 0 0 0
1 R125R002 10.138 0 0 0
2 R125R002 10.143 0 0 0
3 R125R002 10.148 0 0 0
4 R125R002 10.153 0 0 0
Details (the positional group key for each of the 21 rows):
print (np.arange(len(df2.index)) // 5)
[0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4]
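The dictionary above lists only columns 1, 2 and 8, while the question asks for the mean of columns 3 to 8. A minimal sketch of the same positional grouping with every column covered might look like the following. It assumes the column labels are the strings '1' to '8', and the df2 built here is a hypothetical stand-in that reproduces the sample data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2: 21 rows at 1 m (0.001) spacing,
# columns labelled with the strings '1'..'8' (an assumption).
chainage = np.round(10.133 + 0.001 * np.arange(21), 3)
df2 = pd.DataFrame({"Section": "R125R002", "Chainage": chainage})
for col in "12345678":
    df2[col] = 1 if col == "3" else 0

# sum for '1', max for '2', mean for '3'..'8', as described in the question
agg_spec = {"Section": "first", "Chainage": "first", "1": "sum", "2": "max"}
agg_spec.update({str(i): "mean" for i in range(3, 9)})

# Positional key: rows 0-4 -> group 0, rows 5-9 -> group 1, and so on
key = np.arange(len(df2)) // 5
df22 = df2.groupby(key).agg(agg_spec).reset_index(drop=True)
print(df22)
```

This relies on df2 being sorted by Chainage with exactly five rows per frame; if rows are missing or the spacing is irregular, the positional key no longer lines up with the frames.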
Then merge the result with df1:
df = df1.merge(df22, on=['Section','Chainage'])
print (df)
Section Chainage Frame 1 2 8
0 R125R002 10.133 1 0 0 0
1 R125R002 10.138 2 0 0 0
2 R125R002 10.143 3 0 0 0
3 R125R002 10.148 4 0 0 0
4 R125R002 10.153 5 0 0 0
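For something closer to the SQL "join on Section with a BETWEEN on Chainage, then GROUP BY" idea from the question, one possible alternative (not part of the answer above) is `pd.merge_asof`, which tags each 1 m row with the most recent frame start at or before it. The sketch below rebuilds the sample data as hypothetical stand-ins for df1 and df2, assumes string column labels '1' to '8', and uses `FrameStart` as an illustrative helper column name.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two frames in the question
df1 = pd.DataFrame({
    "Section": "R125R002",
    "Chainage": [10.133, 10.138, 10.143, 10.148, 10.153],
    "Frame": [1, 2, 3, 4, 5],
})
chainage = np.round(10.133 + 0.001 * np.arange(21), 3)
df2 = pd.DataFrame({"Section": "R125R002", "Chainage": chainage})
for col in "12345678":
    df2[col] = 1 if col == "3" else 0

# Tag every df2 row with the latest frame start <= its chainage,
# matching only within the same Section. Both sides must be sorted.
tagged = pd.merge_asof(
    df2.sort_values("Chainage"),
    df1.sort_values("Chainage").rename(columns={"Chainage": "FrameStart"}),
    left_on="Chainage",
    right_on="FrameStart",
    by="Section",
    direction="backward",
)

# sum for '1', max for '2', mean for '3'..'8'
agg_spec = {"1": "sum", "2": "max"}
agg_spec.update({str(i): "mean" for i in range(3, 9)})

out = (
    tagged.groupby(["Section", "FrameStart", "Frame"], as_index=False)
    .agg(agg_spec)
    .rename(columns={"FrameStart": "Chainage"})
)
print(out)
```

Unlike the positional key, this does not assume exactly five rows per frame, since each row is assigned by its chainage value rather than by its position.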