Python pandas dataframe: Find last occurrence of value less-than-or-equal-to current row

Question

I have two pandas dataframes:

df1:

   ksat  muacres  SAND  SILT  CLAY
     0     5326     0     0     0
   0.1     4346     0     0     0
   0.4     4146     0     0     0
   0.8     3476     0     0     0
   1.2     2006     0     0     0

and df2:

   PERCENTILE      ksat      b  theta
0           1  0.370684  11.55   46.8
1           2  0.558053  11.55   46.8
2           3  0.794836  10.39   46.8
3           4  0.962329  11.55   46.8
4           5  1.202368  10.39   46.8

I want to add a column 'st' to df1 where, for each row in df1, I find the first row in df2 whose ksat value is greater than or equal to the ksat value in df1, and take that row's PERCENTILE. For this example, the result would be:

df1:

   ksat  muacres  SAND  SILT  CLAY  st
     0     5326     0     0     0     1
   0.1     4346     0     0     0     1
   0.4     4146     0     0     0     2
   0.8     3476     0     0     0     4
   1.2     2006     0     0     0     5

Currently I am using a loop nested inside another loop, which is very inefficient. Is there a better way to do this in pandas?

Thanks!

Recommended answer

One way is to merge twice. First do an outer merge of df1's ksat column with just the ksat and PERCENTILE columns of df2, sorted by ksat, so that you can backward-fill the missing percentiles:

In [11]: merged = df1[['ksat']].merge(df2[['ksat', 'PERCENTILE']], how='outer', sort=True)

In [12]: merged
Out[12]:
       ksat  PERCENTILE
0  0.000000         NaN
1  0.100000         NaN
2  0.370684           1
3  0.400000         NaN
4  0.558053           2
5  0.794836           3
6  0.800000         NaN
7  0.962329           4
8  1.200000         NaN
9  1.202368           5

In [13]: merged.bfill()
Out[13]:
       ksat  PERCENTILE
0  0.000000           1
1  0.100000           1
2  0.370684           1
3  0.400000           2
4  0.558053           2
5  0.794836           3
6  0.800000           4
7  0.962329           4
8  1.200000           5
9  1.202368           5
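
To reproduce the two steps above outside an IPython session, here is a minimal self-contained sketch. The frames are retyped from the tables in the question, and the name `filled` is introduced here only for illustration. It assumes the ksat values in df1 are unique and that none exceeds the largest ksat in df2 (such rows would be left as NaN by the backward fill).

```python
import pandas as pd

# Frames retyped from the question's tables.
df1 = pd.DataFrame({
    "ksat": [0.0, 0.1, 0.4, 0.8, 1.2],
    "muacres": [5326, 4346, 4146, 3476, 2006],
    "SAND": [0, 0, 0, 0, 0],
    "SILT": [0, 0, 0, 0, 0],
    "CLAY": [0, 0, 0, 0, 0],
})
df2 = pd.DataFrame({
    "PERCENTILE": [1, 2, 3, 4, 5],
    "ksat": [0.370684, 0.558053, 0.794836, 0.962329, 1.202368],
    "b": [11.55, 11.55, 10.39, 11.55, 10.39],
    "theta": [46.8, 46.8, 46.8, 46.8, 46.8],
})

# Outer merge on the shared "ksat" column; sort=True interleaves both sets of keys.
merged = df1[["ksat"]].merge(df2[["ksat", "PERCENTILE"]], how="outer", sort=True)

# Rows that came only from df1 have no PERCENTILE yet; take it from the
# next row below, i.e. the next larger ksat in df2.
filled = merged.bfill()
print(filled)
```

The backward fill works because, after sorting, the nearest non-missing PERCENTILE below a df1 row belongs to the smallest df2 ksat that is greater than or equal to it.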

and then merge df1 with this result:

In [14]: df1.merge(merged.bfill())
Out[14]:
   ksat  muacres  SAND  SILT  CLAY  PERCENTILE
0   0.0     5326     0     0     0           1
1   0.1     4346     0     0     0           1
2   0.4     4146     0     0     0           2
3   0.8     3476     0     0     0           4
4   1.2     2006     0     0     0           5
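
As an alternative that is not part of the original answer, newer pandas versions provide `pd.merge_asof`, which can do the same lookup in one call with `direction="forward"`. This is a sketch under a few assumptions: both frames are sorted by ksat, both ksat columns have the same float dtype, and the frames are retyped from the question. The rename to 'st' follows the column name the question asks for.

```python
import pandas as pd

# Frames retyped from the question's tables.
df1 = pd.DataFrame({
    "ksat": [0.0, 0.1, 0.4, 0.8, 1.2],
    "muacres": [5326, 4346, 4146, 3476, 2006],
    "SAND": [0, 0, 0, 0, 0],
    "SILT": [0, 0, 0, 0, 0],
    "CLAY": [0, 0, 0, 0, 0],
})
df2 = pd.DataFrame({
    "PERCENTILE": [1, 2, 3, 4, 5],
    "ksat": [0.370684, 0.558053, 0.794836, 0.962329, 1.202368],
})

# merge_asof requires both inputs to be sorted on the key.
left = df1.sort_values("ksat")
right = df2[["ksat", "PERCENTILE"]].sort_values("ksat")

# direction="forward" picks, for each left row, the first right row whose
# key is >= the left key (exact matches are allowed by default).
result = pd.merge_asof(left, right, on="ksat", direction="forward")
result = result.rename(columns={"PERCENTILE": "st"})
print(result)
```

Rows in df1 whose ksat is larger than every ksat in df2 would get NaN in 'st', the same as with the backward-fill approach.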
