pandas: Remove all rows within a time interval of another Series' time index (i.e. time range exclusion)
Problem description
Suppose I have two DataFrames:
#df1
time
2016-09-12 13:00:00.017 1.0
2016-09-12 13:00:03.233 1.0
2016-09-12 13:00:10.256 1.0
2016-09-12 13:00:19.605 1.0
#df2
time
2016-09-12 13:00:00.017 1.0
2016-09-12 13:00:00.233 0.0
2016-09-12 13:00:01.016 1.0
2016-09-12 13:00:01.505 0.0
2016-09-12 13:00:06.017 1.0
2016-09-12 13:00:07.233 0.0
2016-09-12 13:00:08.256 1.0
2016-09-12 13:00:19.705 0.0
I want to remove all rows in df2 whose timestamps fall within 1 second after any of the time indices in df1 (from t up to t + 1 second), yielding:
#result
time
2016-09-12 13:00:01.505 0.0
2016-09-12 13:00:06.017 1.0
2016-09-12 13:00:07.233 0.0
2016-09-12 13:00:08.256 1.0
What's the most efficient way to do this? I don't see anything useful for time range exclusions in the API.
Recommended answer
You can use pd.merge_asof, which was added in pandas 0.19.0. It accepts a tolerance argument that limits how far apart two keys may be and still count as a match.
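import pandas as pd

# Assuming time is set as the index of both DataFrames
df1.reset_index(inplace=True)
df2.reset_index(inplace=True)
# Keep the df2 rows that found no df1 match within the tolerance (the unmatched right-hand columns are NaN)
df2.loc[pd.merge_asof(df2, df1, on='time', tolerance=pd.Timedelta('1s')).isnull().any(axis=1)]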
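For reference, the approach above can be reproduced end to end with the sample data from the question. This is a minimal sketch, not the only way to write it: the column name "value" is an assumption (the question does not name the data column), and the frames are rebuilt by hand rather than loaded from anywhere.

```python
import pandas as pd

# Sample data copied from the question. The column name "value" is an
# assumption; the original frames do not show a column name.
df1 = pd.DataFrame(
    {"value": [1.0, 1.0, 1.0, 1.0]},
    index=pd.DatetimeIndex(
        [
            "2016-09-12 13:00:00.017",
            "2016-09-12 13:00:03.233",
            "2016-09-12 13:00:10.256",
            "2016-09-12 13:00:19.605",
        ],
        name="time",
    ),
)
df2 = pd.DataFrame(
    {"value": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]},
    index=pd.DatetimeIndex(
        [
            "2016-09-12 13:00:00.017",
            "2016-09-12 13:00:00.233",
            "2016-09-12 13:00:01.016",
            "2016-09-12 13:00:01.505",
            "2016-09-12 13:00:06.017",
            "2016-09-12 13:00:07.233",
            "2016-09-12 13:00:08.256",
            "2016-09-12 13:00:19.705",
        ],
        name="time",
    ),
)

# merge_asof needs the key as a regular, sorted column.
left = df2.reset_index()
right = df1.reset_index()

# Both frames have a "value" column, so the merge result carries
# "value_x" (from df2) and "value_y" (from df1, NaN where nothing matched).
merged = pd.merge_asof(left, right, on="time", tolerance=pd.Timedelta("1s"))

# Keep only the df2 rows with no df1 timestamp in the preceding second.
unmatched = merged["value_y"].isna().to_numpy()
result = left.loc[unmatched].set_index("time")
print(result)
```

Checking only the right-hand column (value_y) instead of calling isnull().any(axis=1) on the whole merge result avoids dropping rows by accident when df2 itself contains NaN values.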
Note that matching is carried out in the backward direction by default: each row of the left DataFrame (df2) is matched to the last row of the right DataFrame (df1) whose "on" key (here "time") is less than or equal to the left row's key. Hence the tolerance parameter extends only in this direction (backward), giving a one-sided (-) matching range.
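To see the backward matching directly, it can help to carry the df1 timestamp through the merge as an extra column. The sketch below does this with the timestamps from the question; the helper column matched_time is a hypothetical name introduced only for illustration.

```python
import pandas as pd

# Timestamps from the question: df2 on the left, df1 on the right.
left = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2016-09-12 13:00:00.017",
                "2016-09-12 13:00:00.233",
                "2016-09-12 13:00:01.016",
                "2016-09-12 13:00:01.505",
                "2016-09-12 13:00:06.017",
                "2016-09-12 13:00:07.233",
                "2016-09-12 13:00:08.256",
                "2016-09-12 13:00:19.705",
            ]
        )
    }
)
right = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2016-09-12 13:00:00.017",
                "2016-09-12 13:00:03.233",
                "2016-09-12 13:00:10.256",
                "2016-09-12 13:00:19.605",
            ]
        )
    }
)

# Hypothetical helper column: a copy of the right-hand key, so the merge
# result shows which df1 timestamp (if any) each df2 row was matched to.
right["matched_time"] = right["time"]

# Default direction is "backward": only df1 timestamps at or before the
# df2 timestamp, and at most 1 second earlier, can match.
merged = pd.merge_asof(left, right, on="time", tolerance=pd.Timedelta("1s"))
print(merged)
```

Rows where matched_time is NaT are the ones with no df1 timestamp in the preceding second, which are the rows the answer keeps.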
To look both forward and backward, pass the direction='nearest' argument in the function call (available from 0.20.0). The tolerance then extends both ways, giving a +/- matching range.
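The difference between the two directions only shows up when a left-hand row sits shortly before a right-hand timestamp. The following sketch compares them on a small made-up example: one timestamp from df1 and four left-hand rows, three of which are hypothetical and not in the question. The helper function unmatched is likewise just for illustration.

```python
import pandas as pd

# A single reference timestamp taken from df1 in the question.
right = pd.DataFrame({"time": pd.to_datetime(["2016-09-12 13:00:10.256"])})
right["matched_time"] = right["time"]  # hypothetical helper column

left = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2016-09-12 13:00:08.256",  # 2.000 s before (from the question)
                "2016-09-12 13:00:09.500",  # 0.756 s before (hypothetical row)
                "2016-09-12 13:00:10.900",  # 0.644 s after (hypothetical row)
                "2016-09-12 13:00:12.000",  # 1.744 s after (hypothetical row)
            ]
        )
    }
)


def unmatched(direction):
    """Return the left-hand timestamps with no match within 1 second."""
    merged = pd.merge_asof(
        left,
        right,
        on="time",
        tolerance=pd.Timedelta("1s"),
        direction=direction,
    )
    return left.loc[merged["matched_time"].isna().to_numpy(), "time"]


kept_backward = unmatched("backward")  # excludes only rows up to 1 s after
kept_nearest = unmatched("nearest")    # excludes rows within 1 s on either side
print(kept_backward.tolist())
print(kept_nearest.tolist())
```

With direction='backward' the row at 13:00:09.500 is kept, because the reference timestamp lies after it; with direction='nearest' it is excluded along with the row at 13:00:10.900.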