Removing the first rows before a specific value - pandas
Question
I am trying to remove all rows before an initial value for a group. For instance, if my max_value = 250, then all rows for a group before the first value above it should be removed. If a later value of 250 or less appears again for that group, it is not removed.
import pandas as pd
df = pd.DataFrame({
'date': ['2019-01-01','2019-02-01','2019-03-01', '2019-04-01',
'2019-01-01','2019-02-01','2019-03-01', '2019-04-01',
'2019-01-01','2019-02-01','2019-03-01', '2019-04-01'],
'Asset': ['Asset A', 'Asset A', 'Asset A', 'Asset A', 'Asset A', 'Asset A', 'Asset B', 'Asset B',
'Asset B', 'Asset B', 'Asset B', 'Asset B'],
'Monthly Value': [100, 200, 300, 400, 500, 600, 100, 200, 300, 200, 300, 200]
})
unique_list = list(df['Asset'].unique())
max_value = 250
print(df)
date Asset Monthly Value
0 2019-01-01 Asset A 100
1 2019-02-01 Asset A 200
2 2019-03-01 Asset A 300
3 2019-04-01 Asset A 400
4 2019-01-01 Asset A 500
5 2019-02-01 Asset A 600
6 2019-03-01 Asset B 100
7 2019-04-01 Asset B 200
8 2019-01-01 Asset B 300
9 2019-02-01 Asset B 200
10 2019-03-01 Asset B 300
11 2019-04-01 Asset B 200
If the threshold, max_value, is 250, then the dataframe should look like the one below. Notice that for each group, every row before the first value above 250 is removed. Once that first value has appeared, later rows are kept even if they are 250 or lower. Any help would be appreciated.
date Asset Monthly Value
2 2019-03-01 Asset A 300
3 2019-04-01 Asset A 400
4 2019-01-01 Asset A 500
5 2019-02-01 Asset A 600
8 2019-01-01 Asset B 300
9 2019-02-01 Asset B 200
10 2019-03-01 Asset B 300
11 2019-04-01 Asset B 200
Recommended answer
This should do the trick:
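# group_keys=False keeps the original row index on the mask so it aligns with df (needed on pandas 2.0 and later)
df[df.groupby('Asset', group_keys=False)['Monthly Value'].apply(lambda x: x.gt(max_value).cumsum().ne(0))]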
Yields:
date Asset Monthly Value
2 2019-03-01 Asset A 300
3 2019-04-01 Asset A 400
4 2019-01-01 Asset A 500
5 2019-02-01 Asset A 600
8 2019-01-01 Asset B 300
9 2019-02-01 Asset B 200
10 2019-03-01 Asset B 300
11 2019-04-01 Asset B 200
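To see why the mask works, it can help to spell out its three steps on the question's data. The following is a minimal sketch rather than part of the original answer. The intermediate names (`above`, `seen`, `mask`, `result`) are made up for illustration, and it groups a boolean Series directly instead of using `apply` with a lambda.

```python
import pandas as pd

df = pd.DataFrame({
    'Asset': ['Asset A'] * 6 + ['Asset B'] * 6,
    'Monthly Value': [100, 200, 300, 400, 500, 600, 100, 200, 300, 200, 300, 200],
})
max_value = 250

# Step 1: flag rows whose value exceeds the threshold.
above = df['Monthly Value'].gt(max_value)

# Step 2: within each asset, count how many flagged rows have been seen so far.
# The cast to int makes the running sum an integer count.
seen = above.astype(int).groupby(df['Asset']).cumsum()

# Step 3: keep every row from the first flagged row onward (count is non-zero).
mask = seen.ne(0)
result = df[mask]
print(result)
```

Because the running count never drops back to zero, a later value at or below the threshold (such as the 200 at index 9) stays in the result.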
Additionally, if you store your max values in a dictionary like max_value = {'Asset A': 250, 'Asset B': 250}, you can do the following to achieve the same result:
# x.name is the group key, so each asset is compared against its own threshold
df[df.groupby('Asset', group_keys=False)['Monthly Value'].apply(lambda x: x.gt(max_value[x.name]).cumsum().ne(0))]
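The dictionary form matters most when the thresholds differ between assets. Below is a rough sketch under that assumption: the threshold of 150 for Asset B is hypothetical and not taken from the question, and the lookup uses `Series.map` rather than the lambda above.

```python
import pandas as pd

df = pd.DataFrame({
    'Asset': ['Asset A'] * 6 + ['Asset B'] * 6,
    'Monthly Value': [100, 200, 300, 400, 500, 600, 100, 200, 300, 200, 300, 200],
})

# Hypothetical per-asset thresholds; the 150 for Asset B is illustrative only.
max_value = {'Asset A': 250, 'Asset B': 150}

# Look up each row's threshold from its asset name.
threshold = df['Asset'].map(max_value)

# Flag rows above their own threshold, then keep everything from the
# first flagged row of each asset onward.
above = df['Monthly Value'].gt(threshold)
mask = above.astype(int).groupby(df['Asset']).cumsum().gt(0)
result = df[mask]
print(result)
```

With these thresholds Asset B starts one row earlier, at index 7. An asset missing from the dictionary would map to NaN and never be flagged, so all of its rows would be dropped.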