Finding the minimum value of a column between two entries in another column


Problem description

I have two columns in a data frame containing more than 1000 rows. Column A can take the values X, Y, or None. Column B contains random numbers from 50 to 100.

Every time there is a non-None occurrence in column A, it is considered occurrence4. The previous non-None occurrence in column A is then occurrence3, the one before that is occurrence2, and the one before that is occurrence1. I want to find the minimum value of column B between occurrence4 and occurrence3 and check whether it is greater than the minimum value of column B between occurrence2 and occurrence1. The results can be stored in a new column in the data frame as "YES" or "NO".

Sample input

ROWNUM  A    B
1      None  68
2      None  83
3        X   51
4      None  66
5      None  90
6        Y   81
7      None  81
8      None  100
9      None  83
10     None  78
11       X   68
12     None  53
13     None  83
14       Y   68
15     None  94
16     None  50
17     None  71
18     None  71
19     None  52
20     None  67
21     None  82
22       X   76
23     None  66
24     None  92

For example, I need to find the minimum value of column B between ROWNUM 14 and ROWNUM 11 and check whether it is GREATER THAN the minimum value of column B between ROWNUM 6 and ROWNUM 3. Next, I need to find the minimum value between ROWNUM 22 and ROWNUM 14 and check whether it is GREATER THAN the minimum value between ROWNUM 11 and ROWNUM 6, and so on.

Edit:

In the sample data, the calculation starts at row 14, since that is the fourth non-None occurrence in column A. The minimum value between row 14 and row 11 is 53. The minimum value between row 6 and row 3 is 51. Since 53 > 51, the minimum value of column B between occurrence 4 and occurrence 3 is GREATER THAN the minimum value of column B between occurrence 2 and occurrence 1. So the output at row 14 would be "YES" or 1.

Next, at row 22, the minimum value between row 22 and row 14 is 50. The minimum value between row 11 and row 6 is 68. Since 50 < 68, the minimum between occurrence 4 and occurrence 3 is NOT GREATER THAN the minimum between occurrence 2 and occurrence 1. So the output at row 22 would be "NO" or 0.

I have the following code.

import numpy as np
import pandas as pd


df = pd.DataFrame([[0, 0]]*100, columns=list('AB'), index=range(1, 101))
df.loc[[3, 6, 11, 14, 22, 26, 38, 51, 64, 69, 78, 90, 98], 'A'] = 1
df['B'] = np.random.randint(50, 100, size=len(df))
df['result'] = df.index[df['A'] != 0].to_series().rolling(4).apply(
        lambda x: df.loc[x[2]:x[3], 'B'].min() > df.loc[x[0]:x[1], 'B'].min(), raw=True)
print(df)

This code works when column A contains the values [0, 1]. But I need code that works when column A contains [None, X, Y]. Also, this code produces output as [0, 1]; I need the output as [YES, NO] instead.

Recommended answer

I read your sample data as follows:

df = pd.read_fwf('input.txt', widths=[7, 6, 3], na_values=['None'])

Note na_values=['None'], which ensures that None in the input (a string) is read as NaN.

With this, the DataFrame is:

    ROWNUM    A    B
0        1  NaN   68
1        2  NaN   83
2        3    X   51
3        4  NaN   66
4        5  NaN   90
5        6    Y   81
6        7  NaN   81
7        8  NaN  100
8        9  NaN   83
9       10  NaN   78
10      11    X   68
11      12  NaN   53
12      13  NaN   83
13      14    Y   68
14      15  NaN   94
15      16  NaN   50
16      17  NaN   71
17      18  NaN   71
18      19  NaN   52
19      20  NaN   67
20      21  NaN   82
21      22    X   76
22      23  NaN   66
23      24  NaN   92
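To reproduce this frame without an input.txt file, the sample can be parsed from an in-memory string. This is only a sketch: the SAMPLE name is an assumption, and a whitespace-delimited pd.read_csv is swapped in for pd.read_fwf so the example does not depend on exact column widths. The na_values=['None'] argument plays the same role as above.

```python
import io

import pandas as pd

# Sample input copied from the question; columns are separated by whitespace.
SAMPLE = """\
ROWNUM A B
1 None 68
2 None 83
3 X 51
4 None 66
5 None 90
6 Y 81
7 None 81
8 None 100
9 None 83
10 None 78
11 X 68
12 None 53
13 None 83
14 Y 68
15 None 94
16 None 50
17 None 71
18 None 71
19 None 52
20 None 67
21 None 82
22 X 76
23 None 66
24 None 92
"""

# na_values=['None'] makes the literal string "None" load as NaN.
df = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+", na_values=["None"])

# Only the rows where A is X or Y remain non-null.
print(df[df["A"].notna()])
```

Only the five rows carrying X or Y are printed, which are the "occurrences" the rest of the answer works with.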

The code to perform the task is:

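# Index labels of the non-null entries in A; each rolling window holds 4 consecutive ones
res = df.index[df.A.notnull()].to_series().rolling(4).apply(
    lambda x: df.loc[x[2]:x[3], 'B'].min() > df.loc[x[0]:x[1], 'B'].min(), raw=True)\
    .dropna().map(lambda x: 'YES' if x > 0 else 'NO').rename('Result')
df = df.join(res)
# Assign the filled column back instead of calling fillna(inplace=True) on df.Result
df['Result'] = df['Result'].fillna('')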

As you can see, this is in part a slightly modified version of your code, with some additions.
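For readers who find the rolling window hard to follow, the same rule can be written as an explicit loop over the positions of the non-null entries in A. This is a sketch of an equivalent approach, not the code used in this answer; the helper name label_occurrences and the way the sample frame is built are assumptions made for illustration. Both ends of each range are included, matching .loc slicing.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (B values by ROWNUM 1..24).
b = [68, 83, 51, 66, 90, 81, 81, 100, 83, 78, 68, 53,
     83, 68, 94, 50, 71, 71, 52, 67, 82, 76, 66, 92]
a = [np.nan] * len(b)
for rownum, label in {3: "X", 6: "Y", 11: "X", 14: "Y", 22: "X"}.items():
    a[rownum - 1] = label
df = pd.DataFrame({"ROWNUM": range(1, len(b) + 1), "A": a, "B": b})


def label_occurrences(frame):
    """Return a Series of 'YES', 'NO' or '' following the rule in the question."""
    marks = np.flatnonzero(frame["A"].notna().to_numpy())  # positions of non-null A
    values = frame["B"].to_numpy()
    out = np.full(len(frame), "", dtype=object)
    # The first three occurrences have too few predecessors and stay blank.
    for k in range(3, len(marks)):
        o1, o2, o3, o4 = marks[k - 3:k + 1]
        recent = values[o3:o4 + 1].min()   # occurrence3..occurrence4, inclusive
        earlier = values[o1:o2 + 1].min()  # occurrence1..occurrence2, inclusive
        out[o4] = "YES" if recent > earlier else "NO"
    return pd.Series(out, index=frame.index, name="Result")


df["Result"] = label_occurrences(df)
print(df.loc[df["Result"] != "", ["ROWNUM", "A", "B", "Result"]])
```

The loop trades the compactness of rolling(4).apply for named intermediate values, which makes it easier to inspect a single window when a result looks wrong.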

The result is:

    ROWNUM    A    B Result
0        1  NaN   68       
1        2  NaN   83       
2        3    X   51       
3        4  NaN   66       
4        5  NaN   90       
5        6    Y   81       
6        7  NaN   81       
7        8  NaN  100       
8        9  NaN   83       
9       10  NaN   78       
10      11    X   68       
11      12  NaN   53       
12      13  NaN   83       
13      14    Y   68    YES
14      15  NaN   94       
15      16  NaN   50       
16      17  NaN   71       
17      18  NaN   71       
18      19  NaN   52       
19      20  NaN   67       
20      21  NaN   82       
21      22    X   76     NO
22      23  NaN   66       
23      24  NaN   92       
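The two labels in the table can be checked by hand from the sample values. A minimal sketch in plain Python, assuming both ends of each range are included as in the question's worked example (the min_between helper is hypothetical):

```python
# Column B of the sample input, keyed by ROWNUM (1-based).
b = [68, 83, 51, 66, 90, 81, 81, 100, 83, 78, 68, 53,
     83, 68, 94, 50, 71, 71, 52, 67, 82, 76, 66, 92]
B = dict(enumerate(b, start=1))


def min_between(lo, hi):
    """Minimum of B over ROWNUM lo..hi, both ends included."""
    return min(B[r] for r in range(lo, hi + 1))


# For each labelled row: (min between occurrences 3 and 4, min between occurrences 1 and 2).
checks = {
    14: (min_between(11, 14), min_between(3, 6)),
    22: (min_between(14, 22), min_between(6, 11)),
}
for rownum, (recent, earlier) in checks.items():
    print(rownum, recent, earlier, "YES" if recent > earlier else "NO")
# prints:
# 14 53 51 YES
# 22 50 68 NO
```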

The advantages of my solution over the other one are:

  • the content is either YES or NO, just as you want,
  • this content shows up only for non-null values in column A, "ignoring" the first 3, which don't have enough "predecessors".
