Create a new column that compares across rows in a pandas DataFrame
Question
I want to create a new column in a DataFrame based on the values in the next 2 rows. Specifically, if any value in the next 2 rows is below 4, the new value in the current row should be 0; if all values in the next 2 rows are above 4, the new value in the current row should be 1.
>>> import pandas
>>> df = pandas.DataFrame({"A": [5,6,7,8,2]})
>>> df
   A
0  5
1  6
2  7
3  8
4  2
>>> desired_result = pandas.DataFrame({"A": [5,6,7,8,2], "new": [1,1,0,0,0]})
>>> desired_result
   A  new
0  5    1
1  6    1
2  7    0
3  8    0
4  2    0
In desired_result, the first value is 1 because 6 and 7 are both > 4, and the same logic applies to the second row. In the third row the new value becomes 0 because, looking ahead to the next two rows (8, 2), we see that 2 is < 4.
I have been trying to use the apply function, but I cannot figure out how to pass the values of the next 2 rows as inputs.
I have found lots of help on this site about comparing across columns, but cannot figure out how to "look ahead" as I described.
Thanks for your help!
Recommended answer
You can set the new column to 1 and then use loc together with shift and lt (less than) to set the appropriate values to 0.
import pandas as pd

df = pd.DataFrame({"A": [5, 6, 7, 8, 2]})
df['new'] = 1
# Flag rows where either of the next two values is below 4.
df.loc[(df.A.shift(-1).lt(4)) | (df.A.shift(-2).lt(4)), 'new'] = 0
# The last row has no future observations, so it is set to zero explicitly.
df.loc[df.index[-1], 'new'] = 0
>>> df
   A  new
0  5    1
1  6    1
2  7    0
3  8    0
4  2    0
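To see why the mask picks out the right rows, it can help to lay the shifted values next to the original column. The following is only an illustrative sketch: the helper column names next_1, next_2 and any_below_4 are made up for display and are not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({"A": [5, 6, 7, 8, 2]})

# Hypothetical helper columns, named here only for illustration.
look = pd.DataFrame({
    "A": df["A"],
    "next_1": df["A"].shift(-1),  # value one row ahead (NaN on the last row)
    "next_2": df["A"].shift(-2),  # value two rows ahead (NaN on the last two rows)
})

# Comparing NaN with lt(4) yields False, so a row with no look-ahead data
# is not flagged by the mask on its own.
look["any_below_4"] = look["next_1"].lt(4) | look["next_2"].lt(4)
print(look)
```

Because the comparison against NaN is False, the last row is never caught by the mask, which is why the answer sets it to zero in a separate step.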
To extend this to the next 8 rows instead of 2:
nrows = 8
df.loc[eval(" | ".join("df.A.shift(-{0}).lt(4)".format(n)
                       for n in range(1, nrows + 1))), 'new'] = 0
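If building a string for eval feels fragile, one possible alternative is a forward-looking rolling minimum. This is a sketch of a different technique, not the answer's method: the function name look_ahead_flag and its parameters are hypothetical, and it assumes that a row with no future rows should get 0, as in the example above.

```python
import pandas as pd


def look_ahead_flag(values: pd.Series, nrows: int, threshold: float) -> pd.Series:
    """Hypothetical helper: 1 if none of the next `nrows` values is below
    `threshold`, otherwise 0. Rows with no future values get 0."""
    # Reverse the series, take a trailing rolling minimum, and reverse back:
    # this gives the minimum over rows i .. i + nrows - 1.
    window_min = values.iloc[::-1].rolling(nrows, min_periods=1).min().iloc[::-1]
    # Shift up by one so the window covers rows i + 1 .. i + nrows,
    # which excludes the current row. The last row becomes NaN.
    future_min = window_min.shift(-1)
    # NaN compares as False, so the last row ends up as 0.
    return future_min.ge(threshold).astype(int)


df = pd.DataFrame({"A": [5, 6, 7, 8, 2]})
df["new"] = look_ahead_flag(df["A"], nrows=2, threshold=4)
print(df)
```

With nrows=2 this gives the same new column as the loc/shift version for the example data; changing nrows to 8 extends the look-ahead without generating code as text.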