Pandas Find Sequence or Pattern in Column
Question
Here's some example data for the problem I'm working on:
index Quarter Sales_Growth
0 2001q1 0
1 2002q2 0
2 2002q3 1
3 2002q4 0
4 2003q1 0
5 2004q2 0
6 2004q3 1
7 2004q4 1
The Sales_Growth column tells me whether there was sales growth in the quarter: 0 = no growth, 1 = growth.
First, I'm trying to return the first Quarter when there were two consecutive quarters of no sales growth.
With the data above, the answer would be 2001q1.
Then, I want to return the second of two consecutive quarters of sales growth that occur AFTER the initial two quarters of no growth.
The answer to this would be 2004q4.
I've searched and searched, but I can't get the closest answer I found to work: https://stackoverflow.com/a/26539166/3225420
Thanks in advance for helping a Pandas newbie. I'm hacking away as best I can, but I'm stuck on this one.
Recommended answer
You're doing subsequence matching. This is a bit strange, but bear with me:
growth = df.Sales_Growth.astype(str).str.cat()
That gives you:
'00100011'
Then:
growth.index('0011')
This gives you 4, the index of the first row matched by the pattern. Add a constant 3 (the pattern length minus one) to get 7, the index of the last row matched by the pattern, which is 2004q4.
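For reference, here is a minimal end-to-end sketch of the steps above, run on the sample data from the question. The DataFrame construction and the variable names (`first_no_growth`, `start`, `last`) are illustrative assumptions, not part of the original answer. The sketch also assumes the default integer index, so string positions line up with row positions.

```python
import pandas as pd

# Sample data from the question (reconstructed here for illustration).
df = pd.DataFrame({
    "Quarter": ["2001q1", "2002q2", "2002q3", "2002q4",
                "2003q1", "2004q2", "2004q3", "2004q4"],
    "Sales_Growth": [0, 0, 1, 0, 0, 0, 1, 1],
})

# Concatenate the 0/1 flags into one string: '00100011'.
growth = df.Sales_Growth.astype(str).str.cat()

# First question: start of the first run of two no-growth quarters.
first_no_growth = growth.index("00")
first_quarter = df["Quarter"].iloc[first_no_growth]

# Second question: two no-growth quarters followed by two growth quarters.
start = growth.index("0011")
last = start + len("0011") - 1  # position of the last row in the match
second_growth_quarter = df["Quarter"].iloc[last]

print(first_quarter)          # 2001q1
print(second_growth_quarter)  # 2004q4
```

Note that `str.index` raises `ValueError` when the pattern is absent. `str.find` returns -1 instead, if that is easier to handle.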
I feel this approach starts off a bit ugly, but the end result is really usable: you can search for any fixed pattern with no additional coding.
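To show what "any fixed pattern" could look like in practice, the idea can be wrapped in a small helper. This is only a sketch: `find_pattern` is a hypothetical name, not a pandas function. It assumes every value in the column renders as exactly one character (as 0/1 flags do). With multi-character values, string positions would no longer line up with row positions, so the helper checks for that.

```python
import pandas as pd


def find_pattern(flags: pd.Series, pattern: str) -> int:
    """Return the row position where `pattern` first starts, or -1 if absent.

    Hypothetical helper; assumes each value renders as a single character.
    """
    encoded = "".join(flags.astype(str))
    if len(encoded) != len(flags):
        raise ValueError("each value must render as exactly one character")
    return encoded.find(pattern)


flags = pd.Series([0, 0, 1, 0, 0, 0, 1, 1])
print(find_pattern(flags, "0011"))  # 4
print(find_pattern(flags, "111"))   # -1 (pattern not present)
```

The returned value is a position rather than an index label, so `df.iloc[...]` is the safer lookup when the DataFrame does not use the default integer index.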