Pandas Find Sequence or Pattern in Column


Problem description

Here's some example data for the problem I'm working on:

index     Quarter    Sales_Growth
0          2001q1    0
1          2002q2    0
2          2002q3    1
3          2002q4    0
4          2003q1    0
5          2004q2    0
6          2004q3    1
7          2004q4    1

The Sales_Growth column tells me if there was indeed sales growth in the quarter or not. 0 = no growth, 1 = growth.

First, I'm trying to return the first Quarter when there were two consecutive quarters of no sales growth.

With the data above, the answer would be 2001q1.

Then, I want to return the 2nd quarter of consecutive sales growth that occurs AFTER the initial two quarters of no growth.

The answer to this would be 2004q4.

I've searched and searched but the closest answer I can find I can't get to work: https://stackoverflow.com/a/26539166/3225420

Thanks in advance for helping a Pandas newbie, I'm hacking away as best I can but stuck on this one.

Recommended answer

You're doing subsequence matching. This is a bit strange, but bear with me:

growth = df.Sales_Growth.astype(str).str.cat()

That gives you:

'00100011'

Then:

growth.index('0011')

Gives you 4, the position where the pattern starts (obviously you'd add a constant 3 to get the index of the last row matched by the pattern, which here is 7, i.e. 2004q4).
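
Putting the steps together, here is a minimal sketch that answers both parts of the question. It assumes the sample data from the question and that every value in Sales_Growth is a single character (0 or 1), so string positions line up with row positions. The variable names start, end, first_no_growth and second_growth are made up for illustration. Note that str.index raises ValueError if the pattern is not found.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Quarter": ["2001q1", "2002q2", "2002q3", "2002q4",
                "2003q1", "2004q2", "2004q3", "2004q4"],
    "Sales_Growth": [0, 0, 1, 0, 0, 0, 1, 1],
})

# Join the 0/1 flags into one string; relies on each value being one character.
growth = df["Sales_Growth"].astype(str).str.cat()

# Part 1: first quarter of the first two-quarter run without growth.
start = growth.index("00")
first_no_growth = df["Quarter"].iloc[start]

# Part 2: search for '0011' from that point on, then add 3 to reach
# the last row covered by the four-character pattern.
end = growth.index("0011", start) + 3
second_growth = df["Quarter"].iloc[end]

print(first_no_growth, second_growth)
```

Using .iloc keeps the lookup positional, so it should still work if the DataFrame index is not 0..n-1.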

I feel this approach starts off a bit ugly, but the end result is really usable--you can search for any fixed pattern with no additional coding.
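
To illustrate the "any fixed pattern" point, here is a rough sketch of a reusable helper. The name find_pattern_rows is hypothetical and not part of pandas. It uses a regex lookahead, which is a swapped-in technique rather than the str.index call above, so that overlapping matches are reported too. It again assumes each value stringifies to exactly one character.

```python
import re
import pandas as pd

def find_pattern_rows(flags: pd.Series, pattern: str) -> list:
    """Return the positions at which `pattern` starts, overlaps included."""
    # Concatenate the values into one string, as in the answer above.
    text = flags.astype(str).str.cat()
    # A zero-width lookahead matches at every start position without
    # consuming characters, so overlapping occurrences are all found.
    return [m.start() for m in re.finditer(f"(?={re.escape(pattern)})", text)]

flags = pd.Series([0, 0, 1, 0, 0, 0, 1, 1])
no_growth_runs = find_pattern_rows(flags, "00")
recovery = find_pattern_rows(flags, "0011")
print(no_growth_runs, recovery)
```

The helper returns an empty list when the pattern never occurs, instead of raising as str.index does.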
