Groupby based on value in previous row


Problem description


    I have a column with a list of values like so:

    100
    200
    300
    500
    600
    650
    1000
    

    I want to do a Groupby (or a similar efficient construct) to get batches of rows where the value of each row is within 100 of the last row.

    In that case the batches produced from the example above would be

    100, 200, 300

    500, 600, 650

    1000

    Is this possible to do in Pandas? Since Pandas attempts to allow for SQL-like queries, I am guessing that it should be.

    Solution

    You can use an approach similar to that described in the answer to this question. It's basically a three-step process:

    1. Use shift to compute the inter-row criterion that you want to distinguish.
    2. Use cumsum to sum this criterion to create a new Series with separate "blocks" of a single value for each group.
    3. Group on this new Series.

    Here is an example:

    >>> import pandas
    >>> x = pandas.Series([100, 200, 300, 500, 600, 650, 1000, 900, 750])
    >>> x.groupby(((x - x.shift()).abs() > 100).cumsum()).apply(list)
    0    [100, 200, 300]
    1    [500, 600, 650]
    2        [1000, 900]
    3              [750]
    dtype: object
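
To make the three steps easier to follow, the one-liner above can be split into intermediate Series. This is only a sketch of the same computation; the names `is_break`, `group_id` and `batches` are illustrative and not part of the original answer.

```python
import pandas as pd

x = pd.Series([100, 200, 300, 500, 600, 650, 1000, 900, 750])

# Step 1: flag rows that differ from the previous row by more than 100.
# The first row has no predecessor, so its difference is NaN and the
# comparison evaluates to False.
is_break = (x - x.shift()).abs() > 100

# Step 2: a running count of the flags gives every row a block label.
group_id = is_break.cumsum()

# Step 3: group on the labels and collect each block into a list.
batches = x.groupby(group_id).apply(list)

print(group_id.tolist())
print(batches)
```

Printing `group_id` shows the "blocks" directly: consecutive rows share a label until a jump larger than 100 increments it.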
    

    Note that I used the criterion > 100, which is the opposite of the <= 100 criterion you mentioned. With this approach, you need to use the criterion for separating groups, not the criterion for joining them, so you have to use the negation of your grouping criterion.
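
If it is easier to think in terms of the joining criterion, one option is to write `<= 100` and negate it explicitly. The sketch below assumes the values live in a DataFrame column; the column names `value` and `batch` are hypothetical, and `diff()` is used here as a shorthand for `x - x.shift()`.

```python
import pandas as pd

# Hypothetical DataFrame holding the column from the question.
df = pd.DataFrame({"value": [100, 200, 300, 500, 600, 650, 1000]})

# Joining criterion from the question: within 100 of the previous row.
within = df["value"].diff().abs() <= 100

# Negate it to obtain the separating criterion, then label the blocks.
# The first row's NaN difference compares as False, so its negation is
# True and the labels start at 1 instead of 0; the batches are unchanged.
df["batch"] = (~within).cumsum()

batches = df.groupby("batch")["value"].apply(list)
print(batches)
```

Storing the labels in a column keeps them available for further aggregation, for example `df.groupby("batch")["value"].mean()`.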
