Pandas: split a dataframe into multiple dataframes when a condition is true


Question

I have a dataframe like df below. I want to create a new dataframe for every chunk of consecutive rows where the condition is true, so that the result is df_1, df_2, ..., df_n.

df:

| Value | Condition |
|-------|-----------|
| 2     | True      |
| 5     | True      |
| 4     | True      |
| 4     | False     |
| 2     | False     |
| 0     | True      |
| 5     | True      |
| 7     | False     |
| 8     | False     |
| 9     | False     |

df_1:

| Value |
|-------|
| 2     |
| 5     |
| 4     |

df_2:

| Value |
|-------|
| 0     |
| 5     |

My only idea is to loop through the dataframe, record the start and end index of every chunk of True values, and then loop over those index pairs to create the new dataframes, doing something like this for each start/end pair:

newdf = df.iloc[start:end]

But doing that seems inefficient.
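For reference, the loop-based idea described above might look like the following minimal sketch. The helper name true_runs and the reconstruction of df from the table are illustrative assumptions, not code from the original question.

```python
import pandas as pd

# Example data reconstructed from the table in the question (an assumption).
df = pd.DataFrame({
    "Value": [2, 5, 4, 4, 2, 0, 5, 7, 8, 9],
    "Condition": [True, True, True, False, False,
                  True, True, False, False, False],
})


def true_runs(mask):
    """Yield (start, end) positions of each run of True values; end is exclusive."""
    start = None
    for pos, flag in enumerate(mask):
        if flag and start is None:
            start = pos          # a new run of True values begins here
        elif not flag and start is not None:
            yield start, pos     # the run ended just before this position
            start = None
    if start is not None:
        yield start, len(mask)   # the last run reaches the end of the data


# true_runs is a hypothetical helper name used only for this sketch.
runs = list(true_runs(df["Condition"].tolist()))
chunks = [df.iloc[start:end] for start, end in runs]
```

Because end is exclusive here, each pair can be passed straight to df.iloc[start:end].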

Recommended answer

This is an alternative solution. Note that the consecutive_groups recipe comes from the more_itertools library.

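from itertools import groupby
from operator import itemgetter

def consecutive_groups(iterable, ordering=lambda x: x):
    # Consecutive integers share a constant (position - value) difference,
    # so grouping on that difference yields one group per consecutive run.
    for k, g in groupby(enumerate(iterable), key=lambda x: x[0] - ordering(x[1])):
        yield map(itemgetter(1), g)

# Index labels of the rows where Condition is True, split into consecutive runs.
grps = consecutive_groups(df[df.Condition].index)

# The groups hold index labels, so select rows with .loc rather than .iloc.
dfs = {i: df.loc[list(j)] for i, j in enumerate(grps, 1)}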

# {1:    Value Condition
# 0      2      True
# 1      5      True
# 2      4      True,
# 2:    Value Condition
# 5      0      True
# 6      5      True}
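
As a point of comparison, the same split can be sketched with pandas alone, using a cumulative sum over the negated condition as a block label. This is a different technique from the recipe in the answer, not the answer's own method. The reconstruction of df and the name block_id are assumptions made for illustration.

```python
import pandas as pd

# Example data reconstructed from the table in the question (an assumption).
df = pd.DataFrame({
    "Value": [2, 5, 4, 4, 2, 0, 5, 7, 8, 9],
    "Condition": [True, True, True, False, False,
                  True, True, False, False, False],
})

# Every False row increments the counter, so all rows in one uninterrupted
# run of True values share the same counter value.
block_id = (~df["Condition"]).cumsum()

# Keep only the True rows, then group them by their block label.
true_rows = df[df["Condition"]]
groups = true_rows.groupby(block_id[df["Condition"]])

# Number the resulting dataframes from 1, as in the answer above.
dfs = {i: chunk for i, (_, chunk) in enumerate(groups, 1)}
```

This sketch assumes the Condition column has a boolean dtype, because the ~ operator is applied to it directly.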
