All possible permutations columns Pandas Dataframe within the same column


Question

I had a similar question using Postgres SQL, but I figured that this kind of task is really hard to do in Postgres, and I think python/pandas would make this a lot easier, although I still can't quite come up with the solution.

I now have a Pandas Dataframe which looks like this:

import pandas as pd

df = {'planid': ['A', 'A', 'B', 'B', 'C', 'C'],
      'x': ['a1', 'a2', 'b1', 'b2', 'c1', 'c2']}

df = pd.DataFrame(df)

df


   planid   x
0   A       a1
1   A       a2
2   B       b1
3   B       b2
4   C       c1
5   C       c2

I want to get all possible permutations where planid are not equal to each other. In other words, think of each value in planid as a "bucket" and I want all possible combinations if I were to draw values from x from each "bucket" in planid. In this particular example, there are 8 total permutations {(a1, b1, c1), (a1, b2, c1), (a1, b1, c2), (a1, b2, c2), (a2, b1, c1), (a2, b2, c1), (a2, b1, c2), (a2, b2, c2)}.
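
To make the "bucket" picture concrete, here is a minimal sketch using `itertools.product` from the standard library. It only enumerates the tuples and does not yet build the three-column frame asked for; the names `buckets` and `combos` are illustrative and not part of the original post.

```python
import itertools
import pandas as pd

df = pd.DataFrame({'planid': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'x': ['a1', 'a2', 'b1', 'b2', 'c1', 'c2']})

# One list of x values per planid "bucket" (groupby sorts the keys: A, B, C)
buckets = [group['x'].tolist() for _, group in df.groupby('planid')]

# Draw one value from each bucket in every possible way
combos = list(itertools.product(*buckets))

print(len(combos))  # 2 * 2 * 2 = 8
for combo in combos:
    print(combo)
```

Strictly speaking this is a Cartesian product of the buckets rather than a set of permutations, which is why the count is the product of the bucket sizes.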

However, I want my resulting data frame to be three columns, planid, x and another column, perhaps named permutation_counter. The final data frame has all the different permutations labeled with permutation_counter. In other words, I want my final dataframe to look like

       planid   x  permutation_counter
    0   A       a1     1
    1   B       b1     1
    2   C       c1     1 
    3   A       a1     2
    4   B       b2     2
    5   C       c1     2
    6   A       a1     3
    7   B       b1     3
    8   C       c2     3
    9   A       a1     4
    10  B       b2     4
    11  C       c2     4
    12  A       a2     5
    13  B       b1     5
    14  C       c1     5
    15  A       a2     6
    16  B       b2     6
    17  C       c1     6
    18  A       a2     7
    19  B       b1     7
    20  C       c2     7
    21  A       a2     8
    22  B       b2     8
    23  C       c2     8

Any help would be much appreciated!

Recommended answer

I was trying to chain as many steps together as possible. Break them down to see what each step does :)

df2 = pd.DataFrame(index=pd.MultiIndex.from_product([subdf['x'] for p, subdf in df.groupby('planid')], names=df.planid.unique())).reset_index().stack().reset_index()

df2.columns = ['permutation_counter', 'planid', 'x']
df2['permutation_counter'] += 1

print(df2[['planid', 'x', 'permutation_counter']])

   planid   x  permutation_counter
0       A  a1                    1
1       B  b1                    1
2       C  c1                    1
3       A  a1                    2
4       B  b1                    2
5       C  c2                    2
6       A  a1                    3
7       B  b2                    3
8       C  c1                    3
9       A  a1                    4
10      B  b2                    4
11      C  c2                    4
12      A  a2                    5
13      B  b1                    5
14      C  c1                    5
15      A  a2                    6
16      B  b1                    6
17      C  c2                    6
18      A  a2                    7
19      B  b2                    7
20      C  c1                    7
21      A  a2                    8
22      B  b2                    8
23      C  c2                    8
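
For readers who want to follow the suggestion to break the chain apart, here is one possible step-by-step version. It is a sketch, not the answerer's exact code: the intermediate names (`groups`, `idx`, `wide`, `stacked`) are made up for illustration, `MultiIndex.to_frame(index=False)` is swapped in for `pd.DataFrame(index=...).reset_index()`, and the level names are sorted on the assumption that they should line up with the sorted key order `groupby` uses even when `planid` is not already sorted.

```python
import pandas as pd

df = pd.DataFrame({'planid': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'x': ['a1', 'a2', 'b1', 'b2', 'c1', 'c2']})

# Step 1: one Series of x values per planid; groupby sorts the keys
groups = [subdf['x'] for _, subdf in df.groupby('planid')]

# Step 2: Cartesian product of the groups as a MultiIndex, one level per planid
# (assumption: sorted names keep the labels aligned with groupby's key order)
idx = pd.MultiIndex.from_product(groups, names=sorted(df['planid'].unique()))

# Step 3: turn the index levels into columns A, B, C, one row per permutation
wide = idx.to_frame(index=False)

# Step 4: stack the columns into rows, giving a Series indexed by
# (row number, planid)
stacked = wide.stack()

# Step 5: flatten back into a frame and label the columns
df2 = stacked.reset_index()
df2.columns = ['permutation_counter', 'planid', 'x']
df2['permutation_counter'] += 1  # start counting at 1 instead of 0

print(df2[['planid', 'x', 'permutation_counter']])
```

Printing `wide` after step 3 shows the eight permutations as one row each, which is often the easiest place to check the result before it is reshaped into the long format.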
