过滤和选择使用python pandas制作的数据透视表 [英] Filtering and selecting from pivot tables made with python pandas

查看：2167 发布时间：2018/8/2 14:10:02 python indexing pivot pivot-table pandas

本文介绍了过滤和选择使用python pandas制作的数据透视表的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在努力使用Python pandas 包中的层次索引。具体来说，我不明白如何在数据透视后过滤和比较数据。

I'm struggling with hierarchical indexes in the Python pandas package. Specifically I don't understand how to filter and compare data in rows after it has been pivoted.

以下是文档中的示例表：

Here is the example table from the documentation:

import pandas as pd
import numpy as np

In [1027]: df = pd.DataFrame({'A' : ['one', 'one', 'two', 'three'] * 6,
                              'B' : ['A', 'B', 'C'] * 8,
                              'C' : ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
                              'D' : np.random.randn(24),
                              'E' : np.random.randn(24)})

In [1029]: pd.pivot_table(df, values='D', rows=['A', 'B'], cols=['C'])
Out[1029]: 
    C             bar       foo
    A     B                    
    one   A -1.154627 -0.243234
          B -1.320253 -0.633158
          C  1.188862  0.377300
    three A -1.327977       NaN
          B       NaN -0.079051
          C -0.832506       NaN
    two   A       NaN -0.128534
          B  0.835120       NaN
          C       NaN  0.838040

我想要分析如下：

1）在列属性上过滤此表，例如选择负 foo 的行：

1) Filter this table on column attributes, for example selecting rows with negative foo:

    C             bar       foo
    A     B                    
    one   A -1.154627 -0.243234
          B -1.320253 -0.633158
    three B       NaN -0.079051
    two   A       NaN -0.128534

2）比较不同的 A 系列组之间剩余的 B 系列值？我不知道如何访问这些信息： {'one'：['A'，'B']，'two'：['A']，'three'：['B'] } 并确定哪个系列 B 值对每个键是唯一的，或者在多个键组中看到等等。

2) Compare the remaining B series values between the distinct A series groups? I am not sure how to access this information: {'one':['A','B'], 'two':['A'], 'three':['B']} and determine which series B values are unique to each key, or seen in multiple key groups, etc

有没有办法直接在数据透视表结构中执行此操作，或者我是否需要将其转换回 pandas dataframe ？

Is there a way to do this directly within the pivot table structure, or do I need to convert this back in to a pandas dataframe?

更新：我认为此代码是朝着正确方向迈出的一步。它至少让我可以访问此表中的各个值，但我仍然在对系列值进行硬编码：

Update: I think this code is a step in the right direction. It at least lets me access individual values within this table, but I am still hard-coding the series vales:

table = pivot_table(df, values='D', rows=['A', 'B'], cols=['C'])
table.ix['one', 'A']

过滤和选择使用python pandas制作的数据透视表 [英] Filtering and selecting from pivot tables made with python pandas

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

过滤和选择使用python pandas制作的数据透视表 [英] Filtering and selecting from pivot tables made with python pandas

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭