pandas - pivot_table with non-numeric values? (DataError: No numeric types to aggregate)

Views: 1608 | Posted: 2017/3/26 1:51:50 | Tags: python pandas pivot-table dataframe


Problem description

I'm trying to do a pivot of a table containing strings as results.

import pandas as pd

df1 = pd.DataFrame({'index' : range(8),
'variable1' : ["A","A","B","B","A","B","B","A"],
'variable2' : ["a","b","a","b","a","b","a","b"],
'variable3' : ["x","x","x","y","y","y","x","y"],
'result': ["on","off","off","on","on","off","off","on"]})

df1.pivot_table(values='result',rows='index',cols=['variable1','variable2','variable3'])

But I get: DataError: No numeric types to aggregate.

This works as intended when I change result values to numbers:

df2 = pd.DataFrame({'index' : range(8),
'variable1' : ["A","A","B","B","A","B","B","A"],
'variable2' : ["a","b","a","b","a","b","a","b"],
'variable3' : ["x","x","x","y","y","y","x","y"],
'result': [1,0,0,1,1,0,0,1]})

df2.pivot_table(values='result',rows='index',cols=['variable1','variable2','variable3'])

And I get what I need:

variable1   A               B    
variable2   a       b       a   b
variable3   x   y   x   y   x   y
index                            
0           1 NaN NaN NaN NaN NaN
1         NaN NaN   0 NaN NaN NaN
2         NaN NaN NaN NaN   0 NaN
3         NaN NaN NaN NaN NaN   1
4         NaN   1 NaN NaN NaN NaN
5         NaN NaN NaN NaN NaN   0
6         NaN NaN NaN NaN   0 NaN
7         NaN NaN NaN   1 NaN NaN

I know I can map the strings to numerical values and then reverse the operation, but maybe there is a more elegant solution?

Solution

My original reply was based on Pandas 0.14.1, and since then many things have changed in the pivot_table function (rows --> index, cols --> columns, ...).

Additionally, it appears that the original lambda trick I posted no longer works on Pandas 0.18. You have to provide a reducing function (even if it is min, max or mean). But even that seemed improper, because we are not reducing the data set, just transforming it. So I looked harder at unstack.
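For comparison, here is a minimal sketch of the reducing-function route using the renamed `index` and `columns` arguments. The choice of `aggfunc='first'` is an assumption for illustration and not part of the original reply; it suits this data because each index/column combination holds at most one value, so nothing is actually aggregated.

```python
import pandas as pd

df1 = pd.DataFrame({
    'index': range(8),
    'variable1': ["A", "A", "B", "B", "A", "B", "B", "A"],
    'variable2': ["a", "b", "a", "b", "a", "b", "a", "b"],
    'variable3': ["x", "x", "x", "y", "y", "y", "x", "y"],
    'result': ["on", "off", "off", "on", "on", "off", "off", "on"],
})

# Assumption: 'first' serves as the reducing function. It accepts strings,
# unlike the default 'mean', and simply returns the single value per cell.
wide = df1.pivot_table(
    values='result',
    index='index',
    columns=['variable1', 'variable2', 'variable3'],
    aggfunc='first',
)
print(wide)
```

If the data could contain several rows for the same cell, `'first'` would silently keep only one of them, which is one more reason to prefer a plain reshape when no aggregation is intended.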

import pandas as pd

df1 = pd.DataFrame({'index' : range(8),
'variable1' : ["A","A","B","B","A","B","B","A"],
'variable2' : ["a","b","a","b","a","b","a","b"],
'variable3' : ["x","x","x","y","y","y","x","y"],
'result': ["on","off","off","on","on","off","off","on"]})

# these are the columns to end up in the multi-index columns.
unstack_cols = ['variable1', 'variable2', 'variable3']

First, set an index on the data using the index column plus the columns you want to unstack, then call unstack with the level argument.

df1.set_index(['index'] + unstack_cols).unstack(level=unstack_cols)

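The resulting dataframe has one row per index value, with the result strings spread across multi-index columns built from the three variables; cells with no matching combination are NaN.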
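The approach can be put together as a self-contained sketch. One detail here is an addition to the original snippet: the `result` column is selected before unstacking, so the reshaped columns carry only the three variable levels. Without that selection, the column MultiIndex has `result` as an extra top level.

```python
import pandas as pd

df1 = pd.DataFrame({
    'index': range(8),
    'variable1': ["A", "A", "B", "B", "A", "B", "B", "A"],
    'variable2': ["a", "b", "a", "b", "a", "b", "a", "b"],
    'variable3': ["x", "x", "x", "y", "y", "y", "x", "y"],
    'result': ["on", "off", "off", "on", "on", "off", "off", "on"],
})

# These are the columns that end up in the multi-index columns.
unstack_cols = ['variable1', 'variable2', 'variable3']

# Selecting 'result' (an addition, see the note above) yields a Series,
# which unstack then reshapes without any aggregation step.
wide = (
    df1.set_index(['index'] + unstack_cols)['result']
       .unstack(level=unstack_cols)
)
print(wide)
```

Because unstack only reshapes, it requires each combination of index and column values to be unique; duplicated combinations raise an error instead of being aggregated.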
