Python pandas dataframe pivot only works with pivot_table() but not with set_index() and unstack()
Question
I am trying to pivot the following kind of sample data in a pandas DataFrame in Python. I came across a couple of other Stack Overflow answers that discuss how to do the pivot, such as: pivot_table No numeric types to aggregate
When I use pivot_table(), I am able to pivot the data. But when I use set_index() and unstack(), I get the following error:
AttributeError: 'NoneType' object has no attribute 'unstack'
Sample data:
id responseTime label answers
ABC 2018-06-24 Category_1 [3]
ABC 2018-06-24 Category_2 [10]
ABC 2018-06-24 Category_3 [10]
DEF 2018-06-25 Category_1 [7]
DEF 2018-06-25 Category_8 [10]
GHI 2018-06-28 Category_3 [7]
Desired output:
id responseTime category_1 category_2 category_3 category_8
ABC 2018-06-24 [3] [10] [10] NULL
DEF 2018-06-25 [7] NULL NULL [10]
GHI 2018-06-28 NULL NULL [7] NULL
This works:
df=pdDF.pivot_table(index=['items_id','responseTime'], columns='label', values='answers', aggfunc='first')
This does not work:
pdDF.set_index(['items_id','responseTime','label'], append=True, inplace=True).unstack('label')
I also ran pdDF[pdDF.isnull().any(axis=1)] to make sure there is no NULL data in the answers column. I also tried append=False, but the same error occurred.
From other threads, it seems that set_index() and unstack() are more efficient than pivot_table(). I would also rather not use pivot_table(), because it requires an aggregation function and my answers column does not contain numeric data. I did not want to use the default (mean()), so I ended up using first().
Any insights on why one method works and the other doesn't?
Recommended answer
AttributeError: 'NoneType' object has no attribute 'unstack'
When you pass inplace=True to set_index(), it modifies the DataFrame in place and returns None instead of a new DataFrame. Chaining .unstack() onto that call therefore calls unstack on None, which raises the AttributeError above.
inplace : boolean, default False
Modify the DataFrame in place (do not create a new object)
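To make the cause of the error concrete, here is a minimal sketch. The small frame below is made up to resemble the sample data (it is an assumption, not the asker's real pdDF), and the list-valued answers column is a guess at how the data is stored.

```python
import pandas as pd

# Hypothetical stand-in for the asker's pdDF, modeled on the sample data.
pdDF = pd.DataFrame({
    "items_id": ["ABC", "ABC", "DEF"],
    "responseTime": ["2018-06-24", "2018-06-24", "2018-06-25"],
    "label": ["Category_1", "Category_2", "Category_1"],
    "answers": [[3], [10], [7]],
})

# With inplace=True the call mutates pdDF and hands back None,
# so there is nothing to chain .unstack() onto.
returned = pdDF.set_index(["items_id", "responseTime", "label"], inplace=True)
print(returned)

# The frame itself now carries the new three-level index.
index_names = list(pdDF.index.names)
print(index_names)
```

If you do want to keep inplace=True, the unstack has to be a separate statement on the modified frame, for example pdDF.unstack('label'), rather than chained onto the set_index() call.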
Use:
df1 = pdDF.set_index(['items_id','responseTime','label']).unstack('label')
print(df1)
# Output:
id responseTime category_1 category_2 category_3 category_8
ABC 2018-06-24 [3] [10] [10] NULL
DEF 2018-06-25 [7] NULL NULL [10]
GHI 2018-06-28 NULL NULL [7] NULL
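For comparison, the sketch below runs both approaches end to end on a frame rebuilt from the sample data. The frame construction and the list-valued answers column are assumptions for illustration. Selecting the answers column before unstacking is a choice made here (not part of the answer above) so that the result has plain label columns rather than a two-level column index.

```python
import pandas as pd

# Hypothetical reconstruction of the sample data from the question.
pdDF = pd.DataFrame({
    "items_id": ["ABC", "ABC", "ABC", "DEF", "DEF", "GHI"],
    "responseTime": ["2018-06-24", "2018-06-24", "2018-06-24",
                     "2018-06-25", "2018-06-25", "2018-06-28"],
    "label": ["Category_1", "Category_2", "Category_3",
              "Category_1", "Category_8", "Category_3"],
    "answers": [[3], [10], [10], [7], [10], [7]],
})

# set_index() without inplace returns a new frame, so chaining works.
wide = (
    pdDF.set_index(["items_id", "responseTime", "label"])["answers"]
        .unstack("label")
)

# The pivot_table() call from the question, for comparison.
wide_pt = pdDF.pivot_table(
    index=["items_id", "responseTime"],
    columns="label",
    values="answers",
    aggfunc="first",
)

print(wide)
print(wide_pt)
```

Note that unstack() needs each (items_id, responseTime, label) combination to be unique. pivot_table() takes an aggregation function precisely so that it can collapse repeated combinations.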