Having trouble with multiple "groupby" keys: a variable and a category (binned data)


Question

df.dtypes

Close       float64
eqId          int64
date         object
IntDate       int64
expiry        int64
delta         int64
ivMid       float64
conf        float64
Skew        float64
psc         float64
vol_B      category
dtype: object

gb = df.groupby([df['vol_B'],df['expiry']])

gb.describe()

I get a long error message, the final line of which is:

AttributeError: 'Categorical' object has no attribute 'flags'

When I group by either column on its own, it works fine. What fails is grouping by several keys at once when one of them is the binned (categorical) column.

Grouping by two other, non-categorical variables also works. For example, this runs successfully:

gb = df.groupby([df['delta'],df['expiry']])

Recommended answer

I was facing a similar issue to the OP and found this question while looking for solutions. After going through the pandas documentation for categorical variables, a simple hack that worked for me was to change the type of the categorical variable before grouping.

Since vol_B is the categorical variable in your case, you should try the following:

# Depending on the contents of vol_B, astype(int) or astype(float) may work as well.
gb = df.groupby([df['vol_B'].astype(str), df['expiry']])
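
For anyone who wants to try the workaround end to end, here is a minimal sketch. The data is made up, and so is the assumption that vol_B was produced by binning ivMid with pd.cut; the bin edges are also hypothetical. Only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the asker's frame;
# only the column names are taken from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "ivMid": rng.uniform(0.1, 0.6, size=12),
    "expiry": [30, 60, 90] * 4,
})

# Assumption: vol_B is a binned version of ivMid (bin edges are invented).
# pd.cut returns a column with dtype "category".
df["vol_B"] = pd.cut(df["ivMid"], bins=[0.0, 0.2, 0.4, 0.6])

# The workaround: cast the categorical key to plain strings,
# then group by both keys at once.
gb = df.groupby([df["vol_B"].astype(str), df["expiry"]])

# Summary statistics for one column, per (vol_B, expiry) group.
summary = gb["ivMid"].describe()
print(summary)
```

Note that after the cast the bin labels are ordinary strings, so they sort alphabetically rather than in bin order, and only the bin/expiry combinations that actually occur in the data show up in the result.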

I haven't looked into why the cast works when grouping on the categorical column directly does not. If I do, I will update this answer.
