How to change bin size when grouping data by ranges?


Question

My question is related to the solution to another question. How can I change the bin size from 3 to 5, 10, or any other value? Changing step alone is not enough: I would also have to change the label expression str(int(cat[1:3])) + "-" + str(int(cat[5:7])-1), and that is the part I cannot get right. I get the error ValueError: invalid literal for int() with base 10: '18, '.

step=3
kwargs = dict(include_lowest=True, right=False)
bins = pd.cut(df.AVG_PERCENT_EVAL_1, bins=np.arange(18,40+step,step), **kwargs)
labels = [(str(int(cat[1:3])) + "-" + str(int(cat[5:7])-1)) for cat in bins.cat.categories]
bins.cat.categories = labels

df = df.assign(AVG_PERCENT_RANGE=bins).drop("AVG_PERCENT_EVAL_1", axis=1)
df.groupby(['GROUP', 'AVG_PERCENT_RANGE'], as_index=False).agg('mean')

Recommended answer

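Is this what you want? The labels are built directly from the bin edges and passed to pd.cut, so no string slicing of the category names is needed.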

In [166]: %paste
step=5
kwargs = dict(include_lowest=True, right=False)
bins=np.arange(18,40+step,step)
labels = ['{}-{}'.format(i, i+step-1) for i in bins][:-1]

df['AVG_PERCENT_RANGE'] = pd.cut(df.pop('AVG_PERCENT_EVAL_1'),
                                 bins=bins, labels=labels, **kwargs)
df.groupby(['GROUP', 'AVG_PERCENT_RANGE'], as_index=False).agg('mean')
## -- End pasted text --
Out[166]:
   GROUP AVG_PERCENT_RANGE  AVG_PERCENT_NEGATIVE  AVG_TOTAL_WAIT_TIME  AVG_TOTAL_SERVICE_TIME
0  AAAAA             18-22              6.500000            85.682099              247.880659
1  AAAAA             23-27              0.833333           103.445112              314.336474
2  AAAAA             28-32                   NaN                  NaN                     NaN
3  AAAAA             33-37                   NaN                  NaN                     NaN
4  AAAAA             38-42                   NaN                  NaN                     NaN
5  BBBBB             18-22              0.777778            63.500619              242.510146
6  BBBBB             23-27              2.000000           103.796290              313.685358
7  BBBBB             28-32                   NaN                  NaN                     NaN
8  BBBBB             33-37                   NaN                  NaN                     NaN
9  BBBBB             38-42                   NaN                  NaN                     NaN
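
To make the step easier to vary, the same idea can be wrapped in a small function. The following is only a sketch: cut_with_range_labels is a hypothetical helper name (not part of pandas), the sample values are made up rather than taken from the question's data, and it assumes integer start, stop and step.

```python
import numpy as np
import pandas as pd


def cut_with_range_labels(values, start, stop, step):
    """Bin `values` into half-open intervals [lo, lo + step) labelled 'lo-hi'.

    Hypothetical helper, not part of pandas. Assumes integer start/stop/step.
    """
    # Edges run from `start` until they cover `stop`, e.g. 18, 23, ..., 43 for step=5.
    edges = np.arange(start, stop + step, step)
    # One label per interval, so the last edge (which only closes the final bin) is skipped.
    labels = ["{}-{}".format(lo, lo + step - 1) for lo in edges[:-1]]
    # right=False makes each bin include its left edge and exclude its right edge.
    return pd.cut(values, bins=edges, labels=labels, right=False)


# Made-up sample values, not the data from the question.
scores = pd.Series([18.0, 22.9, 23.0, 31.5, 39.0])

for step in (3, 5, 10):
    binned = cut_with_range_labels(scores, start=18, stop=40, step=step)
    print(step, list(binned.astype(str)))
```

Because the labels are derived from the numeric edges, changing step never requires touching the label code. Note that a label such as 18-22 is cosmetic: the bin itself is [18, 23), so a value like 22.9 still lands in it. The NaN rows in the output above appear because grouping on a categorical column can list every category, including empty ones; if that is unwanted, passing observed=True to groupby should restrict the result to combinations that actually occur.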
