Python: bin data and return the bin midpoint (maybe using pandas.cut and qcut)


Question

Can I make the pandas cut/qcut functions return bin endpoints or bin midpoints instead of bin label strings?

Currently I get

import numpy as np
import pandas as pd

pd.cut(pd.Series(np.arange(11)), bins = 5)

0     (-0.01, 2]
1     (-0.01, 2]
2     (-0.01, 2]
3         (2, 4]
4         (2, 4]
5         (4, 6]
6         (4, 6]
7         (6, 8]
8         (6, 8]
9        (8, 10]
10       (8, 10]
dtype: category

The values are categories / strings. What I want is

0     1.0
1     1.0
2     1.0
3     3.0
4     3.0

with numerical values representing the edge or midpoint of each bin.

Recommended answer

I see that this is an old post, but I will take the liberty of answering it anyway.

It is now possible (ref. @chrisb's answer) to access the endpoints of categorical intervals using left and right.

s = pd.cut(pd.Series(np.arange(11)), bins = 5)

mid = [(a.left + a.right)/2 for a in s]
Out[34]: [0.995, 0.995, 0.995, 3.0, 3.0, 5.0, 5.0, 7.0, 7.0, 9.0, 9.0]
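
As a side note, each element here is a pandas.Interval, which also exposes a mid attribute, and the categories of the result form an IntervalIndex with the same attribute. The sketch below is one possible way to get the same midpoints without writing the arithmetic by hand; the variable names are illustrative only, and the first bin still has its left edge at -0.01.

```python
import numpy as np
import pandas as pd

s = pd.cut(pd.Series(np.arange(11)), bins=5)

# Per-element variant: Interval.mid is equivalent to (left + right) / 2.
mid_list = [iv.mid for iv in s]

# Per-bin variant: compute one midpoint per category, swap the interval
# categories for those midpoints, then convert the result to plain floats.
bin_mids = s.cat.categories.mid
mid = s.cat.rename_categories(bin_mids).astype(float)

print(mid.tolist())
```

The per-bin variant computes each midpoint once per bin rather than once per row, which may matter for long Series.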

Since the intervals are open on the left and closed on the right, the first interval (the one that should start at 0) actually starts at -0.01: pandas extends the range slightly so that the minimum value 0 falls inside a bin. To get a midpoint that uses 0 as the left edge, you can do this:

mid_alt = [(a.left + a.right)/2 if a.left != -0.01 else a.right/2 for a in s]
Out[35]: [1.0, 1.0, 1.0, 3.0, 3.0, 5.0, 5.0, 7.0, 7.0, 9.0, 9.0]
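
The comparison against -0.01 above is tied to this particular data range. A more general alternative, sketched here under the assumption that you are willing to build the bin edges yourself, is to pass explicit edges to pd.cut, ask for integer bin indices with labels=False, and look the midpoints up from the unadjusted edges. The names x, edges, bin_mids and mid_exact are illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

x = pd.Series(np.arange(11))

# Explicit, unadjusted edges for 5 equal-width bins: [0, 2, 4, 6, 8, 10].
edges = np.linspace(x.min(), x.max(), 6)
bin_mids = (edges[:-1] + edges[1:]) / 2  # one midpoint per bin

# labels=False returns the integer index of each value's bin;
# include_lowest=True keeps the minimum value (0) in the first bin.
codes = pd.cut(x, bins=edges, labels=False, include_lowest=True)

# Look up the midpoint of each row's bin (assumes x has no missing values).
mid_exact = pd.Series(bin_mids[codes.to_numpy()], index=x.index)
print(mid_exact.tolist())
# [1.0, 1.0, 1.0, 3.0, 3.0, 5.0, 5.0, 7.0, 7.0, 9.0, 9.0]
```

Because the midpoints come from the edges you supplied rather than from the interval labels, the small adjustment pandas makes to the first label does not affect them.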

Alternatively, you can make the intervals closed on the left and open on the right:

t = pd.cut(pd.Series(np.arange(11)), bins = 5, right=False)
Out[38]: 
0       [0.0, 2.0)
1       [0.0, 2.0)
2       [2.0, 4.0)
3       [2.0, 4.0)
4       [4.0, 6.0)
5       [4.0, 6.0)
6       [6.0, 8.0)
7       [6.0, 8.0)
8     [8.0, 10.01)
9     [8.0, 10.01)
10    [8.0, 10.01)

But, as you can see, the same problem then appears at the last interval, which ends at 10.01 instead of 10.
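
The question also mentions qcut, which the answer above does not cover. As a hedged sketch only: qcut accepts retbins=True to return the quantile edges it used, so midpoints can be derived from those edges and looked up through the integer bin indices from labels=False. The choice of q=4 and the variable names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

x = pd.Series(np.arange(11))

# retbins=True returns the quantile edges alongside the binned result;
# labels=False returns integer bin indices instead of interval labels.
codes, q_edges = pd.qcut(x, q=4, labels=False, retbins=True)
q_edges = np.asarray(q_edges, dtype=float)

# Midpoint of each quantile bin, then one lookup per row
# (assumes x has no missing values).
q_bin_mids = (q_edges[:-1] + q_edges[1:]) / 2
q_mid = pd.Series(q_bin_mids[codes.to_numpy()], index=x.index)
print(q_mid.tolist())
```

Note that quantile bins generally have unequal widths, so these midpoints are midpoints of the bin edges, not means or medians of the values in each bin.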
