Pandas: how to use pd.cut()
Problem description
Here is the code snippet:
import pandas as pd
test = pd.DataFrame({'days': [0,31,45]})
test['range'] = pd.cut(test.days, [0,30,60])
Output:
   days     range
0     0       NaN
1    31  (30, 60]
2    45  (30, 60]
I am surprised that 0 is not in (0, 30]. What should I do to categorize 0 as (0, 30]?
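Recommended answer
By default pd.cut builds right-closed intervals, so the first bin (0, 30] excludes its left edge and 0 falls outside every bin. Pass include_lowest=True to include the left edge of the first interval:
test['range'] = pd.cut(test.days, [0,30,60], include_lowest=True)
print(test)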
   days           range
0     0  (-0.001, 30.0]
1    31    (30.0, 60.0]
2    45    (30.0, 60.0]
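To make the reason for the NaN concrete, here is a minimal sketch that checks interval membership directly with pd.Interval. The variable names (first_bin, default_cut, lowest_cut) are illustrative and do not come from the original answer.

```python
import pandas as pd

# pd.cut uses right-closed bins by default, so (0, 30] excludes the left edge 0.
first_bin = pd.Interval(0, 30, closed='right')
print(0 in first_bin)    # False
print(30 in first_bin)   # True

days = pd.Series([0, 31, 45])
default_cut = pd.cut(days, [0, 30, 60])
lowest_cut = pd.cut(days, [0, 30, 60], include_lowest=True)

# Only the default call leaves the value 0 unbinned.
print(default_cut.isna().tolist())  # [True, False, False]
print(lowest_cut.isna().tolist())   # [False, False, False]
```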
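Compare the differences:
test = pd.DataFrame({'days': [0,20,30,31,45,60]})
# include_lowest=True: the first interval also contains its left edge, so 0 is binned
test['range1'] = pd.cut(test.days, [0,30,60], include_lowest=True)
# right=False: intervals are left-closed, so 30 is in the [30, 60) group and 60 is NaN
test['range2'] = pd.cut(test.days, [0,30,60], right=False)
# default (right=True): 30 is in the (0, 30] group and 0 is NaN
test['range3'] = pd.cut(test.days, [0,30,60])
print(test)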
   days          range1    range2    range3
0     0  (-0.001, 30.0]   [0, 30)       NaN
1    20  (-0.001, 30.0]   [0, 30)   (0, 30]
2    30  (-0.001, 30.0]  [30, 60)   (0, 30]
3    31    (30.0, 60.0]  [30, 60)  (30, 60]
4    45    (30.0, 60.0]  [30, 60)  (30, 60]
5    60    (30.0, 60.0]       NaN  (30, 60]
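If the (-0.001, 30.0] display is unwanted, one option is to pass the labels parameter of pd.cut together with include_lowest=True. The sketch below assumes hypothetical label strings ('0-30' and '31-60'); they are not part of the original answer.

```python
import pandas as pd

days = pd.Series([0, 20, 30, 31, 45, 60])

# Hypothetical labels, one per bin; include_lowest=True keeps 0 in the first bin.
labeled = pd.cut(days, [0, 30, 60], include_lowest=True,
                 labels=['0-30', '31-60'])

print(labeled.astype(str).tolist())
# ['0-30', '0-30', '0-30', '31-60', '31-60', '31-60']
```

The result is still a categorical column, so it can be grouped or sorted by bin order like the interval version.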
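Or use numpy.searchsorted, which returns integer bin positions instead of intervals. The array of bin edges must be sorted in ascending order, but the values of days do not need to be:
import numpy as np
test = pd.DataFrame({'days': [0,20,30,31,45,60]})
arr = np.array([0,30,60])
# side='left' (default): index i such that arr[i-1] < x <= arr[i]
test['range1'] = arr.searchsorted(test.days)
# side='right' minus 1: index i such that arr[i] <= x < arr[i+1]
test['range2'] = arr.searchsorted(test.days, side='right') - 1
print(test)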
   days  range1  range2
0     0       0       0
1    20       1       0
2    30       1       1
3    31       2       1
4    45       2       1
5    60       2       2
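As a quick check that only the bin edges need to be sorted, the sketch below runs the same two searchsorted calls on deliberately unsorted input. The names bins, days, left_idx and right_idx are illustrative, not from the original answer.

```python
import numpy as np
import pandas as pd

bins = np.array([0, 30, 60])                 # bin edges: must be ascending
days = pd.Series([45, 0, 60, 20, 31, 30])    # values to bin: deliberately unsorted

values = days.to_numpy()
left_idx = bins.searchsorted(values)                     # side='left'
right_idx = bins.searchsorted(values, side='right') - 1  # side='right', shifted

print(left_idx.tolist())   # [2, 0, 2, 1, 2, 1]
print(right_idx.tolist())  # [1, 0, 2, 0, 1, 1]
```

Each value gets the same index it would get in sorted order, so the result can be assigned straight back to the DataFrame column.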