Python: Binning based on 2 columns in Pandas
Question
Looking for a quick and elegant way to bin records based on 2 columns in Pandas.
Here is my DataFrame:
   filename                             height  width
0  shopfronts_23092017_3_285.jpg        750.0   560.0
1  shopfronts_200.jpg                   4395.0  6020.0
2  shopfronts_25092017_eateries_98.jpg  414.0   621.0
3  shopfronts_101.jpg                   480.0   640.0
4  shopfronts_138.jpg                   3733.0  8498.0
5  shopfronts_25092017_eateries_95.jpg  187.0   250.0
6  shopfronts_25092017_neon_33.jpg      100.0   200.0
7  shopfronts_322.jpg                   682.0   1024.0
8  shopfronts_171.jpg                   800.0   600.0
9  shopfronts_23092017_3_35.jpg         120.0   210.0
I need to bin the records based on 2 columns, height and width (the image resolutions).
I am looking for something like this:
   filename                             height  width   group
0  shopfronts_23092017_3_285.jpg        750.0   560.0   g3
1  shopfronts_200.jpg                   4395.0  6020.0  g4
2  shopfronts_25092017_eateries_98.jpg  414.0   621.0   others
3  shopfronts_101.jpg                   480.0   640.0   others
4  shopfronts_138.jpg                   3733.0  8498.0  g4
5  shopfronts_25092017_eateries_95.jpg  187.0   250.0   g1
6  shopfronts_25092017_neon_33.jpg      100.0   200.0   g1
7  shopfronts_322.jpg                   682.0   1024.0  others
8  shopfronts_171.jpg                   800.0   600.0   g3
9  shopfronts_23092017_3_35.jpg         120.0   210.0   g1
where
g1: <= 400x300
g2: (400x300, 640x480]
g3: (640x480, 800x600]
g4: > 800x600
others: records that do not comply with the requirements above (e.g. records 7, 2 and 3, where either height or width falls in one of the defined categories, but not both)
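One way to read the rule above is to bin each dimension on its own and keep the label only when both dimensions land in the same bin. The following is a minimal sketch of that reading, not the asker's code: it assumes each bound is written as height x width (as the column order suggests), it leaves out the filename column for brevity, and the names `labels`, `h_lab` and `w_lab` are illustrative.

```python
import numpy as np
import pandas as pd

# Heights and widths copied from the sample data; filename is omitted.
df = pd.DataFrame({
    "height": [750.0, 4395.0, 414.0, 480.0, 3733.0, 187.0, 100.0, 682.0, 800.0, 120.0],
    "width": [560.0, 6020.0, 621.0, 640.0, 8498.0, 250.0, 200.0, 1024.0, 600.0, 210.0],
})

labels = ["g1", "g2", "g3", "g4"]

# pd.cut uses right-closed intervals by default, e.g. (400, 640] for heights.
h_lab = pd.cut(df["height"], bins=[-np.inf, 400, 640, 800, np.inf], labels=labels).astype(str)
w_lab = pd.cut(df["width"], bins=[-np.inf, 300, 480, 600, np.inf], labels=labels).astype(str)

# Keep the label only where both dimensions agree; otherwise mark as "others".
df["group"] = h_lab.where(h_lab == w_lab, "others")
print(df)
```

Under this reading a record is labelled g1 to g4 only when height and width fall in the same bin, which reproduces the group column shown in the desired output.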
I want to get the frequency count using the group column. If this is not the best way to go about it, or if there is a better way, kindly let me know.
Recommended answer
Using nested np.where calls (with NumPy imported as np), where the first matching condition wins:
In [4510]: df['group'] = np.where((df.height <= 400) & (df.width <= 300),
      ...:                        'g1',
      ...:                        np.where((df.height <= 640) & (df.width <= 480),
      ...:                                 'g2',
      ...:                                 np.where((df.height <= 800) & (df.width <= 600),
      ...:                                          'g3',
      ...:                                          np.where((df.height > 800) & (df.width > 600),
      ...:                                                   'g4',
      ...:                                                   'others'))))
In [4511]: df
Out[4511]:
   filename                             height  width   group
0  shopfronts_23092017_3_285.jpg        750.0   560.0   g3
1  shopfronts_200.jpg                   4395.0  6020.0  g4
2  shopfronts_25092017_eateries_98.jpg  414.0   621.0   others
3  shopfronts_101.jpg                   480.0   640.0   others
4  shopfronts_138.jpg                   3733.0  8498.0  g4
5  shopfronts_25092017_eateries_95.jpg  187.0   250.0   g1
6  shopfronts_25092017_neon_33.jpg      100.0   200.0   g1
7  shopfronts_322.jpg                   682.0   1024.0  others
8  shopfronts_171.jpg                   800.0   600.0   g3
9  shopfronts_23092017_3_35.jpg         120.0   210.0   g1
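The question also asks for a frequency count per group, which the session above stops short of. A minimal sketch follows, assuming the same conditions as the answer. It swaps the nested np.where calls for np.select, which also takes the first matching condition, and the names `conditions`, `labels` and `counts` are illustrative. The filename column is left out for brevity.

```python
import numpy as np
import pandas as pd

# Heights and widths copied from the sample data; filename is omitted.
df = pd.DataFrame({
    "height": [750.0, 4395.0, 414.0, 480.0, 3733.0, 187.0, 100.0, 682.0, 800.0, 120.0],
    "width": [560.0, 6020.0, 621.0, 640.0, 8498.0, 250.0, 200.0, 1024.0, 600.0, 210.0],
})

# Same conditions as the nested np.where calls, checked in order.
conditions = [
    (df.height <= 400) & (df.width <= 300),
    (df.height <= 640) & (df.width <= 480),
    (df.height <= 800) & (df.width <= 600),
    (df.height > 800) & (df.width > 600),
]
labels = ["g1", "g2", "g3", "g4"]

# Rows that match none of the conditions fall back to "others".
df["group"] = np.select(conditions, labels, default="others")

# Frequency count per group; reindex keeps empty groups visible with a count of 0.
counts = df["group"].value_counts().reindex(labels + ["others"], fill_value=0)
print(counts)
```

Note that, like the nested np.where version, these conditions are cumulative: a hypothetical record with height 100 and width 400 would be labelled g2 even though its height alone belongs to g1. If such mixed records should count as others, each dimension has to be binned separately and the two labels compared.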