Group by two columns based on a created column


Problem description

I have a dataset like this:

import pandas as pd

df = pd.DataFrame({'time': ['13:30', '9:20', '18:12', '19:00', '11:20', '13:30', '15:20', '17:12', '16:00', '8:20'],
                   'item': ['coffee', 'bread', 'pizza', 'rice', 'soup', 'coffee', 'bread', 'pizza', 'rice', 'soup']})

I want to split the time into three meal periods (breakfast, lunch, dinner) and add the result to the data as a new column.

This is what I did:

# The column is named 'time' (lowercase) in the DataFrame above.
df['hour'] = df['time'].apply(lambda x: int(x.split(':')[0]))

def time_period(hour):
    if 6 <= hour < 11:
        return 'breakfast'
    elif 11 <= hour < 15:
        return 'lunch'
    else:
        return 'dinner'

df['meal'] = df['hour'].apply(time_period)

Now I want to group the data by these three meals and have an output like this:

Recommended answer

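import numpy as np
import pandas as pd

# Turn 'H:MM' strings into floats such as 13.30 so they can be binned.
df['time'] = df['time'].replace(r'[:]', '.', regex=True).astype(float)
# Bins are right-closed by default: (6, 11], (11, 15], (15, 24].
df['meal'] = pd.cut(df['time'], bins=[6, 11, 15, 24], labels=['breakfast', 'lunch', 'dinner'])
# Count rows for each (meal, item) pair.
a = df.groupby(['meal', 'item']).size()
# Build one item/count table per meal, in alphabetical meal order.
l = []
for i in np.sort(a.index.get_level_values(level=0).unique().tolist()):
    l.append(a.loc[i].reset_index().rename(columns={0: 'count'}))
# Place the per-meal tables side by side.
b = pd.concat(l, axis=1)
# Label each pair of columns with its meal name as a top-level header.
c = np.sort(a.index.get_level_values(level=0).unique().tolist() * 2)
b.columns = [c, b.columns]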
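
For comparison, here is a shorter sketch of the same idea. It is not the answer's method: it assumes that a plain item-by-meal count table is an acceptable output, and it swaps the loop and concat for pd.crosstab. It also passes right=False to pd.cut so the bins are half-open like the ranges in the question. One difference remains: hours before 6 become NaN here, while the question's function labels them 'dinner'.

```python
import pandas as pd

df = pd.DataFrame({
    'time': ['13:30', '9:20', '18:12', '19:00', '11:20',
             '13:30', '15:20', '17:12', '16:00', '8:20'],
    'item': ['coffee', 'bread', 'pizza', 'rice', 'soup',
             'coffee', 'bread', 'pizza', 'rice', 'soup'],
})

# Extract the hour as an integer from the 'H:MM' strings.
hour = df['time'].str.split(':').str[0].astype(int)

# right=False gives the bins [6, 11), [11, 15), [15, 24).
df['meal'] = pd.cut(hour, bins=[6, 11, 15, 24], right=False,
                    labels=['breakfast', 'lunch', 'dinner'])

# Count each (item, meal) pair; rows are items, columns are meals.
counts = pd.crosstab(df['item'], df['meal'])
print(counts)
```

If the side-by-side layout with a meal header over each item/count pair is what is needed, the loop-based approach above is still the way to get it; the crosstab only gives the counts in a single grid.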
