Pandas - Create columns from column value, and fill with count
Problem description
I have a dataframe similar to below.
Index Time Weekday
0 21:10:00 Tuesday
1 21:15:00 Tuesday
2 21:20:00 Tuesday
3 21:20:00 Tuesday
4 21:25:00 Wednesday
5 21:25:00 Wednesday
6 21:30:00 Friday
7 21:35:00 Thursday
8 21:35:00 Wednesday
9 21:40:00 Wednesday
10 21:40:00 Wednesday
11 21:40:00 Monday
I want to turn the weekdays into columns and count how many times each time appears on each day. My goal is this:
Time Monday Tuesday Wednesday Thursday Friday
21:10:00 0 1 0 0 0
21:15:00 0 1 0 0 0
21:20:00 0 2 0 0 0
21:25:00 0 0 2 0 0
21:30:00 0 0 0 0 1
21:35:00 0 0 1 1 0
21:40:00 1 0 2 0 0
The reason is that I want to create a heatmap in seaborn, and I read that my data has to be pivoted/shaped a certain way: https://stackoverflow.com/a/37790707/9384889
I know how to count how frequently each Time value appears, ignoring the weekday:
df['Time'].value_counts()
And I have been reading http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pivot.html
But I cannot see how to combine these two ideas.
Use groupby with size and unstack to reshape, or crosstab as an alternative.
To change the order of the days, you need either an ordered Categorical or reindex on the columns:
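import pandas as pd
cats = ['Monday','Tuesday','Wednesday','Thursday','Friday']
# Option 1: make Weekday an ordered Categorical, then count and unstack
df['Weekday'] = pd.Categorical(df['Weekday'], categories=cats, ordered=True)
df = df.groupby(['Time', 'Weekday']).size().unstack(fill_value=0)
# Option 2 (use instead of option 1, on the original df): count, unstack, then reorder the columns
df = df.groupby(['Time', 'Weekday']).size().unstack(fill_value=0).reindex(columns=cats)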
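Alternatives with crosstab, applied to the original DataFrame (run only one):
# Option 1: pass an ordered Categorical so the columns come out in weekday order
df = pd.crosstab(df['Time'], pd.Categorical(df['Weekday'], categories=cats, ordered=True))
# Option 2: build the table from the plain column, then reorder the columns
df = pd.crosstab(df['Time'], df['Weekday']).reindex(columns=cats)
print(df)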
col_0 Monday Tuesday Wednesday Thursday Friday
Time
21:10:00 0 1 0 0 0
21:15:00 0 1 0 0 0
21:20:00 0 2 0 0 0
21:25:00 0 0 2 0 0
21:30:00 0 0 0 0 1
21:35:00 0 0 1 1 0
21:40:00 1 0 2 0 0
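For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and runs the groupby/size/unstack route and the crosstab route side by side. The variable names by_groupby and by_crosstab are only illustrative, and passing fill_value=0 to reindex is an addition that is not in the answer above: it is assumed to be wanted so that a weekday with no rows shows up as a column of zeros rather than NaN.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Time": ["21:10:00", "21:15:00", "21:20:00", "21:20:00", "21:25:00", "21:25:00",
             "21:30:00", "21:35:00", "21:35:00", "21:40:00", "21:40:00", "21:40:00"],
    "Weekday": ["Tuesday", "Tuesday", "Tuesday", "Tuesday", "Wednesday", "Wednesday",
                "Friday", "Thursday", "Wednesday", "Wednesday", "Wednesday", "Monday"],
})
cats = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]

# Route 1: count each (Time, Weekday) pair, then move Weekday into the columns
by_groupby = (
    df.groupby(["Time", "Weekday"]).size()
      .unstack(fill_value=0)
      .reindex(columns=cats, fill_value=0)  # fill_value=0 is an assumption, see above
)

# Route 2: crosstab counts and reshapes in a single call
by_crosstab = pd.crosstab(df["Time"], df["Weekday"]).reindex(columns=cats, fill_value=0)

print(by_crosstab)
```

Both routes should give the same table of counts, so the choice between them is mostly a matter of taste; crosstab is shorter, while the groupby chain makes each step explicit.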
Finally, use seaborn.heatmap:
import seaborn as sns
sns.heatmap(df, annot=True, fmt="g", cmap='viridis')