Transform a Series into a DataFrame (pandas/Python) where the columns are the levels of the Series


Question

I'm working with pandas and used groupby:

group = df_crimes_query.groupby(["CrimeDateTime", "WeaponFactor"]).size()
group.head(20)


CrimeDateTime  WeaponFactor
2016-01-01     FIREARM          11
               HANDS            26
               KNIFE             3
               OTHER            11
               UNDEFINED       102
2016-01-02     FIREARM          10
               HANDS            21
               KNIFE             8
               OTHER             6
               UNDEFINED        68
2016-01-03     FIREARM          12
               HANDS            13
               KNIFE             6
               OTHER             5
               UNDEFINED        73
2016-01-04     FIREARM          11
               HANDS            10
               KNIFE             1
               OTHER             3
               UNDEFINED        84
dtype: int64

Its type is Series:

type(group)

pandas.core.series.Series

I would like a DataFrame like this:

CrimeDateTime   FIREARM     HANDS   KNIFE   OTHER   UNDEFINED
2016-01-01      11          26      3       11      102
2016-01-02      10          21      8       6       68
2016-01-03      12          13      6       5       73
2016-01-04      11          10      1       3       84

I would like to use this DataFrame to plot five time series, one for each type (FIREARM, HANDS, etc.). I have tried and searched the web, but without success.

The code is on my GitHub (in the section called Testing): https://github.com/rmmariano/CAP386_intro_data_science/blob/master/projeto/crimes_baltimore/crimes_baltimore.ipynb

I had other test code, but I removed it for clarity.

Does anyone have an idea?

Recommended answer

Option 1
Simple and slow

# df is the original DataFrame (df_crimes_query in the question)
pd.crosstab(df.CrimeDateTime, df.WeaponFactor)

WeaponFactor   FIREARM  HANDS  KNIFE  OTHER  UNDEFINED
CrimeDateTime                                         
2016-01-01          11     26      3     11        102
2016-01-02          10     21      8      6         68
2016-01-03          12     13      6      5         73
2016-01-04          11     10      1      3         84
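To try Option 1 without the original data, here is a minimal sketch on a few made-up rows (the sample values are hypothetical stand-ins for `df_crimes_query`). It also shows that the grouped Series from the question can be reshaped with `Series.unstack`, which is not part of the answer above but should give the same table.

```python
import pandas as pd

# Hypothetical sample rows standing in for df_crimes_query
df = pd.DataFrame({
    "CrimeDateTime": ["2016-01-01", "2016-01-01", "2016-01-01",
                      "2016-01-02", "2016-01-02"],
    "WeaponFactor": ["FIREARM", "HANDS", "HANDS", "KNIFE", "FIREARM"],
})

# Option 1: count each (date, weapon) pair directly from the two columns
table = pd.crosstab(df["CrimeDateTime"], df["WeaponFactor"])

# Alternative: start from the grouped Series used in the question and move
# the inner index level (WeaponFactor) into columns, filling gaps with 0
group = df.groupby(["CrimeDateTime", "WeaponFactor"]).size()
wide = group.unstack(fill_value=0)

print(table)
print(wide)
```

Either result has one row per date and one column per weapon type, so calling `.plot()` on it should draw one line per column.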


Option 2
Faster and Cool!

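# dtype=int keeps the dummies numeric so the dot product yields counts
# (pandas 2.0+ returns boolean dummies by default)
pd.get_dummies(df.CrimeDateTime, dtype=int).T.dot(pd.get_dummies(df.WeaponFactor, dtype=int))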

            FIREARM  HANDS  KNIFE  OTHER  UNDEFINED
2016-01-01       11     26      3     11        102
2016-01-02       10     21      8      6         68
2016-01-03       12     13      6      5         73
2016-01-04       11     10      1      3         84
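Why the dot product counts pairs may be easier to see on a small example. The sketch below uses made-up rows (hypothetical, not the Baltimore data) and asks for integer dummies explicitly, since recent pandas versions return booleans by default and a boolean dot product would not give counts.

```python
import pandas as pd

# Hypothetical sample rows standing in for df_crimes_query
df = pd.DataFrame({
    "CrimeDateTime": ["2016-01-01", "2016-01-01", "2016-01-01",
                      "2016-01-02", "2016-01-02"],
    "WeaponFactor": ["FIREARM", "HANDS", "HANDS", "KNIFE", "FIREARM"],
})

# One 0/1 indicator column per date and per weapon (one row per record)
rows = pd.get_dummies(df["CrimeDateTime"], dtype=int)
cols = pd.get_dummies(df["WeaponFactor"], dtype=int)

# (dates x records) . (records x weapons): each cell sums the records
# where both the date indicator and the weapon indicator are 1
counts = rows.T.dot(cols)

print(counts)
```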


Option 3
Next Level Kung Fu Panda!

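import numpy as np
import pandas as pd

# Integer codes and unique labels (in order of first appearance)
i, r = pd.factorize(df.CrimeDateTime.values)
j, c = pd.factorize(df.WeaponFactor.values)
n, m = r.size, c.size
# Map each (row, column) pair to a flat cell index and count occurrences
b = np.bincount(j + i * m, minlength=n * m).reshape(n, m)

pd.DataFrame(b, r, c)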

            FIREARM  HANDS  KNIFE  OTHER  UNDEFINED
2016-01-01       11     26      3     11        102
2016-01-02       10     21      8      6         68
2016-01-03       12     13      6      5         73
2016-01-04       11     10      1      3         84
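A self-contained sketch of Option 3 on made-up, deliberately unsorted rows (hypothetical data). One assumption differs from the answer: `sort=True` is passed to `pd.factorize` so that row and column labels come out sorted; by default they follow the order of first appearance in the data.

```python
import numpy as np
import pandas as pd

# Hypothetical, deliberately unsorted sample rows
df = pd.DataFrame({
    "CrimeDateTime": ["2016-01-02", "2016-01-01", "2016-01-01",
                      "2016-01-01", "2016-01-02"],
    "WeaponFactor": ["KNIFE", "FIREARM", "HANDS", "HANDS", "FIREARM"],
})

# Integer codes plus sorted unique labels for rows (dates) and columns (weapons)
i, r = pd.factorize(df["CrimeDateTime"].values, sort=True)
j, c = pd.factorize(df["WeaponFactor"].values, sort=True)
n, m = r.size, c.size

# Each (i, j) pair maps to a unique flat cell index i * m + j;
# bincount tallies how often each cell occurs, then reshape restores the grid
flat = i * m + j
b = np.bincount(flat, minlength=n * m).reshape(n, m)

result = pd.DataFrame(b, index=r, columns=c)
print(result)
```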
