Pandas: reshaping data

Question

I have a pandas Series which presently looks like this:

14    [Yellow, Pizza, Restaurants]
...
160920                  [Automotive, Auto Parts & Supplies]
160921       [Lighting Fixtures & Equipment, Home Services]
160922                 [Food, Pizza, Candy Stores]
160923           [Hair Removal, Nail Salons, Beauty & Spas]
160924           [Hair Removal, Nail Salons, Beauty & Spas]

And I want to radically reshape it into a dataframe that looks something like this...

      Yellow  Automotive  Pizza
14       1         0        1
…           
160920   0         1        0
160921   0         0        0
160922   0         0        1
160923   0         0        0
160924   0         0        0

i.e. a logical construction noting which categories each observation (row) falls into.

I could write for-loop-based code to tackle the problem, but given the large number of rows I need to handle, that would be very slow.

Does anyone know a vectorised solution to this kind of problem? I'd be very grateful.

There are 509 categories, and I have a list of them.

Recommended answer

In [8]: from pandas import Series

In [9]: s = Series([list('ABC'),list('DEF'),list('ABEF')])

In [10]: s
Out[10]: 
0       [A, B, C]
1       [D, E, F]
2    [A, B, E, F]
dtype: object

In [11]: s.apply(lambda x: Series(1,index=x)).fillna(0)
Out[11]: 
   A  B  C  D  E  F
0  1  1  1  0  0  0
1  0  0  0  1  1  1
2  1  1  0  0  1  1
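
The apply-based answer above builds one Series per row, which can be slow when there are many rows. As a possible alternative, here is a minimal sketch that assumes pandas 0.25 or later (for `Series.explode`). The helper name `to_indicator` and the `categories` list are hypothetical, introduced only for illustration; the extra category `G` stands in for a category from the known list that never occurs in the data.

```python
import pandas as pd

# Same toy data as in the answer above.
s = pd.Series([list("ABC"), list("DEF"), list("ABEF")])

# Hypothetical list of known categories; "G" never occurs in the data.
categories = ["A", "B", "C", "D", "E", "F", "G"]


def to_indicator(series, categories=None):
    """Turn a Series of lists into a 0/1 indicator DataFrame (illustrative helper)."""
    # One row per (original index, category) pair; the index label is repeated.
    exploded = series.explode()
    # One 0/1 column per distinct category found in the data.
    dummies = pd.get_dummies(exploded, dtype=int)
    # Collapse the repeated index labels back to one row per observation.
    out = dummies.groupby(level=0, sort=False).max()
    if categories is not None:
        # Align columns with the known category list; unseen categories become 0.
        out = out.reindex(columns=categories, fill_value=0)
    return out


indicator = to_indicator(s, categories)
print(indicator)
```

Passing the known category list keeps the column set and order fixed regardless of which categories happen to appear in the data, which may be convenient when there are 509 of them.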
