Pandas: Transpose a list in column into rows

Problem description

I have a pandas DataFrame, df_dates, as shown below.

   PERSON_ID   MIN_DATE   MAX_DATE
0  000099-48 2016-02-01 2017-03-20
1     000184 2016-02-05 2017-01-19
2  000461-48 2016-03-07 2017-03-20
3  000791-48 2016-02-01 2017-03-07
4  000986-48 2016-02-01 2017-03-17
5     001617 2016-02-01 2017-02-20
6  001768-48 2016-02-01 2017-03-20
7     001937 2016-02-01 2017-03-17
8  002223-48 2016-02-04 2017-03-16
9  002481-48 2016-02-05 2017-03-17

I am trying to add every date between MIN_DATE and MAX_DATE as a row for each PERSON_ID. Here is what I tried:

df_dates.groupby('PERSON_ID').apply(lambda x: pd.date_range(x['MIN_DATE'].values[0], x['MAX_DATE'].values[0]))

This returns a Series holding one DatetimeIndex per PERSON_ID, as shown below. Is there a way to transpose that series into rows for each PERSON_ID, or is there a better way of doing this?

PERSON_ID
0-L2ID        DatetimeIndex(['2016-08-05', '2016-08-06', '20...
0-LlID        DatetimeIndex(['2016-02-03', '2016-02-04', '20...
000099-48     DatetimeIndex(['2016-02-01', '2016-02-02', '20...
000184        DatetimeIndex(['2016-02-05', '2016-02-06', '20...
000276        DatetimeIndex(['2016-02-01', '2016-02-02', '20...
000461-48     DatetimeIndex(['2016-03-07', '2016-03-08', '20...
000493-48     DatetimeIndex(['2016-02-01', '2016-02-02', '20...
000615-48     DatetimeIndex(['2016-02-02', '2016-02-03', '20...
000791-48     DatetimeIndex(['2016-02-01', '2016-02-02', '20...
000986-48     DatetimeIndex(['2016-02-01', '2016-02-02', '20...
dtype: object

Here is what I am trying to achieve:

PERSON_ID   Date
000099-48   2/1/2016
000099-48   2/2/2016
000099-48   2/3/2016
000099-48   2/4/2016
:
:
000099-48   3/18/2017
000099-48   3/19/2017
000099-48   3/20/2017
000184      2/5/2016
000184      2/6/2016
000184      2/7/2016
:
:
000184      1/17/2017
000184      1/18/2017
000184      1/19/2017

Recommended answer

You can reshape using melt, then perform a groupby and resample:

# Reshape via melt to get in the proper format for a resample.
df = df_dates.melt(id_vars=['PERSON_ID'], value_vars=['MIN_DATE', 'MAX_DATE'], value_name='DATE')

# Set the index and drop unnecessary columns.
df = df.set_index('DATE').drop('variable', axis=1)

# Perform a groupby and resample.
df = df.groupby('PERSON_ID', group_keys=False).resample('D').ffill().reset_index()

Resulting output:

           DATE  PERSON_ID
0    2016-02-01  000099-48
1    2016-02-02  000099-48
2    2016-02-03  000099-48
3    2016-02-04  000099-48
...         ...        ...
3976 2017-03-14  002481-48
3977 2017-03-15  002481-48
3978 2017-03-16  002481-48
3979 2017-03-17  002481-48
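
As a different route from the melt and resample approach above, the per-person date ranges from the question can be turned into rows directly. The following is a minimal sketch, not the answer's method: it assumes pandas 0.25 or later (for DataFrame.explode), assumes MIN_DATE and MAX_DATE are already datetime values, and uses a small made-up sample with shortened ranges so the result is easy to check by eye.

```python
import pandas as pd

# Hypothetical sample in the same shape as df_dates; the date ranges are
# shortened for illustration and are not the values from the question.
df_dates = pd.DataFrame({
    "PERSON_ID": ["000099-48", "000184"],
    "MIN_DATE": pd.to_datetime(["2016-02-01", "2016-02-05"]),
    "MAX_DATE": pd.to_datetime(["2016-02-03", "2016-02-06"]),
})

# Build one list of daily timestamps per row (both endpoints included).
date_lists = [
    list(pd.date_range(start, end, freq="D"))
    for start, end in zip(df_dates["MIN_DATE"], df_dates["MAX_DATE"])
]

# Attach the lists as a column, then explode so each date gets its own row.
result = (
    df_dates[["PERSON_ID"]]
    .assign(DATE=date_lists)
    .explode("DATE")
    .reset_index(drop=True)
)

# explode leaves an object column, so convert it back to datetime64.
result["DATE"] = pd.to_datetime(result["DATE"])

print(result)
```

Each PERSON_ID is repeated once per day in its range, which matches the layout asked for in the question. Building a Python list for every row can use a lot of memory when the ranges are long or the frame has many rows, so this is best treated as a readable starting point.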
