pandas - how to organise a dataframe by date and assign new values to columns


Problem description

I have a dataframe covering one month, excluding Saturdays and Sundays, with one row logged every minute.

                            v1         v2  
2017-04-03 09:15:00     35.7       35.4  
2017-04-03 09:16:00     28.7       28.5
      ...               ...        ...
2017-04-03 16:29:00     81.7       81.5
2017-04-03 16:30:00     82.7       82.6
      ...               ...        ...
2017-04-04 09:15:00     24.3       24.2  
2017-04-04 09:16:00     25.6       25.5
      ...               ...        ...
2017-04-04 16:29:00     67.0       67.2
2017-04-04 16:30:00     70.2       70.6
      ...               ...        ...
2017-04-28 09:15:00     31.7       31.4  
2017-04-28 09:16:00     31.5       31.0
      ...               ...        ...
2017-04-28 16:29:00     33.2       33.5
2017-04-28 16:30:00     33.0       30.7

I have reduced the dataframe to the first and last row of each day.

res = df.groupby(df.index.date).apply(lambda x: x.iloc[[0, -1]])
res.index = res.index.droplevel(0)
print(res)
                      v1    v2
2017-04-03 09:15:00  35.7  35.4
2017-04-03 16:30:00  82.7  82.6
2017-04-04 09:15:00  24.3  24.2
2017-04-04 16:30:00  70.2  70.6
   ...                ..    ..
2017-04-28 09:15:00  31.7  31.4
2017-04-28 16:30:00  33.0  30.7

Now I want the dataframe indexed by date, with v1 taken from the earliest timestamp and v2 taken from the latest timestamp of each date.

Desired output:

              v1    v2
2017-04-03  35.7  82.6
2017-04-04  24.3  70.6
   ...       ..    ..
2017-04-28  31.7  30.7

Recommended answer

You can group by the date part of the index and use groupby.agg with a custom function for each column.

# For each date, take v1 at the earliest timestamp and v2 at the latest timestamp.
df1 = res.groupby(res.index.date).agg({'v1': lambda x: x.loc[x.index.min()], 'v2': lambda x: x.loc[x.index.max()]})

print (df1)

             v1      v2
2017-04-03  35.7    82.6
2017-04-04  24.3    70.6
2017-04-28  31.7    30.7
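
If the rows are sorted by timestamp, the built-in 'first' and 'last' aggregations should give the same result straight from the minute-level df, without building res first. A minimal sketch, assuming a small sample frame made from a few of the rows shown in the question (the frame itself is only for illustration):

```python
import pandas as pd

# Illustrative sample: a few rows per day, copied from the question's data.
idx = pd.to_datetime([
    "2017-04-03 09:15:00", "2017-04-03 09:16:00",
    "2017-04-03 16:29:00", "2017-04-03 16:30:00",
    "2017-04-04 09:15:00", "2017-04-04 09:16:00",
    "2017-04-04 16:29:00", "2017-04-04 16:30:00",
])
df = pd.DataFrame(
    {"v1": [35.7, 28.7, 81.7, 82.7, 24.3, 25.6, 67.0, 70.2],
     "v2": [35.4, 28.5, 81.5, 82.6, 24.2, 25.5, 67.2, 70.6]},
    index=idx,
)

# Sort so that 'first'/'last' correspond to the earliest/latest timestamp.
df = df.sort_index()

# One row per calendar date: v1 from the first row, v2 from the last row.
daily = df.groupby(df.index.date).agg({"v1": "first", "v2": "last"})
print(daily)
```

Note that 'first' and 'last' skip missing values, so if the opening or closing row of a day can contain NaN, the lambda version above may be the safer choice.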

An alternative way to get the first and last row of each day from the original dataframe:

res=df.reset_index().groupby(df.index.date).agg(['first','last']).stack().set_index('index')

Out[123]:

                      v1     v2
index       
2017-04-03 09:15:00  35.7   35.4
2017-04-03 16:30:00  82.7   82.6
2017-04-04 09:15:00  24.3   24.2
2017-04-04 16:30:00  70.2   70.6
2017-04-28 09:15:00  31.7   31.4
2017-04-28 16:30:00  33.0   30.7
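
Another option that keeps the original timestamps as the index is to combine groupby.head(1) and groupby.tail(1). This is only a sketch under the assumption that the frame is sorted by timestamp; the sample frame below is illustrative and reuses a few rows from the question.

```python
import pandas as pd

# Illustrative sample: a few rows per day, copied from the question's data.
idx = pd.to_datetime([
    "2017-04-03 09:15:00", "2017-04-03 09:16:00",
    "2017-04-03 16:29:00", "2017-04-03 16:30:00",
    "2017-04-04 09:15:00", "2017-04-04 09:16:00",
    "2017-04-04 16:29:00", "2017-04-04 16:30:00",
])
df = pd.DataFrame(
    {"v1": [35.7, 28.7, 81.7, 82.7, 24.3, 25.6, 67.0, 70.2],
     "v2": [35.4, 28.5, 81.5, 82.6, 24.2, 25.5, 67.2, 70.6]},
    index=idx,
).sort_index()

# Group rows by calendar date.
grouped = df.groupby(df.index.date)

# head(1) and tail(1) return the first and last row of every group
# with the original timestamp index; sort to interleave them by time.
res = pd.concat([grouped.head(1), grouped.tail(1)]).sort_index()
print(res)
```

If a day contains only a single row, that row appears twice in res, which matches the behaviour of the iloc[[0, -1]] approach in the question.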
