How to format date columns in Python so that they are weekly and based on each other?


Problem description

I have a dataframe df that looks similar to this:

identity      Start        End     week
  E         6/18/2020   7/2/2020    1
  E         6/18/2020   7/2/2020    2
 2D         7/18/2020   8/1/2020    1
 2D         7/18/2020   8/1/2020    2
 A1          9/6/2020   9/20/2020   1
 A1          9/6/2020   9/20/2020   2

The problem is that when I extracted the data, I only had the overall Start date and End date for each identity, repeated on every row. The data itself is recorded by week, and all identities have the same number of weeks (sometimes 5 or 6, but always the same for every identity). I want Start and End to be weekly: the first week ends 7 days after its Start, and each following week starts where the previous one ended. The result would look like this:

identity      Start        End     week
   E       6/18/2020    6/25/2020   1
   E       6/25/2020    7/2/2020    2
  2D       7/18/2020    7/25/2020   1
  2D       7/25/2020    8/1/2020    2
  A1        9/6/2020    9/13/2020   1
  A1       9/13/2020    9/20/2020   2

I tried a simple method: creating a column of sevens and adding it to Start to get the end of each week. That raises the error "Addition/subtraction of integers and integer-arrays with Timestamp is no longer supported. Instead of adding/subtracting n, use n * obj.freq". After that I planned to rebuild Start by subtracting seven days, but I don't know how to get around this error. Any help would be greatly appreciated.
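The error quoted above comes from adding plain integers to timestamps. As a minimal sketch (the two-row frame and the `sevens` column name are assumptions made up for illustration, not the asker's real data), converting the integers to timedeltas first makes the addition well defined:

```python
import pandas as pd

# Hypothetical miniature of the question's frame, with Start already parsed.
df = pd.DataFrame({
    "identity": ["E", "E"],
    "Start": pd.to_datetime(["6/18/2020", "6/18/2020"], format="%m/%d/%Y"),
})

# A plain integer column, as in the attempted "sevens" approach.
df["sevens"] = 7

# Interpret the integers as a number of days before adding them to the dates.
df["End"] = df["Start"] + pd.to_timedelta(df["sevens"], unit="D")
print(df)
```

This only addresses the error itself; it still gives every row the same Start, so the weekly offset per row has to be handled separately.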

Recommended answer

Similar to your other question:

First, convert to datetime:

import pandas as pd

# Parse both date columns from month/day/year strings into datetime values.
df[["Start", "End"]] = df[["Start", "End"]].apply(pd.to_datetime, format="%m/%d/%Y")

df

identity    Start   End     week
0   E   2020-06-18  2020-07-02  1
1   E   2020-06-18  2020-07-02  2
2   2D  2020-07-18  2020-08-01  1
3   2D  2020-07-18  2020-08-01  2
4   A1  2020-09-06  2020-09-20  1
5   A1  2020-09-06  2020-09-20  2

Each identity appears in a group of two rows, so I'll use that when selecting dates from the date_range:

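from itertools import chain

# One row per identity carries the overall Start and End of its span.
result = df.drop_duplicates(subset="identity")

# Weekly timestamps from Start to End; keep the first two as the week starts.
date_range = (
    pd.date_range(start, end, freq="7D")[:2]
    for start, end in zip(result.Start, result.End)
)

# Flatten the per-identity ranges into one sequence of Start dates.
date_range = chain.from_iterable(date_range)

# Each week ends seven days after its (reassigned) Start.
End = lambda df: df.Start.add(pd.Timedelta("7 days"))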
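To see why the slice `[:2]` is needed, it may help to look at a single identity in isolation. The sketch below uses the dates for identity E from the question; the variable names `weekly` and `starts` are only illustrative:

```python
import pandas as pd

# Weekly steps from the overall Start to the overall End of identity E.
weekly = pd.date_range("2020-06-18", "2020-07-02", freq="7D")
print(list(weekly.strftime("%Y-%m-%d")))

# The last timestamp is the overall End, not the start of a week,
# so only the first two entries are usable as Start values.
starts = weekly[:2]
print(list(starts.strftime("%Y-%m-%d")))
```

With more weeks per identity, and assuming the span is an exact multiple of seven days, `weekly[:-1]` would presumably serve the same purpose as the hard-coded `[:2]`.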

Create the new dataframe:

df.assign(Start=list(date_range), End=End)

    identity    Start   End     week
0   E   2020-06-18  2020-06-25  1
1   E   2020-06-25  2020-07-02  2
2   2D  2020-07-18  2020-07-25  1
3   2D  2020-07-25  2020-08-01  2
4   A1  2020-09-06  2020-09-13  1
5   A1  2020-09-13  2020-09-20  2
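The question mentions that identities can have 5 or 6 weeks, which a fixed slice of two does not cover. A possible alternative, not part of the answer above, is to derive the offset from the `week` column directly. This is a sketch that assumes `week` counts from 1 within each identity and that every row repeats the identity's overall Start, as in the sample data:

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "identity": ["E", "E", "2D", "2D", "A1", "A1"],
    "Start": ["6/18/2020", "6/18/2020", "7/18/2020", "7/18/2020", "9/6/2020", "9/6/2020"],
    "End": ["7/2/2020", "7/2/2020", "8/1/2020", "8/1/2020", "9/20/2020", "9/20/2020"],
    "week": [1, 2, 1, 2, 1, 2],
})
df["Start"] = pd.to_datetime(df["Start"], format="%m/%d/%Y")

# Week 1 keeps the original Start; week n is shifted by (n - 1) * 7 days.
offset = pd.to_timedelta((df["week"] - 1) * 7, unit="D")
weekly = df.assign(Start=df["Start"] + offset)

# Every week ends seven days after it starts.
weekly["End"] = weekly["Start"] + pd.Timedelta(days=7)
print(weekly)
```

Because the shift depends only on `week`, the same lines should work for any number of weeks per identity, as long as the assumptions above hold.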
