How to create pandas.date_range() for each row from column "start_date" and column "end_date"?
Problem description
I have a df, for example:
id | start_date | end_date | price
1 | 2020-10-01 | 2020-10-3 | 1
1 | 2020-10-03 | 2020-10-4 | 1
2 | 2020-10-04 | 2020-10-6 | 2
3 | 2020-10-05 | 2020-10-5 | 3
Columns "start_date" and "end_date" are datetime64[ns].
I want to create a "date" column holding one row per day in each row's date range.
The easiest way would be to create a pandas.date_range(start_date, end_date, freq="D") for each row and then use .explode().
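For reference, a minimal sketch of that idea on the sample data, assuming a row-wise list comprehension is acceptable. The name `exploded` is only illustrative, and this is not tuned for large frames:

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "start_date": pd.to_datetime(["2020-10-01", "2020-10-03", "2020-10-04", "2020-10-05"]),
    "end_date": pd.to_datetime(["2020-10-03", "2020-10-04", "2020-10-06", "2020-10-05"]),
    "price": [1, 1, 2, 3],
})

# Build one list of daily timestamps per row (both endpoints included).
ranges = [
    pd.date_range(start, end, freq="D").tolist()
    for start, end in zip(df["start_date"], df["end_date"])
]

# Attach the lists as a column, then expand each list element into its own row.
exploded = df.assign(date=ranges).explode("date", ignore_index=True)

# explode() leaves an object column, so convert it back to datetime64[ns].
exploded["date"] = pd.to_datetime(exploded["date"])
print(exploded)
```

This still loops over the rows in Python, so it is likely to be slow on a large DataFrame.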
The final result should look like this:
id | start_date | end_date | price | date
1 | 2020-10-01 | 2020-10-3 | 1 | 2020-10-01
1 | 2020-10-01 | 2020-10-3 | 1 | 2020-10-02
1 | 2020-10-01 | 2020-10-3 | 1 | 2020-10-03
1 | 2020-10-03 | 2020-10-4 | 1 | 2020-10-03
1 | 2020-10-03 | 2020-10-4 | 1 | 2020-10-04
2 | 2020-10-04 | 2020-10-6 | 2 | 2020-10-04
2 | 2020-10-04 | 2020-10-6 | 2 | 2020-10-05
2 | 2020-10-04 | 2020-10-6 | 2 | 2020-10-06
3 | 2020-10-05 | 2020-10-5 | 3 | 2020-10-05
What I have tried so far:
df["daterange"] = pd.date_range(df["start_date"], df["end_date"])
TypeError: Cannot convert input [0 2020-10-01 1 2020-10-01
for row in df.itertuples():
    df["daterange"] = pd.date_range(start=row.start_date, end=row.end_date)
ValueError: Length of values (3) does not match length of index (9)
Lambda, apply, melt, etc. are too slow to be usable at the size of my dataframe!
/Edit
Fastest method I have found so far:
https://github.com/Garve/scikit-bonus
skbonus.pandas.preprocessing.DateTimeExploder(
"date",
start_column="start_date",
end_column="end_date",
frequency="d",
drop=False,
)
Recommended answer
Fastest method I've found so far:
https://github.com/Garve/scikit-bonus
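from skbonus.pandas.preprocessing import DateTimeExploder

# DateTimeExploder follows the scikit-learn transformer interface,
# so build it first and then apply it to the DataFrame.
exploder = DateTimeExploder(
    "date",
    start_column="start_date",
    end_column="end_date",
    frequency="d",
    drop=False,
)
df = exploder.fit_transform(df)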
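If adding a dependency is not an option, a pandas/NumPy-only alternative is to repeat each row once per day it covers and then add a running day offset to "start_date". The sketch below is not how scikit-bonus does it; the helper name `explode_date_ranges` and its parameters are hypothetical, and it assumes daily frequency and that "end_date" is never earlier than "start_date":

```python
import numpy as np
import pandas as pd


def explode_date_ranges(df, start_col="start_date", end_col="end_date", out_col="date"):
    """Hypothetical helper: one output row per day in [start_col, end_col]."""
    # Days covered by each row, counting both endpoints (assumes end >= start).
    n_days = (df[end_col] - df[start_col]).dt.days.to_numpy() + 1

    # Repeat each row positionally, once per covered day.
    positions = np.repeat(np.arange(len(df)), n_days)
    out = df.iloc[positions].reset_index(drop=True)

    # Offset within each row's range: 0, 1, ..., n_days - 1.
    block_starts = np.repeat(np.cumsum(n_days) - n_days, n_days)
    offsets = np.arange(n_days.sum()) - block_starts

    out[out_col] = out[start_col] + pd.to_timedelta(offsets, unit="D")
    return out


df = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "start_date": pd.to_datetime(["2020-10-01", "2020-10-03", "2020-10-04", "2020-10-05"]),
    "end_date": pd.to_datetime(["2020-10-03", "2020-10-04", "2020-10-06", "2020-10-05"]),
    "price": [1, 1, 2, 3],
})
result = explode_date_ranges(df)
print(result)
```

Because there is no Python-level loop over rows, this should scale better than apply or a list comprehension, though I have not benchmarked it against DateTimeExploder.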