Pandas duplicate rows with time sequence
Problem description
I am trying to duplicate the rows of my pandas data frame and add an additional column that holds a minute-by-minute time sequence between the columns FROM and TO.
For example, I have this data frame:
ID FROM TO
A 15:30 15:33
B 16:40 16:44
C 15:20 15:22
My desired output is:
ID FROM TO time
A 15:30 15:33 15:30
A 15:30 15:33 15:31
A 15:30 15:33 15:32
A 15:30 15:33 15:33
B 16:40 16:44 16:40
B 16:40 16:44 16:41
B 16:40 16:44 16:42
B 16:40 16:44 16:43
B 16:40 16:44 16:44
C 15:20 15:22 15:20
C 15:20 15:22 15:21
C 15:20 15:22 15:22
In R, I could do this:
new_df = setDT(df)[, .(ID, FROM, TO, time=seq(FROM,TO,by="mins")), by=1:nrow(df)]
but I am having trouble finding the Python equivalent.
Thanks in advance!
Recommended answer
The problem can be solved in two steps:
pd.date_range with apply and strftime
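import pandas as pd

# For each row, build the minute-by-minute times between FROM and TO (inclusive)
# and store them as a list of 'HH:MM' strings.
df['duration'] = df.apply(
    lambda row: [
        i.strftime('%H:%M')
        for i in pd.date_range(
            row['FROM'], row['TO'], freq='60s'
        )
    ],
    axis=1)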
ID FROM TO duration
0 A 15:30 15:33 [15:30, 15:31, 15:32, 15:33]
1 B 16:40 16:44 [16:40, 16:41, 16:42, 16:43, 16:44]
2 C 15:20 15:22 [15:20, 15:21, 15:22]
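To see what the first step does for a single row, here is a minimal sketch that calls pd.date_range on one FROM/TO pair taken from the example data. The variable names stamps and minutes are only illustrative. It relies on pandas parsing bare "HH:MM" strings as times on the current date and on both endpoints being included by default.

```python
import pandas as pd

# Bare "HH:MM" strings are parsed as timestamps on today's date;
# the range includes both the start and the end value.
stamps = pd.date_range("15:30", "15:33", freq="60s")

# Format each timestamp back to an "HH:MM" string, as the lambda above does.
minutes = [ts.strftime("%H:%M") for ts in stamps]
print(minutes)  # ['15:30', '15:31', '15:32', '15:33']
```

The date part is discarded by strftime, so only the time of day ends up in the list.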
apply with stack
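# Expand each list into columns, stack the columns into rows,
# then drop the helper index level created by stack.
df.set_index(['ID', 'FROM', 'TO']) \
    .duration.apply(pd.Series) \
    .stack().reset_index(level=3, drop=True) \
    .reset_index() \
    .set_index('ID')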
# Result
FROM TO 0
ID
A 15:30 15:33 15:30
A 15:30 15:33 15:31
A 15:30 15:33 15:32
A 15:30 15:33 15:33
B 16:40 16:44 16:40
B 16:40 16:44 16:41
B 16:40 16:44 16:42
B 16:40 16:44 16:43
B 16:40 16:44 16:44
C 15:20 15:22 15:20
C 15:20 15:22 15:21
C 15:20 15:22 15:22
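As a possible alternative to the apply and stack step, the sketch below uses DataFrame.explode instead. This is not part of the answer above; it assumes pandas 0.25 or later, where explode is available. The sample frame is rebuilt from the question, and the helper minute_range and the column name time are chosen here for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": ["A", "B", "C"],
    "FROM": ["15:30", "16:40", "15:20"],
    "TO": ["15:33", "16:44", "15:22"],
})

def minute_range(row):
    """Return the 'HH:MM' strings from FROM to TO (inclusive), one per minute."""
    stamps = pd.date_range(row["FROM"], row["TO"], freq="60s")
    return [ts.strftime("%H:%M") for ts in stamps]

out = (
    df.assign(time=df.apply(minute_range, axis=1))  # one list per row
      .explode("time")                              # one row per list element
      .reset_index(drop=True)
)
print(out)
```

With this variant the new column is named time directly, rather than 0, and ID stays a regular column.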