How to use pandas to add a new column using an if statement?
Question
Could you help me express the following in Python pandas? I have the following data:
id=["Train A","Train A","Train A","Train B","Train B","Train B"]
start = ["A","B","C","D","E","F"]
end = ["G","H","I","J","K","L"]
arrival_time = ["0","2016-05-19 13:50:00","2016-05-19 21:25:00","0","2016-05-24 18:30:00","2016-05-26 12:15:00"]
departure_time = ["2016-05-19 08:25:00","2016-05-19 16:00:00","2016-05-20 07:25:00","2016-05-24 12:50:00","2016-05-25 20:00:00","2016-05-26 19:45:00"]
capacity = ["2","2","3","3","2","3"]
which should produce the following table:
id       arrival_time         departure_time       start  end  capacity
Train A  0                    2016-05-19 08:25:00  A      G    2
Train A  2016-05-19 13:50:00  2016-05-19 16:00:00  B      H    2
Train A  2016-05-19 21:25:00  2016-05-20 07:25:00  C      I    3
Train B  0                    2016-05-24 12:50:00  D      J    3
Train B  2016-05-24 18:30:00  2016-05-25 20:00:00  E      K    2
Train B  2016-05-26 12:15:00  2016-05-26 19:45:00  F      L    3
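The lists above could be turned into a DataFrame roughly as follows. This is a minimal sketch and not part of the original question. It assumes that "0" is a placeholder for "no arrival" and should become NaT, and that the stray leading space in one timestamp should be stripped. The helper name `parse_times` and the variable name `train_id` are hypothetical, and the 2016-05-25 departure uses 20:00:00, following the table.

```python
import pandas as pd

train_id = ["Train A", "Train A", "Train A", "Train B", "Train B", "Train B"]
start = ["A", "B", "C", "D", "E", "F"]
end = ["G", "H", "I", "J", "K", "L"]
arrival_time = ["0", " 2016-05-19 13:50:00", "2016-05-19 21:25:00",
                "0", "2016-05-24 18:30:00", "2016-05-26 12:15:00"]
departure_time = ["2016-05-19 08:25:00", "2016-05-19 16:00:00", "2016-05-20 07:25:00",
                  "2016-05-24 12:50:00", "2016-05-25 20:00:00", "2016-05-26 19:45:00"]
capacity = ["2", "2", "3", "3", "2", "3"]


def parse_times(values):
    """Hypothetical helper: strip whitespace, treat "0" as missing, parse the rest."""
    s = pd.Series(values).str.strip()
    s = s.where(s != "0")  # assumption: "0" means there is no timestamp
    return pd.to_datetime(s, format="%Y-%m-%d %H:%M:%S")  # missing values become NaT


df = pd.DataFrame({
    "id": train_id,
    "arrival_time": parse_times(arrival_time),
    "departure_time": parse_times(departure_time),
    "start": start,
    "end": end,
    "capacity": pd.Series(capacity).astype(int),
})
print(df.dtypes)
```

Parsing both time columns up front matters because subtracting them only yields timedeltas when both are datetime columns.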
I would like to add two columns, source and sink. If the time difference between arrival and departure is less than 3 hours, the source is the start of the trip; the sink is set only where the trip breaks (i.e. when the time difference is more than 3 hours):
time difference  source  sink
-                A       H
02:10:00         A       H
10:00:00         C       I
-                D       K
01:30:00         D       K
19:30:00         F       L
Recommended answer
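import numpy as np  # df is assumed to exist already, with arrival_time and departure_time as datetime columns ("0" parsed as NaT)
df = df.assign(timediff=(df.departure_time - df.arrival_time))
# Stop shorter than 3 hours: reuse the previous row's start; otherwise use this row's start
df = df.assign(source = np.where(df.timediff.dt.seconds / 3600 < 3, df.shift(1).start, df.start))
# Previous row's stop longer than 3 hours: take the next row's end; otherwise use this row's end
df = df.assign(sink = np.where(df.timediff.dt.seconds.shift(1) / 3600 > 3, df.shift(-1).end, df.end))
print(df)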
Output:
        id        arrival_time      departure_time start end capacity sink  \
0  Train A                 NaT 2016-05-19 08:25:00     A   G        2    G
1  Train A 2016-05-19 13:50:00 2016-05-19 16:00:00     B   H        2    H
2  Train A 2016-05-19 21:25:00 2016-05-20 07:25:00     C   I        3    I
3  Train B                 NaT 2016-05-24 12:50:00     D   J        3    K
4  Train B 2016-05-24 18:30:00 2016-05-25 20:00:00     E   K        2    K
5  Train B 2016-05-26 12:15:00 2016-05-26 19:45:00     F   L        3    L

         timediff source
0             NaT      A
1 0 days 02:10:00      A
2 0 days 10:00:00      C
3             NaT      D
4 1 days 01:30:00      D
5 0 days 07:30:00      F
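One detail worth knowing when reading this output: `.dt.seconds` returns only the seconds component of each timedelta and ignores whole days. Row 4's difference of 1 days 01:30:00 is therefore treated as 1.5 hours, which is why its source is D rather than E. The sketch below is an illustration and not part of the original answer. It contrasts `.dt.seconds` with `.dt.total_seconds()` on a small made-up Series, and the "short"/"long" labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up sample that mirrors three of the time differences in the output
timediff = pd.Series(pd.to_timedelta(
    ["0 days 02:10:00", "0 days 10:00:00", "1 days 01:30:00"]
))

# Seconds component only: the "1 days" part of the last value is dropped
hours_component = timediff.dt.seconds / 3600
# Full duration, days included
hours_total = timediff.dt.total_seconds() / 3600

# np.where acts as a vectorised if/else over the whole column
short_by_component = np.where(hours_component < 3, "short", "long")
short_by_total = np.where(hours_total < 3, "short", "long")

print(pd.DataFrame({
    "timediff": timediff,
    "by_component": short_by_component,
    "by_total": short_by_total,
}))
```

Which of the two is appropriate depends on whether a stop that spans more than a day should count as a break in the trip; the answer's output follows the `.dt.seconds` behaviour.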