How to get the time difference between specific rows of one column using Python
Problem description
Here I have a dataset with a time column and three inputs. I calculate the time difference using pandas.
The code is:
data['Time_different'] = pd.to_timedelta(data['time'].astype(str)).diff(-1).dt.total_seconds().div(60)
This computes the time difference for every row. But I want to write code that finds the time difference only between the specific rows that have X3 values.
I tried to write the code using a for loop, but it does not work properly. Can this be written without a for loop?
As you can see in my image, I have three inputs: X1, X2 and X3. When I use that code, it shows the time difference for X1, X2 and X3.
What I want is to get the time difference only for the X3 inputs that have a value.
time X3
6:00:00 0
7:00:00 2
8:00:00 0
9:00:00 50
10:00:00 0
11:00:00 0
12:00:00 0
13:45:00 0
15:00:00 0
16:00:00 0
17:00:00 0
18:00:00 0
19:00:00 20
So here I want to skip the times where X3 is 0 and read only the time difference between the rows where X3 has a value.
time x3
7:00:00 2 (rows that have a value)
9:00:00 50
So the time difference is 2 hrs
Then the second one:
9:00:00 50
19:00:00 20
Then the time difference is 10 hrs
Likewise, I want to write the code for my whole column. Can anyone help me to solve this?
When I run the code, I get time differences with negative values.
Recommended answer
You can try:
- Find the rows where X3 is different from 0
- Compute the difference in hours using shift
- Update the dataframe using join
Full example:
from io import StringIO

import numpy as np
import pandas as pd

data = """time X3
6:00:00 0
7:00:00 2
8:00:00 0
9:00:00 50
10:00:00 0
11:00:00 0
12:00:00 0
13:45:00 0
15:00:00 0
16:00:00 0
17:00:00 0
18:00:00 0
19:00:00 20"""
# Build dataframe from example (columns are separated by whitespace)
df = pd.read_csv(StringIO(data), sep=r'\s+')
# X1 and X2 are random filler, so their values differ on every run
df['X1'] = np.random.randint(0,10,len(df)) # Add random values for "X1" column
df['X2'] = np.random.randint(0,10,len(df)) # Add random values for "X2" column
# Convert the time column to datetime object
df.time = pd.to_datetime(df.time, format="%H:%M:%S")
print(df)
# time X3 X1 X2
# 0 1900-01-01 06:00:00 0 5 4
# 1 1900-01-01 07:00:00 2 7 1
# 2 1900-01-01 08:00:00 0 2 8
# 3 1900-01-01 09:00:00 50 1 0
# 4 1900-01-01 10:00:00 0 3 9
# 5 1900-01-01 11:00:00 0 8 4
# 6 1900-01-01 12:00:00 0 0 2
# 7 1900-01-01 13:45:00 0 5 0
# 8 1900-01-01 15:00:00 0 5 7
# 9 1900-01-01 16:00:00 0 0 8
# 10 1900-01-01 17:00:00 0 6 7
# 11 1900-01-01 18:00:00 0 1 5
# 12 1900-01-01 19:00:00 20 4 7
# Compute difference between each non-zero X3 row and the previous non-zero X3 row
sub_df = df[df.X3 != 0]
out_values = (sub_df.time.dt.hour - sub_df.shift().time.dt.hour) \
.to_frame() \
.fillna(sub_df.time.dt.hour.iloc[0]) \
.rename(columns={'time': 'out'}) # Rename column
print(out_values)
# out
# 1 7.0
# 3 2.0
# 12 10.0
df = df.join(out_values) # Add out values
print(df)
# time X3 X1 X2 out
# 0 1900-01-01 06:00:00 0 2 9 NaN
# 1 1900-01-01 07:00:00 2 7 4 7.0
# 2 1900-01-01 08:00:00 0 6 6 NaN
# 3 1900-01-01 09:00:00 50 9 1 2.0
# 4 1900-01-01 10:00:00 0 2 9 NaN
# 5 1900-01-01 11:00:00 0 5 3 NaN
# 6 1900-01-01 12:00:00 0 6 4 NaN
# 7 1900-01-01 13:45:00 0 9 3 NaN
# 8 1900-01-01 15:00:00 0 3 0 NaN
# 9 1900-01-01 16:00:00 0 1 8 NaN
# 10 1900-01-01 17:00:00 0 7 5 NaN
# 11 1900-01-01 18:00:00 0 6 7 NaN
# 12 1900-01-01 19:00:00 20 1 5 10.0
Here .fillna(sub_df.time.dt.hour.iloc[0]) is used to replace the first value with its own hour, since the first non-zero row has no previous row to subtract from and would otherwise be NaN. You can define your own rule for the value passed to fillna().
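The example above subtracts only the hour component (.dt.hour), so minutes are ignored; a non-zero X3 row at 13:45 would be treated as 13. A minimal alternative sketch, assuming the same sample data and a hypothetical output column named diff_hours, is to keep the question's pd.to_timedelta approach and apply diff() only to the filtered rows. Note that diff(-1), as used in the question, subtracts the next row from the current one, which is why increasing times give negative values; diff() with the default period subtracts the previous row instead.

```python
from io import StringIO

import pandas as pd

raw = """time X3
6:00:00 0
7:00:00 2
8:00:00 0
9:00:00 50
10:00:00 0
11:00:00 0
12:00:00 0
13:45:00 0
15:00:00 0
16:00:00 0
17:00:00 0
18:00:00 0
19:00:00 20"""

df = pd.read_csv(StringIO(raw), sep=r"\s+")

# Parse the time strings as timedeltas so minutes and seconds are preserved
elapsed = pd.to_timedelta(df["time"])

# Keep only the rows where X3 is non-zero, then subtract the previous kept row.
# Assigning back aligns on the index, so all other rows receive NaN.
nonzero = df["X3"] != 0
df["diff_hours"] = elapsed[nonzero].diff().dt.total_seconds().div(3600)

print(df[nonzero])
```

With this data the rows at 9:00:00 and 19:00:00 get 2.0 and 10.0, and the first non-zero row (7:00:00) stays NaN because there is nothing before it; a fillna() rule like the one above can be added if a different value is wanted there.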