Loop through a data frame in R and measure the time difference between two values


Problem description

Summary: I am analyzing the time difference between a stimulus that occurred (A or B) and a possible response of the user.

The dataset has the following structure:

    structure(list(User = c("005b98f3-5b1b-4d10-bdea-a55d012b2844",
"005b98f3-5b1b-4d10-bdea-a55d012b2844", "005b98f3-5b1b-4d10-bdea-a55d012b2844", 
"005b98f3-5b1b-4d10-bdea-a55d012b2844", "005b98f3-5b1b-4d10-bdea-a55d012b2844", 
"005b98f3-5b1b-4d10-bdea-a55d012b2844", "005b98f3-5b1b-4d10-bdea-a55d012b2844", 
"005b98f3-5b1b-4d10-bdea-a55d012b2844", "005b98f3-5b1b-4d10-bdea-a55d012b2844", 
"005b98f3-5b1b-4d10-bdea-a55d012b2844", "005b98f3-5b1b-4d10-bdea-a55d012b2844", 
"005b98f3-5b1b-4d10-bdea-a55d012b2844", "005b98f3-5b1b-4d10-bdea-a55d012b2844", 
"005b98f3-5b1b-4d10-bdea-a55d012b2844", "005b98f3-5b1b-4d10-bdea-a55d012b2844", 
"005b98f3-5b1b-4d10-bdea-a55d012b2844", "005b98f3-5b1b-4d10-bdea-a55d012b2844", 
"005b98f3-5b1b-4d10-bdea-a55d012b2844", "005b98f3-5b1b-4d10-bdea-a55d012b2844", 
"005b98f3-5b1b-4d10-bdea-a55d012b2844"), Date = c("25.11.2015 13:59", 
"03.12.2015 09:32", "07.12.2015 08:18", "08.12.2015 19:40", "08.12.2015 19:40", 
"22.12.2015 08:52", "22.12.2015 08:50", "22.12.2015 15:42", "22.12.2015 20:46", 
"05.01.2016 11:33", "05.01.2016 11:35", "05.01.2016 13:22", "05.01.2016 13:21", 
"05.01.2016 13:22", "06.01.2016 09:18", "14.02.2016 22:47", "20.02.2016 21:27", 
"01.04.2016 13:52", "24.07.2016 07:03", "04.08.2016 08:25"), 
    Hour = c(1645L, 1833L, 1928L, 1963L, 1963L, 2288L, 2288L, 
    2295L, 2300L, 2627L, 2627L, 2629L, 2629L, 2629L, 2649L, 3598L, 
    3741L, 4717L, 7447L, 7712L), StimuliA = c(1L, 0L, 1L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 
    0L), StimuliB = c(0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 
    1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L), Responses = c(0L, 
    1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 
    1L, 1L, 1L, 0L)), .Names = c("User", "Date", "Hour", "StimuliA", 
"StimuliB", "Responses"), row.names = c(NA, -20L), class = c("tbl_df", 
"tbl", "data.frame"))

Additional information on the data: Every row in the data table is an event log in which a User either perceived a certain stimulus or performed an action (Response). Hour: the number of hours since the start of the project at which the event occurred.

Goal: The overall goal is to measure the time between a stimulus and the response, if there was one. I would like to create a loop that goes through the dataset for every User and, if the value of a Stimuli column is 1, checks whether there is a later response from that user, and then creates one vector with the values for A and one for B.

Question: Would I do this with a for loop that goes through every User and checks the perceived Stimuli and, if the value is 1, checks whether the same User ID has the value 1 in the closest Response, and then compares the two dates?

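Sub-questions / things I am struggling with

  1. How do I actually loop through every row and check a conditional statement, and execute a command if it is TRUE? (if/else?)
  2. How would I then, as that command, save the value of another cell in that row?
  3. How do I then tell R to look for the closest response (in chronological order) of the same User ID and calculate the time difference between the two values?
  4. Finally, how do I create the vectors from these calculated values?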
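
The four steps in this list can be sketched with a plain base R loop. This is only an illustration, not the approach taken in the answer further down: `time_to_response` is a hypothetical helper name introduced here, the data is just the first seven rows of the dataset above, and the timestamps are assumed to be in UTC. Unlike the dplyr answer, this sketch looks for the next response anywhere later in the user's log, not only in the immediately following row.

```r
# First seven rows of the dataset above (a single user)
df <- data.frame(
  User      = "005b98f3-5b1b-4d10-bdea-a55d012b2844",
  Date      = c("25.11.2015 13:59", "03.12.2015 09:32", "07.12.2015 08:18",
                "08.12.2015 19:40", "08.12.2015 19:40", "22.12.2015 08:52",
                "22.12.2015 08:50"),
  StimuliA  = c(1L, 0L, 1L, 0L, 0L, 0L, 0L),
  StimuliB  = c(0L, 0L, 0L, 0L, 1L, 0L, 0L),
  Responses = c(0L, 1L, 0L, 1L, 0L, 1L, 1L),
  stringsAsFactors = FALSE
)
# Assumption: timestamps are treated as UTC to avoid daylight-saving jumps
df$Date <- as.POSIXct(df$Date, format = "%d.%m.%Y %H:%M", tz = "UTC")

# Hypothetical helper: minutes from each stimulus to the same user's next
# response; stimuli with no later response are skipped.
time_to_response <- function(df, stimulus_col) {
  df <- df[order(df$User, df$Date), ]            # chronological per user
  waits <- numeric(0)
  for (i in seq_len(nrow(df))) {                 # step 1: visit every row
    if (df[[stimulus_col]][i] == 1) {            # step 1: test the condition
      t0 <- df$Date[i]                           # step 2: keep this row's time
      later <- which(df$User == df$User[i] &     # step 3: later responses of
                     df$Responses == 1 &         #         the same user
                     seq_len(nrow(df)) > i)
      if (length(later) > 0) {
        wait <- difftime(df$Date[later[1]], t0, units = "mins")
        waits <- c(waits, as.numeric(wait))      # step 4: grow the vector
      }
    }
  }
  waits
}

stimuli_a <- time_to_response(df, "StimuliA")    # 11253 2122
stimuli_b <- time_to_response(df, "StimuliB")    # 19510
```

Growing a vector inside a loop is fine for a log of this size; for large datasets a vectorized approach is usually faster.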

Desired result:

Stimuli A = c(11253, 2122, 56969), Stimuli B = c(19512, 107)

The code I have produced so far is not very helpful. I was experimenting with for loops and if statements, and also with the ifelse function.

I am a newbie with R. I have taken multiple classes on DataCamp, but I am still struggling to apply them to my own work on my master's thesis. Thanks for all the help.

Additional information:

R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Recommended answer

Here's how you can do that with dplyr. First, convert the Date column to POSIXct. Then sort the rows by User and Date with arrange, and add a time-difference column with mutate, using lead to take the gap to the next row. Finally, filter for rows where Stimuli A or B is 1 and the next row is a Response equal to 1. Because the rows are sorted by date first, two of the results below (19510 and 106 minutes) differ slightly from the values in the question (19512 and 107), which correspond to the original, unsorted row order.

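library(dplyr)

# Parse the day.month.year timestamps into POSIXct
df$Date <- as.POSIXct(strptime(df$Date, "%d.%m.%Y %H:%M"))

df %>%
  arrange(User, Date) %>%
  group_by(User) %>%   # group first so lead() never crosses into another user's rows
  mutate(difftime = difftime(lead(Date), Date, units = "mins")) %>%
  filter((StimuliA == 1 | StimuliB == 1) & lead(Responses) == 1)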

                                  User                Date  Hour StimuliA StimuliB Responses   difftime
                                 <chr>              <dttm> <int>    <int>    <int>     <int>     <time>
1 005b98f3-5b1b-4d10-bdea-a55d012b2844 2015-11-25 13:59:00  1645        1        0         0 11253 mins
2 005b98f3-5b1b-4d10-bdea-a55d012b2844 2015-12-07 08:18:00  1928        1        0         0  2122 mins
3 005b98f3-5b1b-4d10-bdea-a55d012b2844 2015-12-08 19:40:00  1963        0        1         0 19510 mins
4 005b98f3-5b1b-4d10-bdea-a55d012b2844 2016-01-05 11:35:00  2627        0        1         0   106 mins
5 005b98f3-5b1b-4d10-bdea-a55d012b2844 2016-01-06 09:18:00  2649        1        0         0 56969 mins
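
To get from this table to the two vectors the question asks for, the filtered result can be split by stimulus type. The following is a minimal sketch under a few assumptions: `res`, `mins`, `stimuli_a` and `stimuli_b` are names introduced here for illustration, only the first seven rows of the question's data are used, and the timestamps are parsed as UTC.

```r
library(dplyr)

# First seven rows of the question's data (a single user)
df <- data.frame(
  User      = "005b98f3-5b1b-4d10-bdea-a55d012b2844",
  Date      = c("25.11.2015 13:59", "03.12.2015 09:32", "07.12.2015 08:18",
                "08.12.2015 19:40", "08.12.2015 19:40", "22.12.2015 08:52",
                "22.12.2015 08:50"),
  StimuliA  = c(1L, 0L, 1L, 0L, 0L, 0L, 0L),
  StimuliB  = c(0L, 0L, 0L, 0L, 1L, 0L, 0L),
  Responses = c(0L, 1L, 0L, 1L, 0L, 1L, 1L),
  stringsAsFactors = FALSE
)
# Assumption: timestamps are treated as UTC to avoid daylight-saving jumps
df$Date <- as.POSIXct(df$Date, format = "%d.%m.%Y %H:%M", tz = "UTC")

res <- df %>%
  arrange(User, Date) %>%
  group_by(User) %>%
  # plain numeric minutes are easier to collect into vectors than difftime
  mutate(mins = as.numeric(difftime(lead(Date), Date, units = "mins"))) %>%
  filter((StimuliA == 1 | StimuliB == 1) & lead(Responses) == 1) %>%
  ungroup()

# One vector of waiting times (in minutes) per stimulus type
stimuli_a <- res$mins[res$StimuliA == 1]   # 11253 2122
stimuli_b <- res$mins[res$StimuliB == 1]   # 19510
```

On the full dataset the same two subsetting lines would pick up the remaining rows of the table above as well.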
