Compare values in R when within a string of letters

Problem description

I need to check whether the values of two columns fulfil certain conditions, but the value in one of the columns is embedded in a string of letters.

CURRENT_TEXT_1 or CURRENT_TEXT_2 contains the label DISPLAY_BOUNDARY followed by a number. If the value of CURRENT_ID is equal to that number plus 2, then I need a value of 1 in the OUTPUT column, otherwise a value of 0.

Here are some example lines of my data file (df) and the output I would like to obtain:

 PARTICIPANT     ITEM   CONDITION      CURRENT_TEXT_1               CURRENT_TEXT_2                 CURRENT_ID            OUTPUT
 ppt01          1         1            DISPLAY_BOUNDARY 1 the       iaRegion 4 rd 0 x width 333    7                     0
 ppt01          3         1            iaRegion 2 rd 0 x width 1    DISPLAY_BOUNDARY 9 a           11                    1
 ppt01          4         2            DISPLAY_BOUNDARY 2 aware     iaRegion 6 rd 0 x width 768    3                     0
 ppt01          6         3            DISPLAY_BOUNDARY 3 door      iaRegion 8 rd 0 x width 534    4                     0
 ppt01          9         4            DISPLAY_BOUNDARY 6 in        iaRegion 9 rd 0 x width 924    5                     0
 ppt01          48        5            DISPLAY_BOUNDARY 6 the       iaRegion 10 rd 0 x width 712   8                     1
 ppt02          3         4            iaRegion 14 rd 0 x width 756 DISPLAY_BOUNDARY 15 put        17                    1
 ppt02          7         5            iaRegion 1 rd 0 x width 334  DISPLAY_BOUNDARY 1 where       3                     1
 ppt02          8         6            DISPLAY_BOUNDARY 3 At        iaRegion 2 rd 0 x width 215    5                     1
 ppt02          35        2            iaRegion 3 rd 0 x width 524  DISPLAY_BOUNDARY 1 outside     2                     0
 ppt03          10        1            iaRegion 11 rd 0 x width 190 DISPLAY_BOUNDARY 2 school      4                     1
 ppt03          56        1            DISPLAY_BOUNDARY 8 blue      iaRegion 11 red 0 x width 383  9                     0

My attempt was:

df$OUTPUT <- ifelse(df$CURRENT_ID == ((grepl("DISPLAY_BOUNDARY", df$CURRENT_TEXT_1) | grepl("DISPLAY_BOUNDARY", df$CURRENT_TEXT_2)) + 2), 1, 0)

But I don't know how to extract the value associated with DISPLAY_BOUNDARY. Any help would be appreciated.
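
A small sketch of why the attempt cannot pick up the number: `grepl()` only reports whether the pattern is present, so adding 2 to its result gives 3 (match) or 2 (no match), never the boundary value plus 2. The two strings are copied from the example rows; the variable names `txt` and `has_boundary` are illustrative only.

```r
# Two sample strings from the example data
txt <- c("DISPLAY_BOUNDARY 9 a", "iaRegion 2 rd 0 x width 1")

# grepl() returns a logical vector: is the pattern present in each string?
has_boundary <- grepl("DISPLAY_BOUNDARY", txt)
print(has_boundary)        # TRUE FALSE

# Arithmetic coerces TRUE/FALSE to 1/0, so this yields 3 or 2,
# not "number after DISPLAY_BOUNDARY, plus 2"
print(has_boundary + 2)    # 3 2
```

The comparison in the attempt therefore tests whether CURRENT_ID is 3 or 2, which is why the number has to be extracted from the string first.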

Recommended answer

Something like this, perhaps...

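# Extract the number that follows DISPLAY_BOUNDARY in each column.
# Strings without that label are left unchanged by gsub(), so as.numeric()
# turns them into NA (with an "NAs introduced by coercion" warning).
ct1 <- as.numeric(gsub("DISPLAY_BOUNDARY ([0-9]+).*", "\\1", df$CURRENT_TEXT_1))
ct2 <- as.numeric(gsub("DISPLAY_BOUNDARY ([0-9]+).*", "\\1", df$CURRENT_TEXT_2))

# Use mapply to check each row: is CURRENT_ID equal to either extracted
# value plus 2? The logical result is converted to 1/0.
df$OUTPUT <- as.numeric(mapply(function(id, x1, x2) id %in% c(x1 + 2, x2 + 2),
                               as.numeric(df$CURRENT_ID), ct1, ct2))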
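
The same idea can be tried on a few rows copied from the question. The following is a minimal, vectorised sketch rather than the answer's exact code: `boundary_value` is a hypothetical helper introduced here, and it only converts strings that actually match the pattern, which avoids the coercion warning. It assumes the label is always written as `DISPLAY_BOUNDARY`, a space, then digits, at the start of the string, as in the sample data.

```r
# Four rows copied from the example data in the question
df <- data.frame(
  CURRENT_TEXT_1 = c("DISPLAY_BOUNDARY 1 the",
                     "iaRegion 2 rd 0 x width 1",
                     "DISPLAY_BOUNDARY 6 the",
                     "iaRegion 3 rd 0 x width 524"),
  CURRENT_TEXT_2 = c("iaRegion 4 rd 0 x width 333",
                     "DISPLAY_BOUNDARY 9 a",
                     "iaRegion 10 rd 0 x width 712",
                     "DISPLAY_BOUNDARY 1 outside"),
  CURRENT_ID = c(7, 11, 8, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number after DISPLAY_BOUNDARY, or NA if the label is absent
boundary_value <- function(x) {
  pattern <- "^DISPLAY_BOUNDARY ([0-9]+).*$"
  out <- rep(NA_real_, length(x))
  hit <- grepl(pattern, x)
  # Convert only the matching strings, so no coercion warning is raised
  out[hit] <- as.numeric(sub(pattern, "\\1", x[hit]))
  out
}

ct1 <- boundary_value(df$CURRENT_TEXT_1)
ct2 <- boundary_value(df$CURRENT_TEXT_2)

# A row scores 1 when CURRENT_ID equals either extracted value plus 2
match1 <- !is.na(ct1) & df$CURRENT_ID == ct1 + 2
match2 <- !is.na(ct2) & df$CURRENT_ID == ct2 + 2
df$OUTPUT <- as.numeric(match1 | match2)

print(df$OUTPUT)   # 0 1 1 0
```

The result agrees with the OUTPUT column shown in the question for these four rows. One design difference worth noting: `%in%` treats NA as matching NA, so a missing CURRENT_ID could score 1 in the `mapply` version, whereas the explicit `!is.na()` guards here never count a missing boundary value as a match.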
