查找值是否在其他列的范围内 [英] Finding if a value is within the range of other columns

查看:69
本文介绍了查找值是否在其他列的范围内的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个数据框 df 看起来像这样:

I have a dataframe df which looks like this:

df <- read.table(text = 

"ID  Q1_PM Q1_TP Q1_overall  Q2_PM  Q2_LS  Q2_overall
 1   1     2     3           1       2     2       
 2   0     NA    NA          2       1     1 
 3   2     1     1           3       4     0  
 4   1     0     2           4       0     2 
 5   NA    1     NA          0       NA    0  
 6   2     0     1           1       NA    NA"   

, header = TRUE)

期望输出:

进一步解释一下,我想要的输出如下:

Desired Output:

To explain a little further, my desired output is as below:

 ID  Q1_PM Q1_TP Q1_overall  Q2_PM  Q2_LS  Q2_overall Q1_check  Q2_check
 1   1     2     3           1       2     2          "above"   "within"
 2   0     NA    NA          2       1     1           NA       "within"
 3   2     1     1           3       4     0          "within"  "below"
 4   1     0     2           4       0     2          "above"   "within"
 5   NA    1     NA          0       NA    0           NA       "within"
 6   2     0     1           1       NA    NA         "within"   NA

说明:

示例 1:

根据Q1_PMQ1_TP列中的值,我想看看Q1_overall列中的值是否in 他们的范围与否?如果不在范围内,值是高于还是低于范围?为了跟踪这一点,我想添加一个额外的列 Q1_check.

Explanation:

Example 1:

Based on the value in columns Q1_PM and Q1_TP, I want to see whether the value in column Q1_overall is within their range or not? If, not in range, is the value above or below the range? To track this, I want to add an additional column Q1_check.

同样,根据Q2_PMQ2_LS的值,我想检查Q2_overall的值是否 他们的范围与否?如果不在范围内,是高于还是低于范围?同样,为了跟踪这一点,我想添加一个额外的列 Q2_check

Similarly, based on the values of Q2_PM and Q2_LS, I want to check if the value of Q2_overall is within their range or not? If not in range, is it above or below the range? Again, to track this, I want to add an additional column Q2_check

1- 为此,我想添加额外的列 Q1_checkQ2_check,其中第一列用于涉及 Q1 项目的比较第二列是涉及Q2项的比较.

1- For this, I want to add additional columns Q1_check and Q2_check where the first column is for the comparisons that involve Q1 items and the second column is for the comparisons that involve Q2 items.

2- 列可以包含以下值:abovebelowwithin.

2- The columns could contain the following values: above, below and within.

3- 当名为 overall 的列有 NAs 时,额外的列也可能有 NAs.

3- The case when the columns named overall have NAs, then the extra columns could also have NAs.

我找过相关的帖子,例如:将具有取决于另一列的值的列添加到数据框通过将数字列与固定值但我遇到了如下所述的错误.

I have looked for related posts such as: Add column with values depending on another column to a dataframe and Create categories by comparing a numeric column with a fixed value but I am running into errors as discussed below.

我能想到的唯一解决方案是:

The only solution, I can think of is, along these lines:

df$Q1_check <- ifelse(data$Q1_overall < data$Q1_PM, 'below',
                        ifelse(data$Q1_overall > data$Q1_TP, 'above', 
                               ifelse(is.na(data$Q1_overall), NA, 'within')))

但它会导致以下错误:Error in data$Q1_overall : object of type 'closure' is not subsettable.我不明白可能的问题是什么.

But it results in following error: Error in data$Q1_overall : object of type 'closure' is not subsettable. I do not understand what the possible issue could be.

df %>%
  mutate(Regulation = case_when(Q1_overall < Q1_PM ~ 'below',
                                Q1_overall > Q1_TP ~ 'above', 
                                Q1_PM < Q1_overall < Q1_TP, 'within'))

这也会导致错误 Error: unexpected '<'在:"Q1_overall>Q1_TP ~ '上面', Q1_PM <Q1_整体<"

如果(假设)列是这些,如何扩展解决方案:

How can the solution be extended if (let's say) the columns are these:

"Q1 Comm - 01 Scope Thesis"
"Q1 Comm - 02 Scope Project"
"Q1 Comm - 03 Learn Intern"
"Q1 Comm - 04 Biography"
"Q1 Comm - 05 Exhibit"
"Q1 Comm - 06 Social Act"
"Q1 Comm - 07 Post Project"
"Q1 Comm - 08 Learn Plant"
"Q1 Comm - 09 Study Narrate"
"Q1 Comm - 10 Learn Participate"
"Q1 Comm - 11 Write 1"
"Q1 Comm - 12 Read 2"
"Q1 Comm - Overall Study Plan"

我们如何识别Q1 Comm-Overall Study Plan列是什么时候:

How can we identify when the column Q1 Comm - Overall Study Plan is:

1 - Below 所有其他列的 min(),或

1 - Below the min() of all the other columns, or

2 - Above 所有其他列的 max(),或

2 - Above the max() of all the other columns, or

3 - 所有其他列的范围内

3 - Within the range of all the other columns

对于更新的字段,我还包括 dput(df)

For the updated fields, I am also including the dput(df)

dput(df)

structure(list(ï..ID = c(10L, 31L, 225L, 243L), Q1.Comm...01.Scope.Thesis = c(NA, 
2L, 0L, NA), Q1.Comm...02.Scope.Project = c(NA, NA, NA, 2L), 
    Q1.Comm...03.Learn.Intern = c(4L, NA, NA, NA), Q1.Comm...04.Biography = c(NA, 
    NA, NA, 1L), Q1.Comm...05.Exhibit = c(4L, 2L, NA, NA), Q1.Comm...06.Social.Act = c(NA, 
    NA, NA, 3L), Q1.Comm...07.Post.Project = c(NA, NA, 3L, NA
    ), Q1.Comm...08.Learn.Plant = c(NA, NA, NA, 4L), Q1.Comm...09.Study.Narrate = c(NA, 
    NA, 0L, NA), Q1.Comm...10.Learn.Participate = c(4L, NA, NA, 
    NA), Q1.Comm...11.Write.1 = c(NA, 2L, NA, NA), Q1.Comm...12.Read.2 = c(NA, 
    NA, 1L, NA), Q1.Comm...Overall.Study.Plan = c(4L, 1L, 2L, 
    NA), X = c(NA, NA, NA, NA), X.1 = c(NA, NA, NA, NA), X.2 = c(NA, 
    NA, NA, NA)), class = "data.frame", row.names = c(NA, -4L
))

关于如何实现这一目标的任何建议将不胜感激.谢谢!

Any advice on how to achieve this would be greatly appreciated. Thank you!

推荐答案

comparison <- function(dt, group_cols, new_col, compare_col){
  
  dt[, 
     c("min", "max") := transpose(pmap(.SD, range, na.rm = TRUE)), .SDcols = group_cols
     ][,(new_col) := fcase(
       is.na(get(compare_col)), NA_character_,
       get(compare_col) < min, "below",
       get(compare_col) > max, "above",
       default = "within"
     )
     ][]
}

group_cols <- names(df) %>%
 str_subset("^Q[0-9]+") %>%
 str_subset("overall", negate = TRUE) %>%
 split(str_extract(., "^Q[0-9]+"))

new_cols <- names(group_cols) %>% str_c("_check")
compare_cols <- names(group_cols) %>% str_c("_overall")

setDT(df)

pwalk(list(group_cols, new_cols, compare_cols), ~comparison(df, ...))
df[, c("min", "max") := NULL]

这篇关于查找值是否在其他列的范围内的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆