Nested ifelse with varying columns in data.table

Views: 133 · Posted: 2017/7/13 21:48:15 · Tags: r, dataframe, data.table, dplyr


Question

I need to compute a "best value" for each row across selected columns of a data.table. The best value for a row is the value of the first non-NA column, taken in the given order of the selected columns.

As a requirement, the columns to include may vary in order or number. In addition, the name of the column giving the best value should be stored for each row.

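library(data.table)
library(magrittr)
n <- 7
# The output below dates from 2017; R >= 3.6.0 changed the default sampling
# algorithm, so RNGkind(sample.kind = "Rounding") may be needed to reproduce it.
set.seed(1234)
dt <- sample.int(100, n*5, replace = TRUE) %>%
  ifelse(. < 35, NA, .) %>%   # replace values below 35 with NA
  matrix(, nrow = n) %>%      # reshape into n rows
  as.data.table()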

The sample data.table is

   V1 V2 V3 V4 V5
1: NA NA NA NA 84
2: 63 67 84 NA NA
3: 61 52 NA NA 46
4: 63 70 NA NA NA
5: 87 55 NA 82 NA
6: 65 NA NA 53 51
7: NA 93 NA 92 NA

The columns to be included, in the given order, are

selected_cols <- c("V3", "V4", "V1")

Expected result with hard-coded nested `ifelse`

The hard-coded version

dt[, best_value := ifelse(!is.na(V3), V3, ifelse(!is.na(V4), V4, V1))]

will give the expected result for the best value

   V1 V2 V3 V4 V5 best_value
1: NA NA NA NA 84         NA
2: 63 67 84 NA NA         84
3: 61 52 NA NA 46         61
4: 63 70 NA NA NA         63
5: 87 55 NA 82 NA         82
6: 65 NA NA 53 51         53
7: NA 93 NA 92 NA         92

but it still doesn't show which column the best value was taken from.

In row 2, column V3 already has a non-NA value. For rows 5, 6, and 7, the values are taken from column V4. Finally, column V1 gives the values for rows 3 and 4, where both V3 and V4 are NA. Row 1 contains NA because all columns under consideration are NA.
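The nested `ifelse` above is effectively a "first non-NA" lookup, so the best value alone can be sketched for any selection with `fcoalesce()`. This is only an illustration, not part of the original post: it assumes a data.table version that provides `fcoalesce()` (1.12.4 or later) and that all selected columns have the same type. The sample data is typed in by hand from the table above rather than re-sampled.

```r
library(data.table)

# Sample data copied by hand from the table in the question
dt <- data.table(
  V1 = c(NA, 63L, 61L, 63L, 87L, 65L, NA),
  V2 = c(NA, 67L, 52L, 70L, 55L, NA, 93L),
  V3 = c(NA, 84L, NA, NA, NA, NA, NA),
  V4 = c(NA, NA, NA, NA, 82L, 53L, 92L),
  V5 = c(84L, NA, 46L, NA, NA, 51L, NA)
)
selected_cols <- c("V3", "V4", "V1")

# fcoalesce(x, y) keeps x where it is not NA and falls back to y otherwise;
# Reduce() applies it across the selected columns in the given order.
dt[, best_value := Reduce(fcoalesce, .SD), .SDcols = selected_cols]
dt$best_value
```

This covers only the value, not the name of the column it came from.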

Using a for loop over the selected columns and some data.table features

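# start with empty result columns
dt[, best_value := NA_integer_]
dt[, best_col := NA_character_]
for (x in selected_cols) {
  # for rows still lacking a best value: record the column name if column x is not NA
  dt[is.na(best_value), best_col := ifelse(!is.na(.SD), names(.SD), NA), .SDcols = x]
  # then copy the value of column x for those rows (it may still be NA)
  dt[is.na(best_value), best_value := .SD, .SDcols = x]
}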

we get the full expected result

   V1 V2 V3 V4 V5 best_value best_col
1: NA NA NA NA 84         NA       NA
2: 63 67 84 NA NA         84       V3
3: 61 52 NA NA 46         61       V1
4: 63 70 NA NA NA         63       V1
5: 87 55 NA 82 NA         82       V4
6: 65 NA NA 53 51         53       V4
7: NA 93 NA 92 NA         92       V4

In addition, the vector of columns to be included can be changed easily.
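To illustrate how a different selection changes the result, the loop can be wrapped in a small helper. This is a sketch under assumptions, not the code from the post: `add_best()` is a hypothetical name, and the two assignments are folded into one `:=` call using `get()`. The sample data is again typed in by hand from the table above.

```r
library(data.table)

# Sample data copied by hand from the table in the question
dt <- data.table(
  V1 = c(NA, 63L, 61L, 63L, 87L, 65L, NA),
  V2 = c(NA, 67L, 52L, 70L, 55L, NA, 93L),
  V3 = c(NA, 84L, NA, NA, NA, NA, NA),
  V4 = c(NA, NA, NA, NA, 82L, 53L, 92L),
  V5 = c(84L, NA, 46L, NA, NA, 51L, NA)
)

# Hypothetical helper: (re)creates best_value and best_col by reference,
# scanning the columns named in `cols` in the order given.
add_best <- function(dt, cols) {
  dt[, c("best_value", "best_col") := list(NA_integer_, NA_character_)]
  for (x in cols) {
    # rows without a best value so far where column x holds a value
    dt[is.na(best_value) & !is.na(get(x)),
       `:=`(best_value = get(x), best_col = x)]
  }
  invisible(dt)
}

add_best(dt, c("V3", "V4", "V1"))
first <- dt[, .(best_value, best_col)]  # snapshot of the first selection
add_best(dt, c("V5", "V2"))             # different columns, different order
dt
```

With the second selection, rows 1, 3, and 6 take their value from V5 and the remaining rows fall back to V2.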

However, the for loop with two statements looks rather clumsy to me and not very data.table-like.

Is there a better way to achieve these results with data.table or dplyr, or even in base R?

Recommended answer

Working on your 'for' loop and taking advantage of the list-like structure of a data.table:
# result vectors, filled in column by column
ans_col = rep_len(NA_character_, nrow(dt))
ans_val = rep_len(NA_real_, nrow(dt))
for(col in selected_cols) {
    # rows not yet assigned where the current column has a value
    i = is.na(ans_col) & (!is.na(dt[[col]]))
    ans_col[i] = col
    ans_val[i] = dt[[col]][i]
}
data.frame(ans_val, ans_col)
#  ans_val ans_col
#1      NA    <NA>
#2      84      V3
#3      61      V1
#4      63      V1
#5      82      V4
#6      53      V4
#7      92      V4
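A loop-free variant of the same idea can be sketched with base R's `max.col()`, which finds the position of the first non-NA entry per row. This is an alternative technique, not the answer's method, and it assumes the selected columns share one type, because `as.matrix()` would otherwise coerce them. The sample data is typed in by hand from the table in the question.

```r
library(data.table)

# Sample data copied by hand from the table in the question
dt <- data.table(
  V1 = c(NA, 63L, 61L, 63L, 87L, 65L, NA),
  V2 = c(NA, 67L, 52L, 70L, 55L, NA, 93L),
  V3 = c(NA, 84L, NA, NA, NA, NA, NA),
  V4 = c(NA, NA, NA, NA, 82L, 53L, 92L),
  V5 = c(84L, NA, 46L, NA, NA, 51L, NA)
)
selected_cols <- c("V3", "V4", "V1")

# Matrix of the selected columns, in the requested order
m <- as.matrix(dt[, selected_cols, with = FALSE])
present <- !is.na(m)

# Position of the first TRUE in each row (leftmost maximum)
first_idx <- max.col(present, ties.method = "first")
# Rows with no value at all would point at column 1, so mark them as NA
first_idx[rowSums(present) == 0] <- NA

dt[, best_col := selected_cols[first_idx]]
# Matrix indexing by (row, column) pairs; an NA index yields NA
dt[, best_value := m[cbind(seq_len(nrow(m)), first_idx)]]
dt
```

The loop in the answer avoids the matrix conversion and may be the safer choice when the selected columns have different types.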

