Replace a numerical value by NA based on conditions from other columns


Question

I am new to the data.table package, so please excuse my simple question. I have a data set that looks like DT:

library(data.table)

DT <- data.table(a = sample(c("C","M","Y","K"), 100, replace = TRUE),
                 b = sample(c("A","S"), 100, replace = TRUE),
                 f = round(rnorm(n = 100, mean = .90, sd = .08), digits = 2)); DT

I would like to replace any value in column f with NA if it meets a certain condition. For example, to blank out values with f < 0.85 or f > 0.90, I would have the following condition:

DT$a == "C" & DT$b == "S" & DT$f < .85| DT$a == "C" & DT$b == "S" & DT$f >.90



I would also like to have a different condition for each of the categories in columns a and b.

Recommended answer

Using the condition you've stated, but without the DT$ prefix, as the i argument subsets your data.table to the entries that satisfy it. You can then use the j argument to assign NA to f by reference with the := operator. That is,

DT[a == "C" & b == "S" & f < .85 | a == "C" & b == "S" & f >.90, f := NA]
which(is.na(DT$f))
# [1]  3 16 31 89
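The same condition can be written more compactly by stating the shared tests on a and b once and wrapping the two tests on f in parentheses. The sketch below uses a small hand-made table rather than the random data from the question, so the values are illustrative assumptions only; NA_real_ is used in place of NA simply to match the numeric type of f.

```r
library(data.table)

# Small hand-made table (illustrative values, not the question's random data)
DT <- data.table(a = c("C", "C", "C", "M"),
                 b = c("S", "S", "A", "S"),
                 f = c(0.80, 0.88, 0.95, 0.80))

# The tests on a and b appear once; the parentheses are required
# because & binds more tightly than | in R
DT[a == "C" & b == "S" & (f < 0.85 | f > 0.90), f := NA_real_]
DT
```

Only the first row is changed here: it is the only row with a == "C" and b == "S" whose f falls outside the range.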

Edit: after OP's comment and @Joshua's nice suggestion:

`%between%` <- function(x, vals) { x >= vals[1] & x <= vals[2]}
`%nbetween%` <- Negate(`%between%`)
DT[a %in% c("C", "M", "Y", "K") & b == "S" & f %nbetween% c(0.85, 0.90), f := NA]

%nbetween%, which is the negation of %between%, gives the desired result (f < 0.85 or f > 0.90). Also note the use of %in% to check for multiple values of a.
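To see what the two operators return, they can be tried on a plain numeric vector. The vector below is made up for illustration. Note that the bounds are inclusive, so values exactly equal to 0.85 or 0.90 are kept. data.table also exports its own %between% operator, which a top-level definition like this one masks.

```r
# Same definitions as in the answer
`%between%` <- function(x, vals) x >= vals[1] & x <= vals[2]
`%nbetween%` <- Negate(`%between%`)

# Illustrative values, including both boundary values
f <- c(0.80, 0.85, 0.88, 0.90, 0.95)

inside  <- f %between%  c(0.85, 0.90)  # TRUE for 0.85, 0.88, 0.90
outside <- f %nbetween% c(0.85, 0.90)  # TRUE for 0.80, 0.95

inside
outside
```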

Edit 2: Following OP's complete rewrite, I'm afraid there's not much you can do, except group b == "A" and b == "S" together and write one statement per value of a:

`%nbetween%` <- Negate(`%between%`)
DT[a == "M" & b %in% c("A", "S") & f %nbetween% c(.85, .90), f := NA]
DT[a == "Y" & b %in% c("A", "S") & f %nbetween% c(.90, .95), f := NA]  # lower bound must come first
DT[a == "K" & b %in% c("A", "S") & f %nbetween% c(.95, 1.10), f := NA]
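If the number of categories grows, one possible alternative to writing a statement per category is to keep the ranges in a small lookup table and join it onto DT. This is a different technique from the one in the answer, sketched here under assumptions: the table name limits, its columns lo and hi, and all the values are hypothetical placeholders, and the example data is hand-made.

```r
library(data.table)

# Hypothetical lookup table: one allowed range per category of a
limits <- data.table(a  = c("M", "Y", "K"),
                     lo = c(0.85, 0.90, 0.95),
                     hi = c(0.90, 0.95, 1.10))

# Hand-made example data (illustrative values)
DT <- data.table(a = c("M", "M", "Y", "Y", "K", "C"),
                 f = c(0.87, 0.95, 0.92, 0.80, 1.00, 0.50))

# Update join: copy each row's bounds from limits into DT by reference;
# categories without an entry in limits (here "C") get NA bounds
DT[limits, on = "a", `:=`(lo = i.lo, hi = i.hi)]

# Blank out values outside their category's range, skipping rows without bounds
DT[!is.na(lo) & (f < lo | f > hi), f := NA_real_]

# Drop the helper columns again
DT[, c("lo", "hi") := NULL]
DT
```

Adding b to both the lookup table and the on argument would extend the same idea to a separate range per combination of a and b.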
