Ifelse in R with multiple categorical conditions
Question
I have a data set, dt.train2, with 1500 different observations and 130 variables. One of them is languages, and it can be english, french, arabic, ...
I want to create an ifelse expression that assigns 1 for english, 2 for french, 3 for spanish and 0 for anything else. I have no idea how to do it.
dt.train2[, language_string := ifelse(language == "english",
1,
ifelse(language == "french",
2,
ifelse(language == "spanish",3)]
I'm using this to run a linear model of sales.
Recommended answer
You're almost there with the ifelse(); you just need the final else result (and a couple of missing closing parentheses).
dt.train2[, language_string := ifelse(
language == "english", 1,
ifelse(language == "french", 2,
ifelse(language == "spanish", 3, 0)
)
)
]
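As a quick check, the corrected nested ifelse() can be run on a small made-up table. This is only a sketch: dt.check and its five sample languages are assumptions for illustration, not the asker's dt.train2.

```r
library(data.table)

# hypothetical sample data standing in for dt.train2
dt.check = data.table(language = c("english", "french", "spanish", "arabic", "chinese"))

# each ifelse() supplies the "no" branch of the one before it;
# the innermost one ends with the default, 0
dt.check[, language_string := ifelse(
  language == "english", 1,
  ifelse(language == "french", 2,
    ifelse(language == "spanish", 3, 0)
  )
)]

# language_string is now 1, 2, 3, 0, 0
print(dt.check)
```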
A couple of other ways you could do this:
Make a lookup table and join:
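library(data.table)

# sample data
dt = data.table(language = c("english", "french", "spanish", "arabic", "chinese", "pig latin"))
# lookup table: one row per language that gets a non-zero code
lookup = data.table(language = c("english", "french", "spanish"),
                    language_string = c(1, 2, 3))
# left join keeps every row of dt; unmatched languages get NA
dt2 = merge(dt, lookup, by = "language", all.x = TRUE)
# replace the NAs with the default code, 0
dt2[is.na(language_string), language_string := 0]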
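Note that merge() sorts the result by the join column by default, so dt2 comes back in alphabetical order of language. If the original row order matters, a variant of the same lookup idea is an update join, which modifies the table in place. The sketch below uses hypothetical names (dt.join, lookup.join) to keep it separate from the objects above.

```r
library(data.table)

# hypothetical sample data and lookup table
dt.join = data.table(language = c("english", "french", "spanish", "arabic", "chinese", "pig latin"))
lookup.join = data.table(language = c("english", "french", "spanish"),
                         language_string = c(1, 2, 3))

# start with the default, 0
dt.join[, language_string := 0]

# update join: for rows of dt.join that match lookup.join on language,
# overwrite language_string with the lookup value (the i. prefix refers to lookup.join)
dt.join[lookup.join, on = "language", language_string := i.language_string]

# rows stay in their original order: 1, 2, 3, 0, 0, 0
print(dt.join)
```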
The lookup table method above is probably the nicest for scalability. However, for such a small number of encodings, you could also just set each of them:
# start with the default, 0
dt[, language_string := 0 ]
# then do each of the exceptions
dt[language == "english", language_string := 1]
dt[language == "french", language_string := 2]
dt[language == "spanish", language_string := 3]
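One more option, not part of the original answer and assuming data.table 1.13.0 or later is installed, is fcase(), which takes condition/value pairs plus a default and avoids the nesting. The table name dt.fcase is hypothetical.

```r
library(data.table)

# hypothetical sample data
dt.fcase = data.table(language = c("english", "french", "spanish", "arabic", "chinese", "pig latin"))

# conditions are checked in order; the first TRUE one decides the value,
# and rows matching none of them get the default
dt.fcase[, language_string := fcase(
  language == "english", 1,
  language == "french",  2,
  language == "spanish", 3,
  default = 0
)]

print(dt.fcase)
```

All values, including the default, need to be of the same type (here, all doubles).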