Ifelse in R with multiple categorical conditions
Problem description
I have a data set, dt.train2, with 1500 different observations and 130 variables. One of them is language, and it can be english, french, arabic, ...
I want to create an ifelse expression that assigns 1 for english, 2 for french, 3 for spanish, and 0 for anything else. I have no idea how to do it.
dt.train2[, language_string := ifelse(language == "english",
1,
ifelse(language == "french",
2,
ifelse(language == "spanish",3)]
I'm using this to run a linear model about sales.
Recommended answer
You're almost there with the ifelse(); you just need the final else result (and a couple of missing closing parentheses).
dt.train2[, language_string := ifelse(
language == "english", 1,
ifelse(language == "french", 2,
ifelse(language == "spanish", 3, 0)
)
)
]
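As a flatter alternative to nesting, recent versions of data.table provide fcase(), which takes condition/value pairs plus a default. The following is only a sketch under assumptions: the small dt table is made-up sample data standing in for dt.train2, and it assumes your installed data.table version includes fcase().

```r
library(data.table)

# hypothetical sample data standing in for dt.train2
dt <- data.table(language = c("english", "french", "spanish", "arabic"))

# each condition is followed by the value to assign;
# rows matching no condition get the default
dt[, language_string := fcase(
  language == "english", 1,
  language == "french",  2,
  language == "spanish", 3,
  default = 0
)]
```

All values passed to fcase(), including the default, need to be of the same type, which is why every code here is a plain number.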
A couple of other ways you could do this:
Create a lookup table and join:
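# sample data
library(data.table)
dt = data.table(language = c("english", "french", "spanish", "arabic", "chinese", "pig latin"))
# lookup table: one row per language that gets a non-zero code
lookup = data.table(language = c("english", "french", "spanish"),
language_string = c(1, 2, 3))
# left join keeps every row of dt; unmatched languages get NA
dt2 = merge(dt, lookup, by = "language", all.x = TRUE)
# replace the NAs with the default code, 0
dt2[is.na(language_string), language_string := 0]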
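Note that merge() returns a new table sorted by the join column. If you would rather add the column to the existing table in place and keep its row order, a data.table update join is one option. This is a sketch that reuses the same made-up sample data and lookup table, not part of the original answer:

```r
library(data.table)

# same hypothetical sample data and lookup table as above
dt <- data.table(language = c("english", "french", "spanish", "arabic", "chinese", "pig latin"))
lookup <- data.table(language = c("english", "french", "spanish"),
                     language_string = c(1, 2, 3))

# update join: for rows of dt that match lookup on language,
# copy lookup's language_string (the i. prefix refers to lookup's column)
dt[lookup, on = "language", language_string := i.language_string]

# rows with no match are left as NA; give them the default code, 0
dt[is.na(language_string), language_string := 0]
```

Because the assignment happens by reference, dt keeps its original row order and no second table is created.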
The lookup table method above probably scales best. However, for such a small number of encodings, you could also just set each of them:
# start with the default, 0
dt[, language_string := 0 ]
# then do each of the exceptions
dt[language == "english", language_string := 1]
dt[language == "french", language_string := 2]
dt[language == "spanish", language_string := 3]