How to create a dummy variable in R using the ifelse() command
Question
I am trying to create a dummy variable in R. My restaurant dataset has a categorical variable, type, with many categories. I want vegan restaurants to have the value 1 and all other restaurants 0, so that the regression summary shows just three coefficients: the intercept b0, b1 for reviews_number, and b2 for vegan restaurants. For example, a non-vegan restaurant would be y = b0 + b1(reviews_number), and a vegan restaurant would be y = b0 + b1(reviews_number) + b2(Vegan). The hint is to use the ifelse() command, but I can't seem to reduce the coefficients to just three. Otherwise, I would need to create a value for each type of restaurant separately...
Recommended answer
Assuming your data frame is called df, you can create your dummy variable (Vegan) using:
df$Vegan <- ifelse(df$type == "Vegan", 1, 0) # where variable type is type of restaurants
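As a minimal sketch of how this dummy fits the three-coefficient model from the question, assuming a made-up toy data frame: the response column rating and all the values below are hypothetical, and only reviews_number, type and Vegan come from the question.

```r
# Hypothetical toy data: the column `rating` and every value here are
# invented purely for illustration.
df <- data.frame(
  rating = c(4.1, 3.8, 4.5, 3.2, 4.7, 3.9, 4.4, 3.5),
  reviews_number = c(120, 45, 300, 20, 410, 80, 150, 60),
  type = c("Vegan", "Italian", "Vegan", "Fast food",
           "Vegan", "Italian", "Cafe", "Fast food")
)

# 1 for vegan restaurants, 0 for every other type
df$Vegan <- ifelse(df$type == "Vegan", 1, 0)

# Only the dummy enters the model, not `type` itself, so the fit has
# three coefficients: (Intercept), reviews_number and Vegan
fit <- lm(rating ~ reviews_number + Vegan, data = df)
summary(fit)
```

Note that ifelse() returns NA wherever type is missing, so rows with an unknown restaurant type would be dropped by lm() under its default NA handling rather than counted as non-vegan.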
However, note that if type is stored as a factor, you can also get a coefficient for each type of restaurant (relative to the reference level) by fitting y = b0 + b1(reviews_number) + b2(type), i.e. y~reviews+type, as pointed out by @mlt.
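A rough sketch of the factor approach, again on hypothetical toy data (the response rating, the values, and the choice of "Italian" as the reference level are all assumptions made for illustration):

```r
# Hypothetical toy data, invented for illustration only
df <- data.frame(
  rating = c(4.1, 3.8, 4.5, 3.2, 4.7, 3.9, 4.4, 3.5),
  reviews_number = c(120, 45, 300, 20, 410, 80, 150, 60),
  type = c("Vegan", "Italian", "Vegan", "Fast food",
           "Vegan", "Italian", "Cafe", "Fast food")
)

# Store type as a factor and pick the reference level explicitly;
# "Italian" is an arbitrary choice for this sketch
df$type <- relevel(factor(df$type), ref = "Italian")

# lm() expands the factor into one dummy per non-reference level
fit_all <- lm(rating ~ reviews_number + type, data = df)
coef(fit_all)
```

Here each type coefficient is the difference from the reference level, so the model has one coefficient per non-reference type rather than three in total. If the goal is a single vegan versus non-vegan contrast, the ifelse() dummy is the more direct route.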