Adding a new column with conditional values using ifelse

Question

I have a data frame with more than 400,000 observations, and I'm trying to add a column whose values depend on another column, and sometimes on several columns.

Here is a simplified example of what I'm trying to do:

# Creating a data frame 

M <- data.frame(c("A","B","C"),c(5,100,60))

names(M) <- c("Letter","Number")

#adding a column 

M$Size <- NA

# if Number <= 50 Size is small, 
# if Number is between 50 and 70, Size is Medium
# if Number is Bigger than 70, Size is Big

ifelse (M$Number <=50, M$Size <-"Small",
        ifelse(M$Number <= 70,
        M$Size <- "Medium",
        M$Size <- "Big"
        ))

When I run the code, the output I get is:

[1] "Small"  "Big"    "Medium"

But the Size column in M always ends up holding the value from the last branch of the ifelse call:

> print (M)
  Letter Number Size
1      A      5  Big
2      B    100  Big
3      C     60  Big

The result I want:

> print (M)
  Letter Number   Size
1      A      5  Small
2      B    100    Big
3      C     60 Medium

I can get the result I want by subsetting the data frame on each condition and combining the pieces with rbind, but the code would be very long, and since the original data frame is large, it would also take longer to run. How can I fix this?

Recommended answer

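Use cut, which bins a numeric vector into labeled intervals (right-closed by default, so 50 falls in the first bin and 70 in the second):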

M$Size <- cut(M$Number, breaks = c(-Inf, 50, 70, Inf), 
                        labels = c("small", "medium", "large"))
#  Letter Number   Size
#1      A      5  small
#2      B    100  large
#3      C     60 medium
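
For completeness, here is a sketch of why the original attempt behaves as it does and how ifelse itself could be used. In the question's code, each `M$Size <- "..."` inside ifelse is an ordinary assignment that overwrites the whole column with a single string as soon as that argument is evaluated, so the column keeps whichever assignment ran last ("Big"). The vector that ifelse returns is only printed, never stored. Passing plain values as the branches and assigning the result once avoids this. The column name `SizeCut` below is only an illustrative name, not part of the original question.

```r
# Rebuild the example data frame from the question.
M <- data.frame(Letter = c("A", "B", "C"), Number = c(5, 100, 60))

# Nested ifelse(): the branches are plain values, and the vector that
# ifelse() returns is assigned to the new column exactly once.
M$Size <- ifelse(M$Number <= 50, "Small",
                 ifelse(M$Number <= 70, "Medium", "Big"))

# The same binning with cut(), using the labels from the question.
# With the default right = TRUE the intervals are
# (-Inf, 50], (50, 70] and (70, Inf].
# "SizeCut" is a hypothetical column name used only for comparison.
M$SizeCut <- cut(M$Number, breaks = c(-Inf, 50, 70, Inf),
                 labels = c("Small", "Medium", "Big"))

print(M)
```

Note that ifelse returns a character vector here, while cut returns a factor. Both are vectorized, so neither needs the subset-and-rbind approach described in the question.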
