How do I create a new column based on multiple conditions from multiple columns?


Problem description

I'm trying to add a new column to a data frame based on several conditions from other columns. I have the following data:

> commute <- c("walk", "bike", "subway", "drive", "ferry", "walk", "bike", "subway", "drive", "ferry", "walk", "bike", "subway", "drive", "ferry")
> kids <- c("Yes", "Yes", "No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No", "No", "Yes", "No", "Yes")
> distance <- c(1, 12, 5, 25, 7, 2, "", 8, 19, 7, "", 4, 16, 12, 7)
> 
> df = data.frame(commute, kids, distance)
> df
   commute kids distance
1     walk  Yes        1
2     bike  Yes       12
3   subway   No        5
4    drive   No       25
5    ferry  Yes        7
6     walk  Yes        2
7     bike   No         
8   subway   No        8
9    drive  Yes       19
10   ferry  Yes        7
11    walk   No         
12    bike   No        4
13  subway  Yes       16
14   drive   No       12
15   ferry  Yes        7

If the following three conditions are all met:

commute = walk OR bike OR subway OR ferry
AND
kids = Yes
AND
distance is less than 10

Then I'd like a new column called get.flyer to equal "Yes". The final data frame should look like this:

   commute kids distance get.flyer
1     walk  Yes        1       Yes
2     bike  Yes       12          
3   subway   No        5          
4    drive   No       25          
5    ferry  Yes        7       Yes
6     walk  Yes        2       Yes
7     bike   No                   
8   subway   No        8          
9    drive  Yes       19          
10   ferry  Yes        7       Yes
11    walk   No                   
12    bike   No        4          
13  subway  Yes       16          
14   drive   No       12          
15   ferry  Yes        7       Yes


Recommended answer

We can use %in% to test a column against several allowed values, and & to require that all of the conditions are TRUE.

library(dplyr)
df %>%
     mutate(get.flyer = c("", "Yes")[(commute %in% c("walk", "bike", "subway", "ferry") & 
           as.character(kids) == "Yes" & 
           as.numeric(as.character(distance)) < 10)+1] )
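
The c("", "Yes")[... + 1] idiom above may look cryptic, so here is a minimal base R sketch of what it does. The names cond, idx and labels are illustrative only and do not appear in the answer.

```r
# Logical values coerce to numbers in arithmetic: FALSE -> 0, TRUE -> 1.
cond <- c(TRUE, FALSE, TRUE)

# Adding 1 turns them into valid 1-based positions: FALSE -> 1, TRUE -> 2.
idx <- cond + 1
idx
# [1] 2 1 2

# Indexing the two-element lookup vector picks "" for FALSE and "Yes" for TRUE.
labels <- c("", "Yes")[idx]
labels
```

Position 1 of the lookup vector is the value for FALSE and position 2 is the value for TRUE, which is why the empty string comes first.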






It is better to create the data.frame with stringsAsFactors = FALSE. In R versions before 4.0.0 the default is TRUE, and str(df) then shows that every column is a factor, which is why the code above wraps the columns in as.character(). Also, missing values are better written as NA than as "": a single empty string turns the whole distance vector into character, whereas NA keeps the column numeric.
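
The effect of "" versus NA, and the reason for the as.numeric(as.character(...)) wrapper, can be checked with a short base R sketch. The vectors distance_chr, distance_num and f are made up for illustration.

```r
# A single "" forces every element of the vector to character.
distance_chr <- c(1, 12, "", 8)
class(distance_chr)   # "character"

# With NA as the missing value the vector stays numeric.
distance_num <- c(1, 12, NA, 8)
class(distance_num)   # "numeric"

# If the column ends up as a factor, as.numeric() alone returns the
# internal level codes rather than the distances.
f <- factor(distance_chr)
codes <- as.numeric(f)

# Going through as.character() first recovers the values; "" becomes NA.
values <- as.numeric(as.character(f))
values
```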

If we rewrite the creation of 'df' along those lines:

distance <- c(1, 12, 5, 25, 7, 2, NA, 8, 19, 7, NA, 4, 16, 12, 7)
df1 <- data.frame(commute, kids, distance, stringsAsFactors=FALSE)

The code above can then be simplified:

df1 %>%
    mutate(get.flyer = c("", "Yes")[(commute %in% c("walk", "bike", "subway", "ferry") &
        kids == "Yes" &
        distance < 10)+1] )

Some people find ifelse easier to read:

df1 %>% 
   mutate(get.flyer = ifelse(commute %in% c("walk", "bike", "subway", "ferry") & 
                kids == "Yes" &
                distance < 10, 
                          "Yes", ""))

This can also be done easily with base R methods:

df1$get.flyer <- with(df1, ifelse(commute %in% c("walk", "bike", "subway", "ferry") & 
              kids == "Yes" & 
              distance < 10, 
                       "Yes", ""))
