Create new column with binary data based on several columns


Question

I have a dataframe in which I want to create a new column with 0/1 (which would represent absence/presence of a species) based on the records in previous columns. I've been trying this:

update_cat$bobpresent <- NA #creating the new column

x <- c("update_cat$bob1999", "update_cat$bob2000", "update_cat$bob2001","update_cat$bob2002", "update_cat$bob2003", "update_cat$bob2004", "update_cat$bob2005", "update_cat$bob2006","update_cat$bob2007", "update_cat$bob2008", "update_cat$bob2009") #these are the names of the columns I want the new column to base its results in

bobpresent <- function(x){
  if(x==NA)
    return(0)
  else
    return(1)
} # if all the previous columns are NA then the new column should be 0, otherwise it should be 1

update_cat$bobpresence <- sapply(update_cat$bobpresent, bobpresent) #apply the function to the new column

Everything goes fine until the last line, where I get this error:

Error in if (x == NA) return(0) else return(1) : 
  missing value where TRUE/FALSE needed

Can somebody please advise me? Your help will be much appreciated.

Recommended answer

Comparing anything with NA yields NA, so x == NA always evaluates to NA, whatever x is. To check whether a value is NA, use the is.na function, for example:

> NA == NA
[1] NA
> is.na(NA)
[1] TRUE

The if statement inside the function you pass to sapply needs a TRUE or FALSE condition, but x == NA gives it NA instead, hence the error message. You can fix that by rewriting your function like this:

bobpresent <- function(x) { ifelse(is.na(x), 0, 1) }
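
As a quick check, here is a minimal sketch of how the rewritten function behaves on a vector containing missing values. The sample vector `counts` is made up for illustration and is not from the original data. Because ifelse and is.na are both vectorized, the function can be called on the whole vector directly.

```r
bobpresent <- function(x) { ifelse(is.na(x), 0, 1) }

# Hypothetical sample values, not taken from the original data frame
counts <- c(3, NA, 0, NA)

# NA entries map to 0, everything else (including a recorded 0) maps to 1
result <- bobpresent(counts)
print(result)
```

Note that a recorded value of 0 still counts as "not NA" here, so it maps to 1.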

In any case, based on your original post I don't understand what you're trying to do. This change only fixes the error you get with sapply, but fixing the logic of your program is a different matter, and there is not enough information in your post.
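
If the intent is what the comment in the question suggests (0 when all of the yearly columns are NA, 1 otherwise), one possible approach is to count the non-NA entries per row. This is only a sketch under that assumption: the toy data frame below stands in for update_cat with made-up values, and `year_cols` and `n_records` are hypothetical helper names.

```r
# Toy data frame standing in for update_cat (values are made up)
update_cat <- data.frame(
  bob1999 = c(NA, 2, NA),
  bob2000 = c(NA, NA, NA),
  bob2001 = c(NA, 1, 5)
)

# Column names as plain strings, without the "update_cat$" prefix
year_cols <- paste0("bob", 1999:2001)

# Count non-NA entries in each row across the yearly columns
n_records <- rowSums(!is.na(update_cat[, year_cols, drop = FALSE]))

# 1 if at least one yearly column has a record, 0 if all are NA
update_cat$bobpresent <- as.integer(n_records > 0)

print(update_cat$bobpresent)
```

With the real data, `year_cols` would be extended to cover bob1999 through bob2009.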
