Nested ifelse statement

Problem description

I'm still learning how to translate SAS code into R, and I'm getting warnings. I need to understand where I'm going wrong. I want to create a variable that summarizes and distinguishes three statuses of a population: mainland, overseas, foreigner. I have a data set with 2 variables:

  • id nationality: idnat (french, foreigner),

If idnat is french then:

  • id birthplace: idbp (mainland, colony, overseas)

I want to summarize the info from idnat and idbp into a new variable called idnat2:

  • status: k (mainland, overseas, foreigner)

All these variables use "character type".

Expected results in column idnat2:

   idnat     idbp   idnat2
1  french mainland mainland
2  french   colony overseas
3  french overseas overseas
4 foreign  foreign  foreign


Here is the SAS code I want to translate into R:

if idnat = "french" then do;
   if idbp in ("overseas","colony") then idnat2 = "overseas";
   else idnat2 = "mainland";
end;
else idnat2 = "foreigner";
run;


Here is my attempt in R:

if(idnat=="french"){
    idnat2 <- "mainland"
} else if(idbp=="overseas"|idbp=="colony"){
    idnat2 <- "overseas"
} else {
    idnat2 <- "foreigner"
}

I receive this warning:

Warning message:
In if (idnat=="french") { :
  the condition has length > 1 and only the first element will be used

I was advised to use a "nested ifelse" instead because it is simpler, but I get more warnings:

idnat2 <- ifelse (idnat=="french", "mainland",
        ifelse (idbp=="overseas"|idbp=="colony", "overseas")
      )
            else (idnat2 <- "foreigner")

According to the warning message, the length is greater than 1, so only what's between the first brackets will be taken into account. Sorry, but I don't understand what this length has to do with my code. Does anybody know where I'm going wrong?

Solution

If you have used any spreadsheet application, you will know the basic function if(), with the syntax:

if(<condition>, <yes>, <no>)

The syntax is exactly the same for ifelse() in R:

ifelse(<condition>, <yes>, <no>)

The only difference from if() in a spreadsheet application is that R's ifelse() is vectorized (it takes vectors as input and returns a vector as output). Consider the following comparison of formulas in a spreadsheet application and in R, for an example where we want to test whether a > b and return 1 if so and 0 if not.

In a spreadsheet:

  A  B C
1 3  1 =if(A1 > B1, 1, 0)
2 2  2 =if(A2 > B2, 1, 0)
3 1  3 =if(A3 > B3, 1, 0)

In R:

> a <- 3:1; b <- 1:3
> ifelse(a > b, 1, 0)
[1] 1 0 0

ifelse() can be nested in many ways:

ifelse(<condition>, <yes>, ifelse(<condition>, <yes>, <no>))

ifelse(<condition>, ifelse(<condition>, <yes>, <no>), <no>)

ifelse(<condition>, 
       ifelse(<condition>, <yes>, <no>), 
       ifelse(<condition>, <yes>, <no>)
      )

ifelse(<condition>, <yes>, 
       ifelse(<condition>, <yes>, 
              ifelse(<condition>, <yes>, <no>)
             )
       )
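
As a minimal sketch of the last pattern (the vector x and the three labels are made up for illustration), a chained ifelse() picks the first branch whose condition holds for each element:

```r
# Hypothetical input: one negative value, one zero, one positive value.
x <- c(-2, 0, 5)

# The outer test handles negatives; the inner ifelse() splits the rest.
label <- ifelse(x < 0, "negative",
                ifelse(x == 0, "zero", "positive"))

label
# [1] "negative" "zero"     "positive"
```

Every condition is evaluated on the whole vector, so the result always has the same length as x.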

To compute the idnat2 column, you can use:

df <- read.table(header=TRUE, text="
idnat idbp idnat2
french mainland mainland
french colony overseas
french overseas overseas
foreign foreign foreign"
)

with(df,
     ifelse(idnat == "french",
            ifelse(idbp %in% c("overseas", "colony"), "overseas", "mainland"),
            "foreign"))
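
The with() call only prints its result. A small sketch of how you might store it as a new column and check it against the expected values (here the data frame is rebuilt with just the two input columns, and stringsAsFactors = FALSE is my own choice to keep the columns as character):

```r
# Rebuild the example data with only the two input columns.
df <- data.frame(
  idnat = c("french", "french", "french", "foreign"),
  idbp  = c("mainland", "colony", "overseas", "foreign"),
  stringsAsFactors = FALSE
)

# Outer ifelse(): french vs. not french.
# Inner ifelse(): overseas or colony vs. mainland.
df$idnat2 <- with(df,
  ifelse(idnat == "french",
         ifelse(idbp %in% c("overseas", "colony"), "overseas", "mainland"),
         "foreign"))

df$idnat2
# [1] "mainland" "overseas" "overseas" "foreign"
```

This mirrors the structure of the SAS code: the inner test is only relevant for the rows where idnat is "french".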

See the R documentation for ifelse() (run ?ifelse).

What does "the condition has length > 1 and only the first element will be used" mean? Let's see:

> # What is the first condition really testing?
> with(df, idnat=="french")
[1]  TRUE  TRUE  TRUE FALSE
> # This is the result of a vectorized comparison: every element of idnat is
> # tested for equality with the string "french".
> # A vector of logical values is returned (the same length as idnat)
> df$idnat2 <- with(df,
+   if(idnat=="french"){
+   idnat2 <- "xxx"
+   }
+   )
Warning message:
In if (idnat == "french") { :
  the condition has length > 1 and only the first element will be used
> # Note that the first element of the comparison is TRUE, and that's why we get:
> df
    idnat     idbp idnat2
1  french mainland    xxx
2  french   colony    xxx
3  french overseas    xxx
4 foreign  foreign    xxx
> # There really is logic in it; you just have to get used to it
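
In other words, if() wants a single TRUE or FALSE. A SAS data step applies the test to one row at a time implicitly; with if() in R you would have to do that yourself. A rough sketch of such a loop, for illustration only (ifelse() stays the shorter option), with the vectors rebuilt by hand. Note that since R 4.2.0 a condition of length > 1 in if() is an error rather than a warning.

```r
# Example input, typed in by hand for this sketch.
idnat <- c("french", "french", "french", "foreign")
idbp  <- c("mainland", "colony", "overseas", "foreign")

idnat2 <- character(length(idnat))  # pre-allocate the result vector

for (i in seq_along(idnat)) {
  # idnat[i] and idbp[i] are single values, so each condition has length 1.
  if (idnat[i] == "french") {
    if (idbp[i] %in% c("overseas", "colony")) {
      idnat2[i] <- "overseas"
    } else {
      idnat2[i] <- "mainland"
    }
  } else {
    idnat2[i] <- "foreign"
  }
}

idnat2
# [1] "mainland" "overseas" "overseas" "foreign"
```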

Can I still use if()? Yes, you can, but the syntax is not so cool :)

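# Scalar helper: if() is fine here because x is a single value
test <- function(x) {
  if (x == "french") {
    "french"
  } else {
    "not really french"
  }
}

# Call test() once per element of idnat
apply(array(df[["idnat"]]), MARGIN = 1, FUN = test)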
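
If you go this route, vapply() is one possible alternative to apply(array(...)): it also calls the function once per element, and it checks that every call returns a single string. A sketch under that assumption, with the input vector typed in by hand:

```r
# Same scalar helper as above, written compactly.
test <- function(x) {
  if (x == "french") "french" else "not really french"
}

idnat <- c("french", "french", "french", "foreign")

# FUN.VALUE = character(1) declares that each call must return one string;
# USE.NAMES = FALSE drops the names vapply() would otherwise copy from idnat.
labels <- vapply(idnat, test, FUN.VALUE = character(1), USE.NAMES = FALSE)

labels
# [1] "french"            "french"            "french"            "not really french"
```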

If you are familiar with SQL, you can also use a CASE statement via the sqldf package.
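
A possible sketch of the SQL route, assuming the sqldf package is installed (the block skips itself otherwise). The query text is my own wording of the same three-way rule, not something taken from the sqldf documentation:

```r
df <- data.frame(
  idnat = c("french", "french", "french", "foreign"),
  idbp  = c("mainland", "colony", "overseas", "foreign"),
  stringsAsFactors = FALSE
)

# Assumption: sqldf is available; nothing runs if it is not installed.
if (requireNamespace("sqldf", quietly = TRUE)) {
  suppressPackageStartupMessages(library(sqldf))

  # CASE takes the first WHEN branch that matches, else the ELSE value.
  res <- sqldf("
    SELECT idnat, idbp,
           CASE
             WHEN idnat = 'french' AND idbp IN ('overseas', 'colony') THEN 'overseas'
             WHEN idnat = 'french' THEN 'mainland'
             ELSE 'foreign'
           END AS idnat2
    FROM df
  ")
  print(res)
}
```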
