R ddply, applying if and ifelse functions

Problem description

I'm trying to apply a function to a data frame using ddply from the plyr package, but I'm getting some results that I don't understand. I have 3 questions about the results.

Given:

mydf<- data.frame(c(12,34,9,3,22,55),c(1,2,1,1,2,2)
                  , c(0,1,2,1,1,2))
colnames(mydf)[1] <- 'n'
colnames(mydf)[2] <- 'x'
colnames(mydf)[3] <- 'x1'

mydf looks like this:

   n x x1
1 12 1  0
2 34 2  1
3  9 1  2
4  3 1  1
5 22 2  1
6 55 2  2

Question #1

If I do:

k <- function(x) {
  mydf$z <- ifelse(x == 1, 0, mydf$n)
  return (mydf)
}
mydf <- ddply(mydf, c("x") , .fun = k, .inform = TRUE)

I get the following error:

Error in `$<-.data.frame`(`*tmp*`, "z", value = structure(c(12, 34, 9,  : 
  replacement has 3 rows, data has 6
Error: with piece 1: 
   n x x1
1 12 1  0
2  9 1  2
3  3 1  1

I get this error regardless of whether I specify the variable to split by as c("x"), "x", or .(x). I don't understand why I'm getting this error message.

Question #2

But what I really want to do is set up an if/else function, because my dataset has variables x1, x2, x3, and x4 and I want to take those variables into account as well. But when I try something simple such as:

j <- function(x) {
  if(x == 1){
    mydf$z <- 0
  } else {
    mydf$z <- mydf$n
  }
  return(mydf)
  }

mydf <- ddply(mydf, x, .fun = j, .inform = TRUE)

I get:

Warning messages:
1: In if (x == 1) { :
  the condition has length > 1 and only the first element will be used
2: In if (x == 1) { :
  the condition has length > 1 and only the first element will be used

Question #3

I'm confused about when to use function() and when to use function(x). Using function() for either j() or k() gives me a different error:

Error in .fun(piece, ...) : unused argument (piece)
Error: with piece 1: 
    n x x1  z
1  12 1  0 12
2   9 1  2  9
3   3 1  1  3
4  12 1  0 12
5   9 1  2  9
6   3 1  1  3
7  12 1  0 12
8   9 1  2  9
9   3 1  1  3
10 12 1  0 12
11  9 1  2  9
12  3 1  1  3

where column z is not correct. Yet I see a lot of functions written as function().

I sincerely appreciate any comments that can help me out with this.

Recommended answer

There's a lot that needs explaining here. Let's start with the simplest case. In your first example, all you need is:

mydf$z <- with(mydf,ifelse(x == 1,0,n))

An equivalent ddply solution might look like this:

ddply(mydf,.(x),transform,z = ifelse(x == 1,0,n))
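
As a quick check, here is a minimal sketch that rebuilds the example data and runs both versions, assuming the plyr package is installed. The names a and b are only illustrative. Note that ddply returns its rows grouped by x, so the row order differs from the original mydf.

```r
library(plyr)

# Example data from the question
mydf <- data.frame(n  = c(12, 34, 9, 3, 22, 55),
                   x  = c(1, 2, 1, 1, 2, 2),
                   x1 = c(0, 1, 2, 1, 1, 2))

# Vectorised version: ifelse works on the whole column, no grouping needed
a <- mydf
a$z <- with(a, ifelse(x == 1, 0, n))

# ddply version: transform() is applied to each piece of mydf
b <- ddply(mydf, .(x), transform, z = ifelse(x == 1, 0, n))

a$z  # 0 34 0 0 22 55 (original row order)
b$z  # 0 0 0 34 22 55 (rows grouped by x)
```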

Probably your biggest source of confusion is not understanding what ddply passes as arguments to the function it calls.

Consider your first attempt:

k <- function(x) {
  mydf$z <- ifelse(x == 1, 0, mydf$n)
  return (mydf)
}

The way ddply works is that it splits mydf up into several smaller data frames, based on the values in the column x. That means that each time ddply calls k, the argument passed to k is a data frame. Specifically, a subset of your primary data frame.
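
To see this for yourself, here is a small sketch, assuming plyr is installed. The helper describe_piece is a hypothetical name used only for illustration; it reports what ddply hands to the function for each group.

```r
library(plyr)

# Example data from the question
mydf <- data.frame(n  = c(12, 34, 9, 3, 22, 55),
                   x  = c(1, 2, 1, 1, 2, 2),
                   x1 = c(0, 1, 2, 1, 1, 2))

# Hypothetical helper: summarise the object ddply passes in for one group
describe_piece <- function(piece) {
  data.frame(is_df = is.data.frame(piece),
             rows  = nrow(piece),
             cols  = paste(names(piece), collapse = ","))
}

res <- ddply(mydf, .(x), describe_piece)
res
```

Each group should arrive as a 3-row data frame with the columns n, x and x1, while the global mydf has 6 rows. That mismatch is consistent with the "replacement has 3 rows, data has 6" error in the question.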

So within k, x is a subset of mydf, with all the columns. You should not be trying to modify mydf from within k. Modify x, and then return the modified version. (That is, if you must write your own function; the options I showed above are better.) So we might re-write your k like this:

k <- function(x) {
  x$z <- ifelse(x$x == 1, 0, x$n)
  return (x)
}
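
A minimal usage sketch for this version of k, assuming plyr is installed and using the example data from the question. The name out is only illustrative.

```r
library(plyr)

# Example data from the question
mydf <- data.frame(n  = c(12, 34, 9, 3, 22, 55),
                   x  = c(1, 2, 1, 1, 2, 2),
                   x1 = c(0, 1, 2, 1, 1, 2))

# k modifies and returns the piece it receives, not the global mydf
k <- function(x) {
  x$z <- ifelse(x$x == 1, 0, x$n)
  return(x)
}

out <- ddply(mydf, .(x), k)
out$z  # 0 0 0 34 22 55 (rows grouped by x)
```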

Note that you've made things confusing by using x both as the argument to k and as the name of one of the columns.
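
Renaming the argument makes the distinction easier to see. The sketch below uses the hypothetical argument name piece and function name j2. It also shows one possible way to use a plain if/else here: because ddply splits on x, every row in a piece shares the same x, so testing the first value gives the single TRUE/FALSE that if expects. (In R 4.2.0 and later, a condition longer than one element is an error rather than a warning.)

```r
library(plyr)

# Example data from the question
mydf <- data.frame(n  = c(12, 34, 9, 3, 22, 55),
                   x  = c(1, 2, 1, 1, 2, 2),
                   x1 = c(0, 1, 2, 1, 1, 2))

# `piece` is the per-group data frame; `piece$x` is its x column
j2 <- function(piece) {
  # All rows in a piece share one x value, so piece$x[1] is a single value
  if (piece$x[1] == 1) {
    piece$z <- 0
  } else {
    piece$z <- piece$n
  }
  piece
}

out <- ddply(mydf, .(x), j2)
out$z  # 0 0 0 34 22 55 (rows grouped by x)
```

This only works because the condition depends on the grouping variable alone. For a condition that varies row by row within a piece, ifelse remains the vectorised choice.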
