Doing a plyr operation on every row of a data frame in R

Question

I like the plyr syntax. Any time I have to use one of the *apply() commands I end up kicking the dog and going on a 3-day bender. So for the sake of my dog and my liver, what's a concise syntax for doing a ddply operation on every row of a data frame?

Here's an example that works well for a simple case:

library(plyr)

x <- rnorm(10)
y <- rnorm(10)
df <- data.frame(x, y)
# split on every column so that each row (usually) becomes its own group
ddply(df, names(df), function(df) max(df$x, df$y))

That works fine and gives me what I want. But if things get more complex, this causes plyr to get funky (and not like Bootsy Collins), because plyr is chewing on making "levels" out of all those floating point values:

x <- rnorm(1000)
y <- rnorm(1000)
z <- rnorm(1000)
myLetters <- sample(letters, 1000, replace = TRUE)
df <- data.frame(x, y, z, myLetters)
ddply(df, names(df), function(df) max(df$x, df$y))

On my box this chews for a few minutes and then returns:

Error: memory exhausted (limit reached?)
In addition: Warning messages:
1: In paste(rep(l, each = ll), rep(lvs, length(l)), sep = sep) :
  Reached total allocation of 1535Mb: see help(memory.size)
2: In paste(rep(l, each = ll), rep(lvs, length(l)), sep = sep) :
  Reached total allocation of 1535Mb: see help(memory.size)

I think I am totally abusing plyr and I am not saying this is a bug in plyr, but rather abusive behavior by me (liver and dog notwithstanding).

So in short, is there a syntax shortcut for using ddply to operate on every row as a substitute for apply(X, 1, ...)?

The workaround I've been using is to create a "key" that gives a unique value for every row, which I can then join back on:

x <- rnorm(1000)
y <- rnorm(1000)
z <- rnorm(1000)
myLetters <- sample(letters, 1000, replace = TRUE)
df <- data.frame(x, y, z, myLetters)
# make the key
df$myKey <- 1:nrow(df)
myOut <- merge(df, ddply(df, "myKey", function(df) max(df$x, df$y)))
# knock out the key
myOut$myKey <- NULL

But I keep thinking that "There Has to Be a Better Way"

Thanks!

Recommended answer

Just treat it like an array and work on each row:

adply(df, 1, transform, max = max(x, y))
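
For anyone who wants to try this end to end, here is a minimal sketch. It assumes plyr is installed; the seed, the smaller row count `n`, and the names `by_row` and `vectorized` are illustrative and not from the original post. It also cross-checks the result against base R's `pmax()`, which is a swapped-in vectorized alternative rather than part of the answer above.

```r
library(plyr)

set.seed(42)  # illustrative seed, only for reproducibility
n <- 20       # small illustrative size; the question used 1000 rows
df <- data.frame(
  x = rnorm(n),
  y = rnorm(n),
  z = rnorm(n),
  myLetters = sample(letters, n, replace = TRUE)
)

# Row-wise with plyr: margin 1 splits the data frame into one-row pieces,
# and transform() adds the new column to each piece.
by_row <- adply(df, 1, transform, max = max(x, y))

# Vectorized base R alternative (not from the answer): pmax() takes the
# element-wise maximum of its arguments, so no splitting is needed.
vectorized <- transform(df, max = pmax(x, y))

# Both approaches should agree on the new column.
all.equal(by_row$max, vectorized$max)
```

For a plain row-wise maximum the `pmax()` form is likely to be much faster on large data, since it avoids building one piece per row; `adply()` is more useful when the per-row function has no vectorized equivalent.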
