R Plyr - Ordering results from DDPLY?
Question
Does anyone know a slick way to order the results coming out of a ddply summarise operation?
This is what I'm doing to get the output ordered by descending depth.
ddims <- ddply(diamonds, .(color), summarise, depth = mean(depth), table = mean(table))
ddims <- ddims[order(-ddims$depth),]
With output:
> ddims
color depth table
7 J 61.88722 57.81239
6 I 61.84639 57.57728
5 H 61.83685 57.51781
4 G 61.75711 57.28863
1 D 61.69813 57.40459
3 F 61.69458 57.43354
2 E 61.66209 57.49120
Not too ugly, but I'm hoping for a way to do it nicely within ddply(). Anyone know how?
Hadley's ggplot2 book has this example for ddply and subset but it's not actually sorting the output, just selecting the two smallest diamonds per group.
ddply(diamonds, .(color), subset, order(carat) <= 2)
Recommended answer
I'll use this occasion to advertise a bit for data.table, which is faster to run and (in my perception) at least as elegant to write:
library(data.table)
ddims <- data.table(diamonds)
system.time(ddims <- ddims[, list(depth=mean(depth), table=mean(table)), by=color][order(depth)])
user system elapsed
0.003 0.000 0.004
By contrast, without ordering, your ddply code already takes 30 times longer:
user system elapsed
0.106 0.010 0.119
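Note that order(depth) in the chained call sorts ascending, while the question asks for descending depth. Negating the column inside order() should give the descending version. Here is a minimal sketch, assuming a small made-up table `toy` in place of diamonds (the values are hypothetical, chosen only so the result is easy to check by hand):

```r
library(data.table)

# Toy stand-in for diamonds (hypothetical values, for illustration only)
toy <- data.table(
  color = c("D", "D", "E", "E", "J", "J"),
  depth = c(61, 62, 60, 61, 62, 63),
  table = c(57, 58, 56, 57, 58, 59)
)

# Summarise per color, then chain a second [ ] to sort by descending mean depth
res <- toy[, list(depth = mean(depth), table = mean(table)), by = color][order(-depth)]
print(res)
```

With these toy values the group means are 61.5 (D), 60.5 (E) and 62.5 (J), so the rows should come out in the order J, D, E.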
With all the respect I have for Hadley's excellent work, e.g. on ggplot2, and general awesomeness, I must confess that for me, data.table entirely replaced ddply, for speed reasons.
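If you would rather stay within plyr, one option is to wrap the ddply() call in plyr's arrange() and use desc() for descending order. This does not sort inside ddply() itself, but it keeps everything in a single expression. A rough sketch, again assuming a made-up data frame `toy` rather than diamonds:

```r
library(plyr)

# Toy stand-in for diamonds (hypothetical values, for illustration only)
toy <- data.frame(
  color = c("D", "D", "E", "E", "J", "J"),
  depth = c(61, 62, 60, 61, 62, 63),
  table = c(57, 58, 56, 57, 58, 59)
)

# Summarise per color with ddply, then sort the summary by descending depth
res <- arrange(
  ddply(toy, .(color), summarise, depth = mean(depth), table = mean(table)),
  desc(depth)
)
print(res)
```

Unlike indexing with order(), arrange() returns fresh row names, so the output rows are numbered 1, 2, 3 rather than keeping their pre-sort numbers.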