Using R's plyr package to reorder groups within a dataframe
Problem description
I have a data reorganization task that I think could be handled by R's plyr package. I have a dataframe with numeric data organized in groups. Within each group I need to have the data sorted largest to smallest.
The data looks like this (the code to generate it is below):
group value
2 b 0.1408790
6 b 1.1450040 #2nd b is larger than 1st
1 c 5.7433568
3 c 2.2109819
4 d 0.5384659
5 d 4.5382979
What I want is this:
group value
b 1.1450040 #1st b is largest
b 0.1408790
c 5.7433568
c 2.2109819
d 4.5382979
d 0.5384659
So, what I need plyr to do is go through each group, apply something like order to the numeric data, reorganize by that order, save the reordered subset of data, and put it all back together at the end.
I can process this "by hand" with a list and some loops, but it takes a very long time. Can this be done with plyr in a couple of lines?
Example data
df.sz <- 6
groups <- c("a", "b", "c", "d")
df <- data.frame(group = sample(groups, df.sz, replace = TRUE),
                 value = runif(df.sz, 0, 10), stringsAsFactors = FALSE)
df <- df[order(df$group), ]  # order by group letter
Inefficient approach using loops
My current approach is to separate the dataframe df into a list by groups, apply order to each element of the list, and overwrite the original list element with the reordered element. I then use a loop to reassemble the dataframe. (As a learning exercise, I'd also be interested in how to make this code more efficient. In particular, what would be the most efficient way to turn a list into a dataframe using base R functions?)
Vector of the unique groups in the dataframe
groups.u <- unique(df$group)
Create a placeholder list with one element per group
my.list <- as.list(groups.u); names(my.list) <- groups.u
Break up df by $group into a list
for(i in 1:length(groups.u)){
i.working <- which(df$group == groups.u[i])
my.list[[i]] <- df[i.working, ]
}
Apply order to each list element
for(i in 1:length(my.list)){
order.x <- order(my.list[[i]]$value,na.last = TRUE, decreasing = TRUE)
my.list[[i]] <- my.list[[i]][order.x, ]
}
Finally, rebuild df from the list. First, make a seed row for the loop
new.df <- my.list[[1]][1, ]
new.df[1, ] <- NA
for(i in 1:length(my.list)){
new.df <- rbind(new.df,my.list[[i]])
}
Remove the seed row
new.df <- new.df[-1,]
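On the side question of turning a list back into a dataframe with base R: one common idiom is split() followed by lapply() and a single do.call(rbind, ...), which avoids growing new.df inside a loop. The following is only a sketch of that idea; the fixed example data (copied from the table above) and the name pieces are assumptions made for illustration, not part of the original code.

```r
# Fixed example data, copied from the table in the question so the result is reproducible.
df <- data.frame(
  group = c("b", "b", "c", "c", "d", "d"),
  value = c(0.1408790, 1.1450040, 5.7433568, 2.2109819, 0.5384659, 4.5382979),
  stringsAsFactors = FALSE
)

# split() builds a named list with one data frame per group (hypothetical name: pieces).
pieces <- split(df, df$group)

# Reorder each piece by value, largest first, keeping NAs last as in the loop version.
pieces <- lapply(pieces, function(d) {
  d[order(d$value, na.last = TRUE, decreasing = TRUE), ]
})

# Bind all pieces together in one call instead of calling rbind once per group.
new.df <- do.call(rbind, pieces)
rownames(new.df) <- NULL  # drop the "b.2", "b.1", ... row names that rbind creates
```

This replaces the three loops and the seed row; whether it is the fastest option for very large lists is not something this sketch tests.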
Recommended answer
You could use dplyr, the successor to plyr that focuses on data frames:
library(dplyr)
arrange(df, group, desc(value))
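If adding dplyr is not an option, the same two-key sort can probably be written in base R with order(), or in plyr itself with ddply(), which is what the question originally asked about. The sketch below assumes a small fixed copy of the question's data; the names sorted_base and sorted_plyr are made up for this example, and the plyr part only runs when that package is installed.

```r
# Fixed example data, copied from the table in the question.
df <- data.frame(
  group = c("b", "b", "c", "c", "d", "d"),
  value = c(0.1408790, 1.1450040, 5.7433568, 2.2109819, 0.5384659, 4.5382979),
  stringsAsFactors = FALSE
)

# Base R: sort by group ascending, then by value descending within each group.
# Negating value works here because it is numeric.
sorted_base <- df[order(df$group, -df$value), ]
rownames(sorted_base) <- NULL

# plyr: split df by group, reorder each piece, and bind the pieces back together.
# Guarded so the script still runs when plyr is not installed.
if (requireNamespace("plyr", quietly = TRUE)) {
  sorted_plyr <- plyr::ddply(df, "group", function(d) {
    d[order(d$value, decreasing = TRUE), ]
  })
}
```

All three spellings should give the rows in the order shown in the desired output above; arrange() and the order() call sort the whole data frame once, while ddply() follows the split-apply-combine steps described in the question.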