Using R's plyr package to reorder groups within a dataframe


Question

I have a data reorganization task that I think could be handled by R's plyr package. I have a dataframe with numeric data organized in groups. Within each group, I need the data sorted from largest to smallest.

The data looks like this (the code to generate it is below):

  group     value
2     b 0.1408790
6     b 1.1450040   #2nd b is larger than 1st
1     c 5.7433568
3     c 2.2109819
4     d 0.5384659
5     d 4.5382979

What I want is this:

group     value
b 1.1450040  #1st b is largest
b 0.1408790
c 5.7433568
c 2.2109819
d 4.5382979
d 0.5384659

So, what I need plyr to do is go through each group, apply something like order to the numeric data, reorganize by that order, save the reordered subset of data, and put it all back together at the end.

I can process this "by hand" with a list and some loops, but it takes a very long time. Can this be done with plyr in a couple of lines?

Example data

df.sz <- 6
groups <- c("a", "b", "c", "d")
df <- data.frame(group = sample(groups, df.sz, replace = TRUE),
                 value = runif(df.sz, 0, 10), stringsAsFactors = FALSE)
df <- df[order(df$group), ]  # order by group letter

Inefficient approach using loops

My current approach is to separate the dataframe df into a list by group, apply order to each element of the list, and overwrite the original list element with the reordered element. I then use a loop to reassemble the dataframe. (As a learning exercise, I'd also be interested in how to make this code more efficient. In particular, what would be the most efficient way, using base R functions, to turn a list into a dataframe?)

Vector of the unique groups in the dataframe

groups.u <- unique(df$group)

Create a list with one named slot per group

my.list <- as.list(groups.u); names(my.list) <- groups.u

Break up df by $group into the list

for(i in 1:length(groups.u)){
  i.working <- which(df$group == groups.u[i]) 
  my.list[[i]] <- df[i.working, ]
}

Apply order to each list element

for(i in 1:length(my.list)){
  order.x <- order(my.list[[i]]$value,na.last = TRUE, decreasing = TRUE)
  my.list[[i]] <- my.list[[i]][order.x, ] 
}

Finally, rebuild df from the list. First, make a seed row for the loop

new.df <- my.list[[1]][1, ]
new.df[1, ] <- NA
for(i in 1:length(my.list)){
  new.df <- rbind(new.df,my.list[[i]])
}

Remove the seed row

new.df <- new.df[-1,]
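
On the side question about base R: the loop-based steps above can probably be condensed into three calls. What follows is a minimal sketch, not a benchmarked solution. It assumes a small fixed data frame whose values are copied from the example output in the question, and the name pieces is introduced here purely for illustration.

```r
# Fixed data frame mirroring the question's example (values copied from it)
df <- data.frame(group = c("b", "b", "c", "c", "d", "d"),
                 value = c(0.1408790, 1.1450040, 5.7433568,
                           2.2109819, 0.5384659, 4.5382979),
                 stringsAsFactors = FALSE)

# split() breaks df into a named list of per-group data frames
# ("pieces" is a hypothetical name, not from the original post)
pieces <- split(df, df$group)

# Sort each piece by value, largest first
pieces <- lapply(pieces, function(d) {
  d[order(d$value, decreasing = TRUE, na.last = TRUE), ]
})

# do.call(rbind, ...) stacks the whole list in one call,
# so no seed row and no growing rbind loop are needed
new.df <- do.call(rbind, pieces)
rownames(new.df) <- NULL  # drop the "b.2"-style row names rbind creates
```

Calling rbind once on the full list avoids copying the growing data frame on every iteration, which is usually where the loop version loses time.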


Recommended answer

You could use dplyr, the successor to plyr that focuses on data frames:

library(dplyr)
arrange(df, group, desc(value))
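
For comparison, the arrange call above has a close base R counterpart that needs no package: a single order call with two keys. This is a hedged sketch rather than part of the original answer. It assumes the same small data frame as the question's example output, and the name sorted is hypothetical.

```r
# Fixed data frame mirroring the question's example (values copied from it)
df <- data.frame(group = c("b", "b", "c", "c", "d", "d"),
                 value = c(0.1408790, 1.1450040, 5.7433568,
                           2.2109819, 0.5384659, 4.5382979),
                 stringsAsFactors = FALSE)

# order() accepts several keys: group ascending first, then value descending.
# Negating value reverses its sort direction (this works for numeric columns only).
sorted <- df[order(df$group, -df$value), ]
rownames(sorted) <- NULL  # reset row names after reordering
```

Because the sort happens in one pass over the whole data frame, no splitting or recombining is involved.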
