Split data.frame into groups by column name

Problem description

I'm new to R. I have a data frame with column names of such type:

file_001   file_002   block_001   block_002   red_001   red_002   ... etc.
  0.05       0.2        0.4         0.006       0.05      0.3
  0.01       0.87       0.56        0.4         0.12      0.06

I want to split them into groups by the column name, to get a result like this:

group_file
file_001   file_002
  0.05       0.2
  0.01       0.87

group_block
block_001   block_002
  0.4        0.006
  0.56       0.4

group_red
red_001    red_002
  0.05       0.3
  0.12       0.06

... etc.

My file is huge, and the number of groups is not fixed in advance. The grouping should be determined only by the start of each column name.

Recommended answer

In base R, you can use sub and split.default like this to return a list of data.frames:

myDfList <- split.default(dat, sub("_\\d+", "", names(dat)))

This returns

myDfList
$block
  block_001 block_002
1      0.40     0.006
2      0.56     0.400

$file
  file_001 file_002
1     0.05     0.20
2     0.01     0.87

$red
  red_001 red_002
1    0.05    0.30
2    0.12    0.06
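
For anyone who wants to reproduce the output above, here is a minimal sketch. The data frame `dat` is rebuilt by hand from the two rows shown in the question, so it is an assumption about what the real (much larger) data looks like. The last step, which prefixes the list names with `group_`, is an optional extra to match the names used in the question and is not part of the original answer.

```r
# Rebuild the small example from the question (values copied from the post).
dat <- data.frame(
  file_001  = c(0.05, 0.01),
  file_002  = c(0.2, 0.87),
  block_001 = c(0.4, 0.56),
  block_002 = c(0.006, 0.4),
  red_001   = c(0.05, 0.12),
  red_002   = c(0.3, 0.06)
)

# Split the columns into a list of data.frames, one per name prefix.
myDfList <- split.default(dat, sub("_\\d+", "", names(dat)))

# Optional: rename the list elements to group_file, group_block, ...
names(myDfList) <- paste0("group_", names(myDfList))

str(myDfList, max.level = 1)
```

Because the grouping key is computed from `names(dat)`, the number of groups does not have to be known in advance.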

split.default splits a data.frame by column (variable) according to its second argument. Here, we use sub with the regular expression "_\\d+" to remove the underscore and the digits following it, which returns the splitting values "block", "file", and "red".
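
To see what the grouping key looks like on its own, the `sub` call can be run on a plain character vector. The sketch below uses the column names from the question. The name `run_2_file_001` is a hypothetical example, not from the original post, added to show that the unanchored pattern removes the first underscore-plus-digits match, whereas anchoring with `$` strips only a trailing numeric suffix.

```r
col_names <- c("file_001", "file_002", "block_001",
               "block_002", "red_001", "red_002")

# Grouping key: each name with its "_<digits>" part removed.
prefixes <- sub("_\\d+", "", col_names)
prefixes
# "file" "file" "block" "block" "red" "red"

# Hypothetical name with digits in the middle: sub() replaces the FIRST match.
sub("_\\d+", "", "run_2_file_001")
# "run_file_001"

# Anchored variant: only a trailing "_<digits>" is removed.
sub("_\\d+$", "", "run_2_file_001")
# "run_2_file"
```

For names shaped like those in the question, both patterns give the same keys, so the anchored form only matters if digits can appear earlier in a name.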

As a side note, it is typically a good idea to keep these data.frames in a list and work with them through functions like lapply. See gregor's answer to this post for some motivating examples.
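
As a small illustration of working with the list through lapply, the sketch below computes per-row means within each group. The choice of `rowMeans` is only an example operation, and `dat` is again rebuilt from the values in the question.

```r
dat <- data.frame(
  file_001  = c(0.05, 0.01),
  file_002  = c(0.2, 0.87),
  block_001 = c(0.4, 0.56),
  block_002 = c(0.006, 0.4),
  red_001   = c(0.05, 0.12),
  red_002   = c(0.3, 0.06)
)
myDfList <- split.default(dat, sub("_\\d+", "", names(dat)))

# Apply the same function to every group without naming each one.
group_means <- lapply(myDfList, rowMeans)

# Number of columns that ended up in each group.
group_sizes <- vapply(myDfList, ncol, integer(1))

group_means
group_sizes
```

The same code works unchanged however many prefixes the data contains, which is the main reason to keep the pieces in a list instead of creating separate variables.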
