Split data.frame into groups by column name
Question
I'm new to R. I have a data frame with column names like these:
file_001 file_002 block_001 block_002 red_001 red_002 ... etc.
0.05 0.2 0.4 0.006 0.05 0.3
0.01 0.87 0.56 0.4 0.12 0.06
I want to split the columns into groups by column name, to get a result like this:
group_file
file_001 file_002
0.05 0.2
0.01 0.87
group_block
block_001 block_002
0.4 0.006
0.56 0.4
group_red
red_001 red_002
0.05 0.3
0.12 0.06
... etc.
My file is huge, and the number of groups is not fixed. The grouping needs to be based only on the start of the column name.
Recommended answer
In base R, you can use sub and split.default like this to return a list of data.frames:
myDfList <- split.default(dat, sub("_\\d+", "", names(dat)))
This returns:
myDfList
$block
block_001 block_002
1 0.40 0.006
2 0.56 0.400
$file
file_001 file_002
1 0.05 0.20
2 0.01 0.87
$red
red_001 red_002
1 0.05 0.30
2 0.12 0.06
split.default splits a data.frame by variable (that is, by column) according to its second argument. Here, we use sub and the regular expression "_\\d+" (the pattern _\d+ once R has processed the string escapes) to remove the underscore and the digits following it, which returns the splitting values "block", "file", and "red".
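For anyone who wants to try this end to end, here is a minimal reproducible sketch. The data.frame dat is rebuilt from the sample values in the question. The "$" anchor in the pattern and the optional "group_" prefix are additions for illustration and are not part of the original answer.

```r
# Rebuild the question's sample data (values copied from the post).
dat <- data.frame(
  file_001  = c(0.05, 0.01),
  file_002  = c(0.2, 0.87),
  block_001 = c(0.4, 0.56),
  block_002 = c(0.006, 0.4),
  red_001   = c(0.05, 0.12),
  red_002   = c(0.3, 0.06)
)

# Strip the trailing "_<digits>" from each column name to get the group key.
# Assumption: the "$" anchor is added here so that only a numeric suffix
# at the very end of the name is removed.
groups <- sub("_\\d+$", "", names(dat))

# split.default treats the data.frame as a list of columns and splits
# those columns by the grouping vector, giving one data.frame per group.
myDfList <- split.default(dat, groups)

# Optional: prefix the list names to match the "group_file" style
# used in the question.
names(myDfList) <- paste0("group_", names(myDfList))
```

Because the group keys are derived from the column names themselves, this works without knowing the number of groups in advance.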
As a side note, it is typically a good idea to keep these data.frames in a list and work with them through functions like lapply. See gregor's answer to this post for some motivating examples.
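As a rough illustration of working with the list through lapply, the sketch below computes per-group row means. The choice of rowMeans and the variable names groupRowMeans and groupDims are assumptions made for this example, and only a subset of the question's columns is used to keep it short.

```r
# Small sample based on the question's data.
dat <- data.frame(
  file_001  = c(0.05, 0.01),
  file_002  = c(0.2, 0.87),
  block_001 = c(0.4, 0.56),
  block_002 = c(0.006, 0.4)
)

# Split the columns into a named list of data.frames, as in the answer.
myDfList <- split.default(dat, sub("_\\d+", "", names(dat)))

# Apply the same operation to every group: here, the mean of each row
# within a group. The result is a list with one numeric vector per group.
groupRowMeans <- lapply(myDfList, rowMeans)

# sapply simplifies the result where it can: here, a matrix with one
# column per group holding the number of rows and columns.
groupDims <- sapply(myDfList, dim)
```

Keeping the groups in one list means the same code handles any number of groups, which matches the situation described in the question.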