在r中合并(与split相反)行对 [英] merge (opposite of split) pair of rows in r

查看:63
本文介绍了在r中合并(与split相反)行对的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有类似以下的专栏.每列有两对,每对都有后缀"a"和"b",例如col1a,col1b,colNa,colNb等,直到文件结尾(> 50000).

I have the column like the following. Each column has two pairs each with suffix "a" and "b" - for example col1a, col1b, colNa, colNb and so on till end of the file (> 50000).

mydataf <- data.frame (Ind = 1:5, col1a = sample (c(1:3), 5, replace = T), 
   col1b = sample (c(1:3), 5, replace = T),  colNa = sample (c(1:3), 5, replace = T),
   colNb = sample (c(1:3),5, replace = T),
     K_a = sample (c("A", "B"),5, replace = T),  
    K_b = sample (c("A", "B"),5, replace = T))

mydataf 
   Ind col1a col1b colNa colNb K_a K_b
1   1     1     1     2     3   B   A
2   2     1     3     2     2   B   B
3   3     2     1     1     1   B   B
4   4     3     1     1     3   A   B
5   5     1     1     3     2   B   A

除了第一列(Ind)外,我要折叠一对行以使数据框看起来像以下内容,同时删除后缀"a"和"b".合并的字符或数字也应从1开始优先于2,从A开始优先于B

Except First column (Ind), I want to collapse the pair of rows to make the dataframe look like the following, at the sametime the suffix "a" and "b" be removed. Also merged characters or number be ordered 1 first that 2, A first than B

   Ind col1   colN  K_
    1   11     23   AB   
    2   13     22   BB
    3   12     11   BB
    4   13     13   AB
    5   11     23   AB   

如果列名相似,则答案中的grep函数(可能)有问题.

The grep function (probably) in the answer has problem if the name of columns are similar.

mydataf <- data.frame (col_1_a = sample (c(1:3), 5, replace = T),
   col_1_b = sample (c(1:3), 5, replace = T),  col_1_Na = sample (c(1:3), 5, replace = T),
   col_1_Nb = sample (c(1:3),5, replace = T),
     K_a = sample (c("A", "B"),5, replace = T),
    K_b = sample (c("A", "B"),5, replace = T))
n <- names(mydataf)
nm <- c(unique(substr(n, 1, nchar(n)-1)))
df <- data.frame(sapply(nm, function(x){
                             idx <- grep(x, n)
                             cols <- mydataf[idx]
                             x <- apply(cols, 1,
                                       function(z) paste(sort(z), collapse = ""))
                             return(x)
                            }))
names(df) <- nm
df

 col_1_ col_1_N K_
1   2233      23 BB
2   2233      22 BB
3   1123      13 AB
4   1223      12 AB
5   2333      33 AB

推荐答案

mydataf
  Ind col1a col1b colNa colNb K_a K_b
1   1     2     1     1     1   A   A
2   2     1     2     1     3   B   A
3   3     1     2     3     2   A   A
4   4     1     2     3     1   A   B
5   5     1     2     2     1   A   A
n <- names(mydataf)
nm <- c("Ind", unique(substr(n, 1, nchar(n)-1)[-1]))
df <- data.frame(sapply(nm, function(x){
                             idx <- grep(paste0(x, "[ab]?$"), n)
                             cols <- mydataf[idx]
                             x <- apply(cols, 1, 
                                       function(z) paste(sort(z), collapse = ""))
                             return(x)
                            }))
names(df) <- nm
df
  Ind col1 colN K_
1   1   12   11 AA
2   2   12   13 AB
3   3   12   23 AA
4   4   12   13 AB
5   5   12   12 AA

这篇关于在r中合并(与split相反)行对的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆