Stack dataframe columns with two distinct suffixes into two columns, preferably using tidyverse


Problem description

Suppose I have a list of dataframes, mylist, and I want to apply the same operation to each dataframe.

Say each dataframe looks like this:

library(tidyverse)  # provides tibble(), pivot_longer(), str_remove(), map()

set.seed(1)
test.tbl <- tibble(
  case1_diff = rnorm(10,0),
  case1_avg = rnorm(10,0),
  case2_diff = rnorm(10,0),
  case2_avg = rnorm(10,0),
  case3_diff = rnorm(10,0),
  case3_avg = rnorm(10,0),
  case4_diff = rnorm(10,0),
  case4_avg = rnorm(10,0),
)
> head(test.tbl)
# A tibble: 6 x 8
  case1_diff case1_avg case2_diff case2_avg case3_diff case3_avg case4_diff case4_avg
       <dbl>     <dbl>      <dbl>     <dbl>      <dbl>     <dbl>      <dbl>     <dbl>
1     -0.626    1.51       0.919     1.36       -0.165     0.398     2.40       0.476
2      0.184    0.390      0.782    -0.103      -0.253    -0.612    -0.0392    -0.710
3     -0.836   -0.621      0.0746    0.388       0.697     0.341     0.690      0.611
4      1.60    -2.21      -1.99     -0.0538      0.557    -1.13      0.0280    -0.934
5      0.330    1.12       0.620    -1.38       -0.689     1.43     -0.743     -1.25 
6     -0.820   -0.0449    -0.0561   -0.415      -0.707     1.98      0.189      0.291

I wish to stack them into two columns, diff and avg, giving a 40 x 2 dataframe.

Normally, I would split it into two objects with select(ends_with("diff")) and select(ends_with("avg")), pivot each one, and then bind_rows.

However, since my original object is a list, I want to do it with map, like this:

mylist %>%
   map(*insertfunction1*) %>%
   map(*insertfunction2*) 

This means I need to do it without splitting the dataframe. I also need to make sure that each diff value stays paired with the avg value from the same case and row.

What I have tried so far is:

test.tbl %>%
  pivot_longer(cols=everything(),
               names_to = "metric") %>%
  mutate(metric = str_remove(metric,"[0-9]+")) %>%
  pivot_wider(id_cols=metric,
              values_from=value)

Recommended answer

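We don't need both pivot_longer and pivot_wider. It can be done within pivot_longer itself by specifying the names_to and names_sep arguments. The special ".value" entry in names_to tells pivot_longer to use that part of each column name (here the suffix after "_") as the name of an output column, so diff and avg each become their own column: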

library(dplyr)
library(tidyr)
test.tbl %>% 
     pivot_longer(cols = everything(), names_to = c('grp', '.value'),
            names_sep = "_") %>%
     select(-grp)

Output:

# A tibble: 40 x 2
#      diff    avg
#     <dbl>  <dbl>
# 1 -0.626   1.51 
# 2  0.919   1.36 
# 3 -0.165   0.398
# 4  2.40    0.476
# 5  0.184   0.390
# 6  0.782  -0.103
# 7 -0.253  -0.612
# 8 -0.0392 -0.710
# 9 -0.836  -0.621
#10  0.0746  0.388
# … with 30 more rows
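
Since the question starts from a list, the same call can be wrapped in a function and passed to map. The following is only a sketch under assumptions: make_tbl and stack_cases are hypothetical helper names, and mylist here is a small stand-in built for illustration, not the asker's data. It keeps grp until the end so the diff/avg pairing can be checked before the column is dropped.

library(dplyr)
library(tidyr)
library(purrr)

set.seed(1)

# Hypothetical stand-in for `mylist`: tibbles sharing the caseN_diff / caseN_avg layout
make_tbl <- function() {
  tibble(
    case1_diff = rnorm(10), case1_avg = rnorm(10),
    case2_diff = rnorm(10), case2_avg = rnorm(10)
  )
}
mylist <- list(a = make_tbl(), b = make_tbl())

# Stack every caseN_* pair into `diff` and `avg`, keeping the case label in `grp`
stack_cases <- function(df) {
  df %>%
    pivot_longer(cols = everything(),
                 names_to = c("grp", ".value"),
                 names_sep = "_")
}

# Apply the same reshaping to each dataframe in the list
stacked <- map(mylist, stack_cases)

# Pairing check: the case1 rows should reproduce the original case1 columns
stopifnot(
  isTRUE(all.equal(stacked$a$diff[stacked$a$grp == "case1"], mylist$a$case1_diff)),
  isTRUE(all.equal(stacked$a$avg[stacked$a$grp == "case1"], mylist$a$case1_avg))
)

# Drop the helper column once the pairing has been confirmed
result <- map(stacked, ~ select(.x, -grp))

Because both values of a row come from the same caseN prefix in the same source row, diff and avg stay aligned without splitting the dataframe into two objects.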
