Pivot_Longer to Create Multiple Combined Columns

Published: 2021/9/7 19:29:55 · Tags: r, tidyr


Question

I have seen some possible discussion of my problem elsewhere, but either it was not resolved or I could not tell whether the answer applied, so I'm creating a new question.

The following question in particular touches on this subject but is not resolved: "Gathering wide columns into multiple long columns using pivot_longer".

Take the following sample data. There is a unique identifier variable (id) and then 8 other variables. Those 8 fall into two sets, gpa and percent_a, and each set has a class, group, course, and dept value.

In my actual data I have about 20 different sets, all with the same structure and the same four descriptors in each set.

What I would like to do is something similar to pivot_longer. But instead of combining all of the columns into a single pair of key and value columns, each set in my data (each with its class, group, course, and dept values) would be gathered into its own key/value columns.

set.seed(101)
df <- data.frame(
  id = 1:10,
  class_gpa = rnorm(10, 0, 1),
  course_gpa = rnorm(10, 0, 1),
  group_gpa = rnorm(10, 0, 1),
  dept_gpa = rnorm(10, 0, 1),
  class_percent_a = rnorm(10, 0, 1),
  course_percent_a = rnorm(10, 0, 1),
  group_percent_a = rnorm(10, 0, 1),
  dept_percent_a = rnorm(10, 0, 1)
)

So in this example, let's say I gather all of the gpa values into two columns (gpa_type and gpa_value) and all of the percent_a values into two columns (percent_a_type and percent_a_value). I would then end up with only 5 columns:

id, gpa_type, gpa_value, percent_a_type, percent_a_value

Is there a way to do this, either with pivot_longer or another method? Thanks.

Recommended answer

Honestly, I would rather simply do:

library(tidyr)  # provides pivot_longer() and re-exports the %>% pipe
df %>% pivot_longer(-id, names_to = c("type", ".value"), names_pattern = "([^_]+)_(.*)")

And keep the data in a more practical format:

# A tibble: 40 x 4
      id type      gpa percent_a
   <int> <chr>   <dbl>     <dbl>
 1     1 class  -0.326     0.482
 2     1 course  0.526    -1.15 
 3     1 group  -0.164    -0.260
 4     1 dept    0.895     1.51 
 5     2 class   0.552     0.758
 6     2 course -0.795    -0.274
 7     2 group   0.709    -1.41 
 8     2 dept    0.279     1.62 
 9     3 class  -0.675    -2.32 
10     3 course  1.43      0.578
# … with 30 more rows
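
To make the names_pattern in the call above easier to follow, here is a minimal base R sketch of how the regular expression "([^_]+)_(.*)" splits each column name at its first underscore. The first captured group becomes the "type" value, and the second names the output column because it is matched to the ".value" entry in names_to. The cols vector and the parts data frame are illustrative names only, not part of the original answer.

```r
# Illustration only: a few column names from the sample data.
cols <- c("class_gpa", "course_gpa", "class_percent_a", "dept_percent_a")

# Same pattern as in the pivot_longer() call: everything before the first
# underscore is group 1, everything after it is group 2.
pattern <- "([^_]+)_(.*)"

parts <- data.frame(
  column    = cols,
  type      = sub(pattern, "\\1", cols),  # goes to the "type" column
  value_col = sub(pattern, "\\2", cols),  # becomes an output column (".value")
  stringsAsFactors = FALSE
)
print(parts)
```

Because the first group stops at the first underscore, a name such as class_percent_a splits into class and percent_a rather than class_percent and a.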


Why duplicate the "type" attribute for each "set"?

For your desired output:

# A tibble: 40 x 5
      id gpa_type gpa_value percent_a_type percent_a_value
   <int> <chr>        <dbl> <chr>                    <dbl>
 1     1 class       -0.326 class                    0.482
 2     1 course       0.526 course                  -1.15 
 3     1 group       -0.164 group                   -0.260
 4     1 dept         0.895 dept                     1.51 
 5     2 class        0.552 class                    0.758
 6     2 course      -0.795 course                  -0.274
 7     2 group        0.709 group                   -1.41 
 8     2 dept         0.279 dept                     1.62 
 9     3 class       -0.675 class                   -2.32 
10     3 course       1.43  course                   0.578
# … with 30 more rows

You can try:

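library(dplyr)
library(tidyr)
library(purrr)

# Go long, split each column name into descriptor ("var") and set ("type"),
# then split into one data frame per set.
lst_df <- df %>%
  gather(key, value, -id) %>%
  extract(key, into = c("var", "type"), "([^_]+)_(.*)") %>%
  split(.$type)

# Rename each set's columns to <set>_type / <set>_value and bind the sets side
# by side. id is dropped first and added back once, because the names given to
# duplicated id columns by bind_cols() depend on the dplyr version.
names(lst_df) %>%
  map_dfc(~ setNames(
    lst_df[[.x]] %>%
      select(-id, -type),
    paste0(.x, c("_type", "_value")))) %>%
  bind_cols(lst_df[[1]]["id"], .)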
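
As an alternative sketch (not part of the original answer), the five-column layout can also be derived from the pivot_longer result by copying the shared type column once per set. This assumes dplyr and tidyr are installed; the names long and out are illustrative. With about 20 sets the mutate() and select() lists would need one entry per set, so this is only meant to show the idea on the two-set sample data.

```r
library(dplyr)
library(tidyr)

# Sample data from the question.
set.seed(101)
df <- data.frame(
  id = 1:10,
  class_gpa = rnorm(10, 0, 1),
  course_gpa = rnorm(10, 0, 1),
  group_gpa = rnorm(10, 0, 1),
  dept_gpa = rnorm(10, 0, 1),
  class_percent_a = rnorm(10, 0, 1),
  course_percent_a = rnorm(10, 0, 1),
  group_percent_a = rnorm(10, 0, 1),
  dept_percent_a = rnorm(10, 0, 1)
)

# Step 1: the compact long format from the answer (id, type, gpa, percent_a).
long <- df %>%
  pivot_longer(-id, names_to = c("type", ".value"), names_pattern = "([^_]+)_(.*)")

# Step 2: duplicate "type" once per set and rename the value columns.
out <- long %>%
  mutate(gpa_type = type, percent_a_type = type) %>%
  select(id, gpa_type, gpa_value = gpa, percent_a_type, percent_a_value = percent_a)

print(out)
```

The two type columns are always identical in this layout, which is the point the answer makes about the compact format being more practical.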
