Preserve order of columns when going from wide to long format

Question

I'm trying to preserve the order of columns when I gather them from wide to long format. The problem I'm having is after I gather and summarize the order is lost. The number of columns is huge so I don't want to manually type the order.

Here is an example:

library(tidyr)
library(dplyr)

N <- 4
df <- data.frame(sample = c(1,1,2,2),
                 y1.1 = rnorm(N), y2.1 = rnorm(N), y10.1 = rnorm(N))
> df
  sample      y1.1      y2.1      y10.1
1      1  1.040938 0.8851727 -0.3617224
2      1  1.175879 1.0009824 -1.1352406
3      2 -1.501832 0.3446469 -1.8687008
4      2 -1.326817 0.4434628 -0.8795962

What I want is to preserve the order of the columns. After I do some manipulation, the order is lost. Seen here:

dfg <- df %>% 
  gather(key="key", value="value", -sample) %>%
  group_by(sample, key) %>%
  summarize(mean = mean(value))

> filter(dfg, sample == 1)
  sample   key       mean
   <dbl> <chr>      <dbl>
1      1  y1.1  0.2936335
2      1 y10.1  0.6170505
3      1  y2.1 -0.2250543

You can see how it puts y10.1 ahead of y2.1 which I don't want. What I want is to preserve that order, seen here:

dfg <- df %>% 
  gather(key="key", value="value", -sample)

> filter(dfg, sample == 1)
  sample   key       value
1      1  y1.1  0.60171521
2      1  y1.1 -0.01444823
3      1  y2.1  0.81566726
4      1  y2.1 -1.26577581
5      1 y10.1  0.41686388
6      1 y10.1  0.81723707

For some reason the group_by and summarize operations change the order. I'm not sure why. I tried the ungroup command but that doesn't do anything. As I said earlier, my actual data frame has many columns and I need to preserve the order. The reason to preserve order is so I can plot the data in the correct order.

Any ideas?

Recommended answer

You can convert the key column to a factor whose levels follow the original column order, so that the grouped summary orders the keys by level instead of alphabetically:

df %>% 
    gather(key="key", value="value", -sample) %>%
    mutate(key=factor(key, levels=names(df)[-1])) %>% # add this line to convert the key to a factor
    group_by(sample, key) %>%
    summarize(mean = mean(value)) %>%
    filter(sample == 1)

# A tibble: 3 x 3
# Groups:   sample [1]
#  sample    key       mean
#   <dbl> <fctr>      <dbl>
#1      1   y1.1  0.8310786
#2      1   y2.1 -1.2596933
#3      1  y10.1  0.8208812
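
Why this works can be seen in a minimal base R sketch, independent of the pipeline above. A grouped summary comes back ordered by its grouping keys. Character keys are compared character by character, which puts "y10.1" before "y2.1", whereas a factor is ordered by the position of its levels. The vector `cols` below is a stand-in (an assumption for illustration) for `names(df)[-1]` from the question's data frame.

```r
# Column names in their original wide-format order
# (hypothetical stand-in for names(df)[-1]).
cols <- c("y1.1", "y2.1", "y10.1")

# Character sorting compares strings character by character.
# method = "radix" sorts in the C locale, so the result does not
# depend on the locale of the machine running it.
sorted_chr <- sort(cols, method = "radix")
print(sorted_chr)
# -> "y1.1" "y10.1" "y2.1"

# A factor is sorted by the position of each level, not by its label,
# so declaring the levels in column order preserves that order.
key <- factor(cols, levels = cols)
sorted_fct <- as.character(sort(key))
print(sorted_fct)
# -> "y1.1" "y2.1" "y10.1"
```

Because the levels are taken from the column names themselves, nothing has to be typed by hand, however many columns the data frame has. ggplot2 also orders a discrete axis by factor levels, so the plot should follow the same order.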
