Preserve order of columns when going from wide to long format
Problem description
I'm trying to preserve the order of columns when I gather them from wide to long format. The problem is that after I gather and summarize, the order is lost. The number of columns is huge, so I don't want to type the order manually.
Here is an example:
library(tidyr)
library(dplyr)
N <- 4
df <- data.frame(sample = c(1, 1, 2, 2),
                 y1.1 = rnorm(N), y2.1 = rnorm(N), y10.1 = rnorm(N))
> df
sample y1.1 y2.1 y10.1
1 1 1.040938 0.8851727 -0.3617224
2 1 1.175879 1.0009824 -1.1352406
3 2 -1.501832 0.3446469 -1.8687008
4 2 -1.326817 0.4434628 -0.8795962
What I want is to preserve the order of the columns. After I do some manipulation, the order is lost, as seen here:
dfg <- df %>%
  gather(key = "key", value = "value", -sample) %>%
  group_by(sample, key) %>%
  summarize(mean = mean(value))
> filter(dfg, sample == 1)
sample key mean
<dbl> <chr> <dbl>
1 1 y1.1 0.2936335
2 1 y10.1 0.6170505
3 1 y2.1 -0.2250543
You can see that it puts y10.1 ahead of y2.1, which I don't want. What I want is to preserve the original order, as seen here:
dfg <- df %>%
gather(key="key", value="value", -sample)
> filter(dfg, sample == 1)
sample key value
1 1 y1.1 0.60171521
2 1 y1.1 -0.01444823
3 1 y2.1 0.81566726
4 1 y2.1 -1.26577581
5 1 y10.1 0.41686388
6 1 y10.1 0.81723707
For some reason the group_by and summarize operations change the order, and I'm not sure why. I tried the ungroup command, but it has no effect. As I said earlier, my actual data frame has many columns, and I need to preserve their order so that I can plot the data in the correct order.
Any ideas?
Recommended answer
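You can convert the key column to a factor whose levels reflect the order of the original column names. summarize returns its rows sorted by the grouping variables, and a character key sorts alphabetically, which is why y10.1 lands ahead of y2.1; a factor sorts by its levels instead: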
df %>%
  gather(key = "key", value = "value", -sample) %>%
  mutate(key = factor(key, levels = names(df)[-1])) %>%  # add this line to convert the key to a factor
  group_by(sample, key) %>%
  summarize(mean = mean(value)) %>%
  filter(sample == 1)
# A tibble: 3 x 3
# Groups: sample [1]
# sample key mean
# <dbl> <fctr> <dbl>
#1 1 y1.1 0.8310786
#2 1 y2.1 -1.2596933
#3 1 y10.1 0.8208812
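For readers on a newer tidyr, the same idea can be sketched with pivot_longer, which superseded gather. This is only a sketch under a few assumptions that are not in the original post: tidyr 1.0.0 or later and dplyr 1.0.0 or later are installed, a fixed seed is set so the data is reproducible, and the helper name key_levels is made up for illustration. It takes the levels from the column names by excluding the id column by name rather than by position.

```r
library(tidyr)
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the example is reproducible
N <- 4
df <- data.frame(sample = c(1, 1, 2, 2),
                 y1.1 = rnorm(N), y2.1 = rnorm(N), y10.1 = rnorm(N))

# Hypothetical helper name: the value columns in their original order.
# setdiff() keeps the order of its first argument.
key_levels <- setdiff(names(df), "sample")

dfg <- df %>%
  pivot_longer(-sample, names_to = "key", values_to = "value") %>%
  # Turn the key into a factor so that sorting follows column order
  mutate(key = factor(key, levels = key_levels)) %>%
  group_by(sample, key) %>%
  summarize(mean = mean(value), .groups = "drop")

# Keys for sample 1, in the order they appear in the summary
as.character(dfg$key[dfg$sample == 1])
```

Because key is a factor, the summary rows within each sample should come out as y1.1, y2.1, y10.1, and ggplot2 should use the same level order on a discrete axis, which covers the plotting use case from the question.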