Re-arrange multiple columns in a data set into one column using R
Question
I would like to combine three columns of one of my data sets into a single column named "al_anim", remove any duplicates, sort the values (animal ids) from lowest to highest, and re-number the animals from 1 to N in a variable named "new_id".
anim1 <- c(1456,2569,5489,1456,4587)
anim2 <- c(6531,6987,6987,15487,6531)
anim3 <- c(4587,6548,7894,3215,8542)
mydf <- data.frame(anim1,anim2,anim3)
Any help would be greatly appreciated!
Baz
Recommended answer
Using mydf from your example:
mydf <- data.frame(anim1, anim2, anim3)
Stack the data:
sdf <- stack(mydf)
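For readers who have not used stack() before, here is a minimal sketch of what it returns for this data. It rebuilds mydf from the question so that it runs on its own; nothing here goes beyond base R.

```r
anim1 <- c(1456, 2569, 5489, 1456, 4587)
anim2 <- c(6531, 6987, 6987, 15487, 6531)
anim3 <- c(4587, 6548, 7894, 3215, 8542)
mydf <- data.frame(anim1, anim2, anim3)

# stack() returns a data frame with two columns:
#   values: the numbers from all columns, stacked column by column
#   ind:    a factor naming the source column of each value
sdf <- stack(mydf)

str(sdf)      # 15 rows: 3 columns of 5 values each
head(sdf, 3)  # the first rows all come from anim1
```

The ind column is what later lets you trace each id back to the column it came from.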
Then use unique():
uni <- unique(sdf[, "values"])
Then this gives each animal a new id:
new_id <- as.numeric(as.factor(sort(uni)))
This gives:
> new_id
 [1]  1  2  3  4  5  6  7  8  9 10 11
However, that is totally trivial; seq_along(uni) gets you there far more easily. So I wonder if you want
newdf <- data.frame(anim = sort(uni), new_id = seq_along(uni))
merge(sdf, newdf, by.x = "values", by.y = "anim")
giving:
> merge(sdf, newdf, by.x = "values", by.y = "anim")
   values   ind new_id
1    1456 anim1      1
2    1456 anim1      1
3    2569 anim1      2
4    3215 anim3      3
5    4587 anim1      4
6    4587 anim3      4
7    5489 anim1      5
8    6531 anim2      6
9    6531 anim2      6
10   6548 anim3      7
11   6987 anim2      8
12   6987 anim2      8
13   7894 anim3      9
14   8542 anim3     10
15  15487 anim2     11
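Note that merge() returns the rows sorted by the key. If the original row order of the stacked data matters, one possible alternative (a sketch, not part of the answer above) is to look up each value's position in the sorted unique ids with match(). The name ids is just illustrative.

```r
anim1 <- c(1456, 2569, 5489, 1456, 4587)
anim2 <- c(6531, 6987, 6987, 15487, 6531)
anim3 <- c(4587, 6548, 7894, 3215, 8542)
mydf <- data.frame(anim1, anim2, anim3)
sdf <- stack(mydf)

# Sorted unique ids; an id's position in this vector is its new id.
ids <- sort(unique(sdf$values))

# match() returns, for each stacked value, its position in ids,
# so the rows of sdf keep their original order.
sdf$new_id <- match(sdf$values, ids)

sdf
```

This attaches the same new_id values as the merge() call, just without reordering the rows.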
There is some ambiguity in your question, which could be resolved by giving an expected result/output.
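If, on the other hand, the expected output is simply a two-column lookup table with the names used in the question, the following sketch may be closer to it. This is a guess at the intent, since no expected output was given; the name result is illustrative.

```r
anim1 <- c(1456, 2569, 5489, 1456, 4587)
anim2 <- c(6531, 6987, 6987, 15487, 6531)
anim3 <- c(4587, 6548, 7894, 3215, 8542)
mydf <- data.frame(anim1, anim2, anim3)

# Flatten all three columns into one vector, drop duplicates,
# and sort from lowest to highest.
al_anim <- sort(unique(unlist(mydf, use.names = FALSE)))

# Number the animals 1..N in that sorted order.
result <- data.frame(al_anim = al_anim, new_id = seq_along(al_anim))

result
```

Here unlist() replaces stack(), because the source column of each id is not needed for a plain lookup table.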