Order a "mixed" vector (numbers with letters)
Question
How can I order
c("7","10a","10b","10c","8","9","11c","11b","11a","12") -> alph
into
alph
[1] "7","8","9","10a","10b","10c","11a","11b","11c","12"
and use it to sort a data.frame, like
V1 <- c("A","A","B","B","C","C","D","D","E","E")
V2 <- 2:1
V3 <- alph
df <- data.frame(V1,V2,V3)
and order the rows (by V2, then by V3) to obtain
V1 V2 V3
C 1 9
A 1 10a
B 1 10c
D 1 11b
E 1 12
A 2 7
C 2 8
B 2 10b
E 2 11a
D 2 11c
Recommended answer
> library(gtools)
> mixedsort(alph)
[1] "7" "8" "9" "10a" "10b" "10c" "11a" "11b" "11c" "12"
To sort a data.frame, use mixedorder instead:
> mydf <- data.frame(alph, USArrests[seq_along(alph),])
> mydf[mixedorder(mydf$alph),]
alph Murder Assault UrbanPop Rape
Alabama 7 13.2 236 58 21.2
California 8 9.0 276 91 40.6
Colorado 9 7.9 204 78 38.7
Alaska 10a 10.0 263 48 44.5
Arizona 10b 8.1 294 80 31.0
Arkansas 10c 8.8 190 50 19.5
Florida 11a 15.4 335 80 31.9
Delaware 11b 5.9 238 72 15.8
Connecticut 11c 3.3 110 77 11.1
Georgia 12 17.4 211 60 25.8
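For readers who cannot install gtools, the same ordering can be sketched in base R for strings of exactly this shape. This is an alternative technique, not how gtools implements mixedsort: it assumes every element is a run of digits optionally followed by letters, and the names num_part and let_part are made up for the illustration.

```r
alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")

# Assumption: each element is digits optionally followed by letters.
num_part <- as.numeric(sub("^([0-9]+).*$", "\\1", alph))  # leading number
let_part <- sub("^[0-9]+", "", alph)                      # trailing letters, "" if none

# Order by the numeric part first, then break ties with the letters
ord <- order(num_part, let_part)
alph[ord]
```

Like mixedorder, ord is an index vector, so it can also be used to reorder the rows of a data.frame.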
mixedorder on multiple vectors (columns)
Apparently mixedorder cannot handle multiple vectors. I wrote a function that works around this by converting every character vector to a factor whose levels are in mixedsort order, and then passing all vectors on to the standard order function.
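multi.mixedorder <- function(..., na.last = TRUE, decreasing = FALSE){
    do.call(order, c(
        # Replace each character vector with a factor whose levels are in
        # mixedsort order; leave all other vectors untouched
        lapply(list(...), function(l){
            if(is.character(l)){
                factor(l, levels=mixedsort(unique(l)))
            } else {
                l
            }
        }),
        # Forward the ordering options to order()
        list(na.last = na.last, decreasing = decreasing)
    ))
}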
However, in the particular example below, multi.mixedorder gives the same result as the standard order, since V2 is numeric and has no ties for V3 to break.
df <- data.frame(
V1 = c("A","A","B","B","C","C","D","D","E","E"),
V2 = 19:10,
V3 = alph,
stringsAsFactors = FALSE)
df[multi.mixedorder(df$V2, df$V3),]
V1 V2 V3
10 E 10 12
9 E 11 11a
8 D 12 11b
7 D 13 11c
6 C 14 9
5 C 15 8
4 B 16 10c
3 B 17 10b
2 A 18 10a
1 A 19 7
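In the example above V2 has no repeated values, so V3 never acts as a tie-breaker. As a sketch of the tied case, the snippet below rebuilds the data frame from the question and applies the same factor idea inline. It assumes gtools is installed; V2 is written as rep(2:1, 5), which is the explicit form of the recycled 2:1 in the question, and v3_key is a name made up for the illustration.

```r
library(gtools)  # provides mixedsort()

alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")
df <- data.frame(
  V1 = c("A","A","B","B","C","C","D","D","E","E"),
  V2 = rep(2:1, 5),   # explicit form of the recycled 2:1 from the question
  V3 = alph,
  stringsAsFactors = FALSE)

# Turn V3 into a factor whose levels are in mixed order, so that order()
# compares level positions instead of plain strings.
v3_key <- factor(df$V3, levels = mixedsort(unique(df$V3)))

# Sort by V2 first, then by the mixed order of V3
ord <- order(df$V2, v3_key)
df[ord, ]
```

Calling multi.mixedorder(df$V2, df$V3) should give the same index vector, since the function performs this conversion for every character argument.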
Notice that
- 19:10 is equivalent to c(19:10). c means concatenate (combine), that is, to make one long vector out of many short ones, but in your case you only have one vector (19:10), so there's no need to concatenate anything. However, in the case of V1 you have 10 vectors of length 1, so there you need to concatenate, as you already do.
- You need stringsAsFactors=FALSE so that V1 and V3 are not converted to (incorrectly sorted) factors, which was the default before R 4.0.0.
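Both remarks can be checked with a few lines of base R. This is only a small sketch; the variable names same, df_f and lv are made up for the illustration.

```r
# 19:10 is already a complete integer vector; wrapping it in c() changes nothing
same <- identical(19:10, c(19:10))

alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")

# With stringsAsFactors = TRUE the character column becomes a factor,
# and its levels are sorted as plain strings rather than in mixed order
df_f <- data.frame(V3 = alph, stringsAsFactors = TRUE)
lv <- levels(df_f$V3)

same
is.factor(df_f$V3)
lv
```

Because the levels are compared as strings, "10a" ends up ahead of "7", which is the incorrect ordering that the second remark warns about.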