Order a "mixed" vector (numbers with letters)
Question
How can I order a vector like
c("7","10a","10b","10c","8","9","11c","11b","11a","12") -> alph
into
alph
[1] "7","8","9","10a","10b","10c","11a","11b","11c","12"
and then use it to sort a data.frame, such as
V1 <- c("A","A","B","B","C","C","D","D","E","E")
V2 <- 2:1
V3 <- alph
df <- data.frame(V1,V2,V3)
and order the rows by V2 and then by V3 to obtain
V1 V2 V3
C 1 9
A 1 10a
B 1 10c
D 1 11b
E 1 12
A 2 7
C 2 8
B 2 10b
E 2 11a
D 2 11c
Recommended answer
> library(gtools)
> mixedsort(alph)
[1] "7" "8" "9" "10a" "10b" "10c" "11a" "11b" "11c" "12"
To sort a data.frame, use mixedorder instead:
> mydf <- data.frame(alph, USArrests[seq_along(alph),])
> mydf[mixedorder(mydf$alph),]
alph Murder Assault UrbanPop Rape
Alabama 7 13.2 236 58 21.2
California 8 9.0 276 91 40.6
Colorado 9 7.9 204 78 38.7
Alaska 10a 10.0 263 48 44.5
Arizona 10b 8.1 294 80 31.0
Arkansas 10c 8.8 190 50 19.5
Florida 11a 15.4 335 80 31.9
Delaware 11b 5.9 238 72 15.8
Connecticut 11c 3.3 110 77 11.1
Georgia 12 17.4 211 60 25.8
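For intuition about what a "mixed" ordering does, here is a minimal base R sketch. It is not how gtools implements mixedsort; it assumes every string is a run of digits optionally followed by letters, as in alph, and simply orders by the numeric part first and the letter suffix second. The names num, suffix and sorted are made up for this illustration.

```r
alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")

# Numeric prefix of each element, e.g. "10a" -> 10
num <- as.integer(sub("^([0-9]+).*$", "\\1", alph))

# Letter suffix of each element, e.g. "10a" -> "a", "7" -> ""
suffix <- sub("^[0-9]+", "", alph)

# Order by number first, then by suffix to break ties
sorted <- alph[order(num, suffix)]
print(sorted)
```

For anything beyond this simple pattern (several numeric runs, decimals, mixed case), mixedsort and mixedorder from gtools are the safer choice.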
mixedorder on multiple vectors (columns)
Apparently mixedorder cannot handle multiple vectors. I have made a function that circumvents this by converting all character vectors to factors whose levels are in mixedsort order, and then passing all vectors on to the standard order function.
multi.mixedorder <- function(..., na.last = TRUE, decreasing = FALSE){
    do.call(order, c(
        lapply(list(...), function(l){
            if(is.character(l)){
                factor(l, levels=mixedsort(unique(l)))
            } else {
                l
            }
        }),
        list(na.last = na.last, decreasing = decreasing)
    ))
}
However, in the example below, multi.mixedorder gives the same result as the standard order, because V2 is numeric and has no ties, so V3 never has to break one.
df <- data.frame(
V1 = c("A","A","B","B","C","C","D","D","E","E"),
V2 = 19:10,
V3 = alph,
stringsAsFactors = FALSE)
df[multi.mixedorder(df$V2, df$V3),]
V1 V2 V3
10 E 10 12
9 E 11 11a
8 D 12 11b
7 D 13 11c
6 C 14 9
5 C 15 8
4 B 16 10c
3 B 17 10b
2 A 18 10a
1 A 19 7
Notice that
- 19:10 is equivalent to c(19:10). c means concatenate, that is, it makes one long vector out of many short ones, but in your case you only have one vector (19:10), so there is nothing to concatenate. For V1, however, you have 10 vectors of length 1, so there you do need c, as you already do.
- You need stringsAsFactors=FALSE so that V1 and V3 are not converted to (incorrectly sorted) factors, which was the default before R 4.0.0.
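When V2 contains ties, as in the data frame from the question, V3 does have to break them, and plain character ordering would put "9" after "12". The sketch below applies the same factor trick as multi.mixedorder directly to that data frame. It assumes gtools is installed, writes V2 as rep(2:1, 5) to spell out the recycling of 2:1 explicitly, and uses the made-up names v3_key and sorted_df.

```r
library(gtools)

alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")
df <- data.frame(
  V1 = c("A","A","B","B","C","C","D","D","E","E"),
  V2 = rep(2:1, 5),
  V3 = alph,
  stringsAsFactors = FALSE)

# Factor whose levels are in mixed order, so order() compares the level codes
v3_key <- factor(df$V3, levels = mixedsort(unique(df$V3)))

# Sort by V2 first, then by V3 in mixed order
sorted_df <- df[order(df$V2, v3_key), ]
print(sorted_df)
```

Calling df[multi.mixedorder(df$V2, df$V3), ] should give the same rows, since the function performs this conversion for every character argument.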