How to partition when ranking on a particular column?
Question
All:
I have a data frame like the following. I know I can do a global rank order like this:

dt <- data.frame(
  ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  Value = c(4,3,1,3,4,6,6,1,8,4)
)
> dt
   ID Value
1  A1     4
2  A2     3
3  A4     1
4  A2     3
5  A1     4
6  A4     6
7  A3     6
8  A2     1
9  A1     8
10 A3     4

dt$Order <- rank(dt$Value, ties.method = "first")
> dt
   ID Value Order
1  A1     4     5
2  A2     3     3
3  A4     1     1
4  A2     3     4
5  A1     4     6
6  A4     6     8
7  A3     6     9
8  A2     1     2
9  A1     8    10
10 A3     4     7
But how can I compute the rank within each ID instead of a global rank order? In T-SQL, we can do this with the following syntax:
RANK() OVER ( [ < partition_by_clause > ] < order_by_clause > )
Any idea?
Solution
My way, but there's likely a better one. I'd never used rank; didn't even know about it. Thanks, it may be useful.
# Your data
dt <- data.frame(
  ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  Value = c(4,3,1,3,4,6,6,1,8,4)
)
dt$Order <- rank(dt$Value, ties.method = "first")

# My approach
dt$id <- 1:nrow(dt)  # needed for ordering and putting things back together
dt <- dt[order(dt$ID), ]
dt$Order.by.group <- unlist(with(dt, tapply(Value, ID, function(x) rank(x,
    ties.method = "first"))))
dt[order(dt$id), -4]  # restore the original row order and drop the helper id column
Yields:
   ID Value Order Order.by.group
1  A1     4     5              1
2  A2     3     3              2
3  A4     1     1              1
4  A2     3     4              3
5  A1     4     6              2
6  A4     6     8              2
7  A3     6     9              2
8  A2     1     2              1
9  A1     8    10              3
10 A3     4     7              1
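The same per-group ranking can also be computed without the sort-and-restore step by using base R's ave(), which applies a function within each group and returns the results in the original row order. This is an alternative to the tapply approach in the answer, not part of it; a minimal sketch, assuming the data frame from the question:

```r
# Data from the question
dt <- data.frame(
  ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  Value = c(4,3,1,3,4,6,6,1,8,4)
)

# ave() splits Value by ID, applies FUN to each piece, and writes the
# results back into the original row positions, so no helper id column
# or re-sorting is needed.
dt$Order.by.group <- ave(
  dt$Value, dt$ID,
  FUN = function(x) rank(x, ties.method = "first")
)
dt
```

Because the grouping variable is passed directly, this is probably the closest base R analogue to a PARTITION BY clause.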
EDIT:
If you don't care about preserving the original order of the data then this works with less code:
dt <- dt[order(dt$ID), ]
dt$Order.by.group <- unlist(with(dt, tapply(Value, ID, function(x) rank(x,
    ties.method = "first"))))

   ID Value Order.by.group
1  A1     4              1
5  A1     4              2
9  A1     8              3
2  A2     3              2
4  A2     3              3
8  A2     1              1
7  A3     6              2
10 A3     4              1
3  A4     1              1
6  A4     6              2
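The shorter version still depends on the sort by ID. unlist(tapply(...)) returns the ranks group by group, in the sorted order of the ID levels, so they only line up with the rows when the rows are in that same order. A small sketch of the difference, using the question's data; the helper rank_first and the variable names are only for illustration, and ave() is used here just as a per-row reference:

```r
dt <- data.frame(
  ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
  Value = c(4,3,1,3,4,6,6,1,8,4)
)
rank_first <- function(x) rank(x, ties.method = "first")

# Reference: per-group ranks aligned with the rows as they stand
correct <- ave(dt$Value, dt$ID, FUN = rank_first)

# Unsorted rows: the ranks come back grouped A1, A2, A3, A4,
# which does not match the row order of dt
by_group <- unlist(with(dt, tapply(Value, ID, rank_first)), use.names = FALSE)
aligned_unsorted <- all(by_group == correct)  # FALSE

# Sorted rows: group order and row order agree
dt_sorted <- dt[order(dt$ID), ]
by_group_sorted <- unlist(with(dt_sorted, tapply(Value, ID, rank_first)),
                          use.names = FALSE)
correct_sorted <- ave(dt_sorted$Value, dt_sorted$ID, FUN = rank_first)
aligned_sorted <- all(by_group_sorted == correct_sorted)  # TRUE
```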