How to partition when ranking on a particular column?


Question


All:

I have a data frame like the following. I know I can do a global rank order like this:

dt <- data.frame(
    ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
    Value = c(4,3,1,3,4,6,6,1,8,4)
);
> dt
   ID Value
1  A1     4
2  A2     3
3  A4     1
4  A2     3
5  A1     4
6  A4     6
7  A3     6
8  A2     1
9  A1     8
10 A3     4
dt$Order <- rank(dt$Value,ties.method= "first")
> dt
   ID Value Order
1  A1     4     5
2  A2     3     3
3  A4     1     1
4  A2     3     4
5  A1     4     6
6  A4     6     8
7  A3     6     9
8  A2     1     2
9  A1     8    10
10 A3     4     7

But how can I compute the rank order within each ID instead of a global rank order? In T-SQL, this can be done with the following syntax:

RANK() OVER ( [ < partition_by_clause > ] < order_by_clause > )

Any ideas?

Solution

This is my way, but there's likely a better one. I had never used rank and didn't even know about it. Thanks, it may be useful.

#Your Data
dt <- data.frame(
    ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
    Value = c(4,3,1,3,4,6,6,1,8,4)
)
dt$Order <- rank(dt$Value,ties.method= "first")

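#My approach
dt$id <- seq_len(nrow(dt)) #needed for ordering and putting things back together
dt <- dt[order(dt$ID),] #sort by ID so the rows line up with tapply's group order
dt$Order.by.group <- unlist(with(dt, tapply(Value, ID, function(x) rank(x, 
    ties.method = "first"))))
dt[order(dt$id), -4] #restore the original row order and drop the helper id column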

Yields:

   ID Value Order Order.by.group
1  A1     4     5              1
2  A2     3     3              2
3  A4     1     1              1
4  A2     3     4              3
5  A1     4     6              2
6  A4     6     8              2
7  A3     6     9              2
8  A2     1     2              1
9  A1     8    10              3
10 A3     4     7              1
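
As a possible shortcut, base R's ave() applies a function within each group and returns the result in the original row order, so the helper id column and the sort should not be needed. A minimal sketch, reusing the data from the question (this is an alternative to the approach above, not part of it):

```r
# Same data as in the question
dt <- data.frame(
    ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
    Value = c(4,3,1,3,4,6,6,1,8,4)
)

# Rank Value within each ID; ave() keeps the rows in their original order
dt$Order.by.group <- ave(dt$Value, dt$ID,
    FUN = function(x) rank(x, ties.method = "first"))

dt$Order.by.group
# 1 2 1 3 2 2 2 1 3 1
```

The result matches the Order.by.group column shown above.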

EDIT:

If you don't care about preserving the original order of the data then this works with less code:

dt <- dt[order(dt$ID),]
dt$Order.by.group <- unlist(with(dt, tapply(Value, ID, function(x) rank(x, 
   ties.method= "first"))))

   ID Value Order.by.group
1  A1     4              1
5  A1     4              2
9  A1     8              3
2  A2     3              2
4  A2     3              3
8  A2     1              1
7  A3     6              2
10 A3     4              1
3  A4     1              1
6  A4     6              2
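
One thing to keep in mind when comparing with T-SQL: RANK() gives tied values the same rank, whereas ties.method = "first" numbers ties in order of appearance (closer to ROW_NUMBER()). If the SQL behaviour is what is wanted, ties.method = "min" is the closer match. A small sketch under that assumption, using base R's ave() to rank within each ID; the column name Rank.sql is only illustrative:

```r
# Same data as in the question
dt <- data.frame(
    ID = c('A1','A2','A4','A2','A1','A4','A3','A2','A1','A3'),
    Value = c(4,3,1,3,4,6,6,1,8,4)
)

# ties.method = "min": tied values share the lowest rank, as SQL RANK() does
dt$Rank.sql <- ave(dt$Value, dt$ID,
    FUN = function(x) rank(x, ties.method = "min"))

dt$Rank.sql
# 1 2 1 2 1 2 2 1 3 1
```

Here the two A1 rows with Value 4 both get rank 1, and the two A2 rows with Value 3 both get rank 2.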
