Rank by two columns and keep ties

Views: 69 | Published: 2020/10/15 20:43:06 | Tags: r, data.table, dplyr, rank


Question

My question is related to a linked question.

I have a dataset such as this one:

 ID    |     Date 

  A        01/01/2015
  A        02/01/2015
  A        02/01/2015
  A        02/01/2015
  A        05/01/2015     
  B        01/01/2015

I want to rank each date by its distance from a reference date, 31/01/2015. The date closest to the reference date gets rank 1, the next closest gets rank 2, and so on. The result would look like:

  ID    |     Date           |  Sequence

  A        01/01/2015           3
  A        02/01/2015           2
  A        02/01/2015           2
  A        02/01/2015           2
  A        05/01/2015           1  
  B        01/01/2015          ...

While the rank function handles the ranking, I also want to keep all the ties, so that identical dates share the same rank. How do I do that?

Also, I am working with a huge dataset (approx. 300 million rows), so the solution should ideally be fast.

Recommended answer

We can use frank from data.table with ties.method = "dense", grouped by 'ID', applied to the absolute difference between 'Date' and the reference date ('2015-01-31').
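To see why the "dense" method matters here, the following minimal sketch compares it with base R's rank on the day distances for ID A. The vector dist is typed in by hand for illustration (30, 29, 29, 29 and 26 days before 31/01/2015), and the variable names are hypothetical.

```r
library(data.table)

# Hand-computed day distances from 31/01/2015 for ID A's five dates
# (01/01, 02/01, 02/01, 02/01, 05/01); illustrative values only.
dist <- c(30L, 29L, 29L, 29L, 26L)

# Base rank() keeps ties with "min", but leaves gaps after a tied block.
min_rank <- rank(dist, ties.method = "min")      # 5 2 2 2 1

# frank() with "dense" keeps ties and numbers the ranks consecutively.
dense <- frank(dist, ties.method = "dense")      # 3 2 2 2 1
```

Base rank has no "dense" option, which is why the answer reaches for frank.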

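library(data.table)
# df is the data.frame defined under "data" below.
# Rank within each ID by the absolute distance (in days) from the reference
# date; "dense" gives tied dates the same rank and leaves no gaps.
setDT(df)[, Sequence := frank(abs(as.IDate(Date, format = "%d/%m/%Y") -
              as.IDate("2015-01-31")), ties.method = "dense"), by = ID]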
df
#    ID       Date Sequence
#1:  A 01/01/2015        3
#2:  A 02/01/2015        2
#3:  A 02/01/2015        2
#4:  A 02/01/2015        2
#5:  A 05/01/2015        1
#6:  B 01/01/2015        1
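Since the question mentions roughly 300 million rows, one possible variation is to parse the dates once for the whole table rather than inside each group, and only do the ranking by group. This is a sketch under that assumption and has not been benchmarked; the helper column Dist and the variable ref are names introduced here for illustration.

```r
library(data.table)

df <- data.frame(
  ID   = c("A", "A", "A", "A", "A", "B"),
  Date = c("01/01/2015", "02/01/2015", "02/01/2015",
           "02/01/2015", "05/01/2015", "01/01/2015"),
  stringsAsFactors = FALSE
)
ref <- as.IDate("2015-01-31")  # reference date from the question

setDT(df)

# Parse every date in a single vectorised call and store the day distance.
df[, Dist := abs(as.integer(as.IDate(Date, format = "%d/%m/%Y") - ref))]

# Only the ranking itself is done per ID.
df[, Sequence := frank(Dist, ties.method = "dense"), by = ID]

# Drop the helper column once the ranks are in place.
df[, Dist := NULL]
```

Whether this is actually faster will depend on the number of groups and the data.table version, so it is worth timing on a sample first.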

data

df <- structure(list(ID = c("A", "A", "A", "A", "A", "B"), Date = c("01/01/2015", 
 "02/01/2015", "02/01/2015", "02/01/2015", "05/01/2015", "01/01/2015"
)), .Names = c("ID", "Date"), class = "data.frame", row.names = c(NA, 
-6L))
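The question is also tagged dplyr, so for comparison here is a rough dplyr equivalent using dense_rank. This is not part of the original answer, only a sketch of the same idea; it is likely slower than data.table on very large data, and the names res and ref are assumptions made for the example.

```r
library(dplyr)

df <- data.frame(
  ID   = c("A", "A", "A", "A", "A", "B"),
  Date = c("01/01/2015", "02/01/2015", "02/01/2015",
           "02/01/2015", "05/01/2015", "01/01/2015"),
  stringsAsFactors = FALSE
)
ref <- as.Date("2015-01-31")  # reference date from the question

res <- df %>%
  group_by(ID) %>%
  # dense_rank() gives tied values the same rank and leaves no gaps
  mutate(Sequence = dense_rank(
    abs(as.numeric(as.Date(Date, format = "%d/%m/%Y") - ref))
  )) %>%
  ungroup()
```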
