Selecting the earliest date from a given dataset in R

Problem description

I have a data set with many rows, a few of which are shown below. I need to keep only the earliest SORT_DT among them, with all the other variables unchanged.

        CUST_NO ID_NO SYMBOL  AUTO_CREATE_DT     CLASS_TYPE    SORT_DT
1         107   10120      1    2014-05-12             G/L  2015-01-09
2         107   10120      1    2014-05-12             G/L  2015-11-10
3         107   10120      1    2014-05-12             G/L  2014-06-18
4         107   10120      1    2014-05-12             G/L  2014-05-12
5         107   10120      1    2014-05-12             G/L  2015-07-10
6         107   10120      1    2014-05-12             G/L  2015-10-09
7         107   10120      1    2014-05-12             G/L  2016-04-08
8         107   10120      1    2014-05-12             G/L  2016-01-08
9         107   10120      1    2014-05-12             G/L  2016-12-22
10        107   10120      1    2014-05-12             G/L  2017-01-13
11        107   10120      1    2014-05-12             G/L  2016-07-08
12        107   10120      1    2014-05-12             G/L  2017-04-14
13        107   10120      1    2014-05-12             G/L  2017-04-17
14        107   10120      1    2014-05-12             G/L  2016-08-31
15        107   10120      1    2014-05-12             G/L  2015-04-10
16        107   10120      1    2014-05-12             G/L  2016-12-22

I need the output in this format:

  CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE    SORT_DT
1     107 10120      1     2014-05-12        G/L 2014-05-12

Please let me know if anyone has a solution for this.

I have also added a new data set:

library(data.table)  # provides fread()

df <- fread("CUST_NO ID_NO SYMBOL  AUTO_CREATE_DT     CLASS_TYPE    SORT_DT
         107   10120      1    2014-05-12             G/L  2015-01-09
        107   10120      1    2014-05-12             G/L  2015-11-10
        107   10120      1    2014-05-12             G/L  2014-06-18
        107   10120      1    2014-05-12             G/L  2014-05-13
        107   10120      1    2014-05-12             G/L  2015-07-10
        107   10120      1    2014-05-12             G/L  2015-10-09
        107   10120      1    2014-05-12             G/L  2016-04-08
        107   10120      1    2014-05-12             G/L  2016-01-08
        107   10120      1    2014-05-12             G/L  2016-12-22
        107   10120      1    2014-05-12             G/L  2017-01-13
        107   10120      1    2014-05-12             G/L  2016-07-08
        108   10120      1    2014-05-12             G/L  2017-04-14
        108   10120      1    2014-05-12             G/L  2017-04-17
        108   10120      1    2014-05-12             G/L  2016-08-31
        108   10120      1    2014-05-12             G/L  2015-04-10
        108   10120      1    2014-05-12             G/L  2016-12-22")

The output should look like this:

  CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE    SORT_DT
1     107 10120      1     2014-05-12        G/L 2014-05-13
2     108 10120      1     2014-05-12        G/L 2015-04-10

Recommended answer

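Try this. The formula SORT_DT ~ . groups the rows by every other column and takes the minimum SORT_DT within each group:

aggregate(SORT_DT ~ ., data = df, FUN = min)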

Output:

  CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE    SORT_DT
1     107 10120      1     2014-05-12        G/L 2014-05-13
2     108 10120      1     2014-05-12        G/L 2015-04-10
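
Because the formula above groups by all remaining columns, it returns one row per customer only when those columns are constant within each customer, as they are here. If they might differ, one possible alternative is to sort by date and keep the first row per CUST_NO. The following is a minimal base R sketch under assumptions that are not in the original post: the data frame is a shortened, hand-built copy of the question's data, SORT_DT is parsed with as.Date, and the names `ordered` and `earliest` are purely illustrative.

```r
# Shortened copy of the question's data (an assumption for illustration;
# the original post reads the full table with fread()).
df <- data.frame(
  CUST_NO        = c(107, 107, 107, 108, 108),
  ID_NO          = 10120,
  SYMBOL         = 1,
  AUTO_CREATE_DT = "2014-05-12",
  CLASS_TYPE     = "G/L",
  SORT_DT        = c("2015-01-09", "2014-05-13", "2014-06-18",
                     "2017-04-14", "2015-04-10"),
  stringsAsFactors = FALSE
)

# Parse the dates so that comparisons are chronological, not textual.
df$SORT_DT <- as.Date(df$SORT_DT)

# Sort by customer and date, then keep the first (earliest) row per customer.
ordered  <- df[order(df$CUST_NO, df$SORT_DT), ]
earliest <- ordered[!duplicated(ordered$CUST_NO), ]
rownames(earliest) <- NULL

print(earliest)
```

Unlike the aggregate() call, this keeps each selected row whole, so the other columns always come from the same row as the earliest date.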
