Extract dates in various formats from string in R


Problem description

I need to quickly extract dates from character vectors. I have two main issues:

  • Various date formats (European and American, alphanumeric and numeric...)
  • Multiple dates in each vector.

My vectors look like this:

c("11/09/2016 Invoice Number . Date P.O. # Amount Discount Paid Amount 2017/015 10/28/2016 CC6/ $50,000.00 $0.00 $50,000-00 2017/016 10/28/2016 CC67 $50,000.00 $0.00 $50,000-00 2017-017 10/28/2016 CC67 $50,000.00 . $0.00 $50,000.00 TOTALS: $150,000.00 $0.00 $150,000.00     ")

I have tried using parse_date and strptime without success. I do not know anything about regex syntax and do not really have time to dig into it.

Thank you very much for your help.

Recommended answer

If you need R dates, you will have to choose whether the American (month first) or the European (day first) reading takes priority when a date is ambiguous.
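To see why that choice matters, here is a minimal sketch in base R (not the lubridate functions used in the answer's own code) that reads the same string under both conventions. The variable names are only illustrative.

```r
# Base R only: the same string read under two conventions.
x <- "11/09/2016"
us <- as.Date(x, format = "%m/%d/%Y")  # month first (American)
eu <- as.Date(x, format = "%d/%m/%Y")  # day first (European)
format(us)  # "2016-11-09"
format(eu)  # "2016-09-11"

# A day value above 12 removes the ambiguity: only one reading parses.
y <- "10/28/2016"
as.Date(y, format = "%d/%m/%Y")  # NA, since there is no month 28
```

Both readings of the first string are valid dates, so nothing in the string itself tells you which one was meant.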

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date


v1 <-  c("11/09/2016 Invoice Number . Date P.O. # Amount Discount Paid Amount 2017/015 10/28/2016 CC6/ $50,000.00 $0.00 $50,000-00 2017/016 10/28/2016 CC67 $50,000.00 $0.00 $50,000-00 2017-017 10/28/2016 CC67 $50,000.00 . $0.00 $50,000.00 TOTALS: $150,000.00 $0.00 $150,000.00")

str_extract_all(v1, "\\d{2}/\\d{2}/\\d{4}")[[1]] %>% 
  tibble(value = .) %>% 
  mutate(american_date = value %>% mdy,
         european_date = value %>% dmy,
         stronger_american = coalesce(american_date,european_date),
         stronger_european = coalesce(european_date,american_date))
#> Warning: 3 failed to parse.
#> # A tibble: 4 x 5
#>   value      american_date european_date stronger_american stronger_european
#>   <chr>      <date>        <date>        <date>            <date>           
#> 1 11/09/2016 2016-11-09    2016-09-11    2016-11-09        2016-09-11       
#> 2 10/28/2016 2016-10-28    NA            2016-10-28        2016-10-28       
#> 3 10/28/2016 2016-10-28    NA            2016-10-28        2016-10-28       
#> 4 10/28/2016 2016-10-28    NA            2016-10-28        2016-10-28

Created on 2020-01-06 by the reprex package (v0.3.0)
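
If you would rather avoid the tidyverse dependency, the same idea can be sketched in base R. This is only an illustration under stated assumptions: `extract_dates` and its `prefer` argument are hypothetical names, not part of any package, and the pattern covers only the numeric dd/dd/dddd shape handled above (alphanumeric dates would need additional patterns).

```r
# Hypothetical helper: extract dd/dd/dddd tokens and parse them with a
# preferred convention, falling back to the other one when parsing fails.
extract_dates <- function(x, prefer = c("american", "european")) {
  prefer <- match.arg(prefer)
  # Pull every token shaped like 2 digits / 2 digits / 4 digits.
  tokens <- regmatches(x, gregexpr("[0-9]{2}/[0-9]{2}/[0-9]{4}", x))[[1]]
  fmts <- c(american = "%m/%d/%Y", european = "%d/%m/%Y")
  first  <- as.Date(tokens, format = fmts[[prefer]])
  second <- as.Date(tokens, format = fmts[[setdiff(names(fmts), prefer)]])
  # Keep the preferred reading; use the fallback only where it is NA.
  first[is.na(first)] <- second[is.na(first)]
  first
}

v <- "11/09/2016 Invoice 2017/015 10/28/2016 CC67 $50,000.00"
extract_dates(v, "american")  # "2016-11-09" "2016-10-28"
extract_dates(v, "european")  # "2016-09-11" "2016-10-28"
```

The fallback step plays the same role as `coalesce()` in the answer: an unambiguous date such as 10/28/2016 is parsed correctly whichever convention you prefer.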
