Converting string/chr to date using sparklyr

Question

I've brought a table into Hue which has a column of dates, and I'm trying to work with it using sparklyr in RStudio.

I'd like to convert a character column into a date column like so:

Weather_data = mutate(Weather_data, date2 = as.Date(date, "%m/%d/%Y"))

This runs fine, but when I check:

head(Weather_data) 

How do I properly convert the chr column to dates?

Thanks!!!!

Solution

The problem is that sparklyr doesn't correctly support Spark DateType. It is possible to parse dates and get the correct format, but not to represent the result as a proper DateType column. If that's enough, please follow the instructions below.

In Spark 2.2 or later, use to_date with a Java SimpleDateFormat-compatible string:

df <- copy_to(sc, data.frame(date=c("01/01/2010")))
parsed <- df %>% mutate(date_parsed = to_date(date, "MM/dd/yyyy"))
parsed

# Source:   lazy query [?? x 2]
# Database: spark_connection
        date date_parsed
       <chr>       <chr>
1 01/15/2010  2010-01-15

Interestingly, the internal Spark object still uses a DateType column:

parsed %>% spark_dataframe

<jobj[120]>
  class org.apache.spark.sql.Dataset
  [date: string, date_parsed: date]
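
If what you need is a proper Date column on the R side, one option is to convert after collecting. A minimal sketch in base R, assuming `collected` is a hypothetical local data frame standing in for the result of `collect(parsed)`, with `date_parsed` coming back as chr as in the output above:

```r
# Hypothetical stand-in for collect(parsed); no Spark connection is needed here.
collected <- data.frame(
  date = "01/15/2010",
  date_parsed = "2010-01-15",
  stringsAsFactors = FALSE
)

# The parsed strings are already in ISO format (yyyy-mm-dd),
# so as.Date() needs no explicit format argument.
collected$date_parsed <- as.Date(collected$date_parsed)

class(collected$date_parsed)  # "Date"
```

This only changes the local copy; the Spark table itself is untouched.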

For earlier versions, use unix_timestamp and cast (but watch for possible timezone problems):

df %>%
  mutate(date_parsed = sql(
    "CAST(CAST(unix_timestamp(date, 'MM/dd/yyyy') AS timestamp) AS date)"))

# Source:   lazy query [?? x 2]
# Database: spark_connection
        date date_parsed
       <chr>       <chr>
1 01/15/2010  2010-01-15
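
The timezone caveat is easy to reproduce locally. unix_timestamp yields an instant (seconds since the epoch), and which calendar date that instant falls on depends on the time zone used to read it. A base R sketch, with the instant and the two time zones picked purely for illustration:

```r
# Midnight UTC on 1 January 2010, as an instant in time.
ts <- as.POSIXct("2010-01-01 00:00:00", tz = "UTC")

# The same instant read as a calendar date in two different time zones.
utc_date <- as.Date(ts, tz = "UTC")
ny_date  <- as.Date(ts, tz = "America/New_York")

format(utc_date)  # "2010-01-01"
format(ny_date)   # "2009-12-31"
```

If the session or cluster time zone differs from the one you expect, a parsed date can shift by one day in the same way.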

Edit:

It looks like this problem has been resolved on the current master (sparklyr_0.7.0-9105), where the column is now reported as a date:

# Source:   lazy query [?? x 2]
# Database: spark_connection
        date date_parsed
       <chr>      <date>
1 01/01/2010  2009-12-31
