Converting string/chr to date using sparklyr
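Problem description
I've brought a table into Hue that has a column of dates, and I'm trying to work with it using sparklyr in RStudio.
I'd like to convert a character column into a date column like so:
Weather_data = mutate(Weather_data, date2 = as.Date(date, "%m/%d/%Y"))
This runs fine, but when I check:
head(Weather_data)
How do I properly convert the chr to dates?
Thanks!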
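The problem is that sparklyr doesn't correctly support Spark DateType. It is possible to parse dates and format them correctly, but not to represent them as proper DateType columns. If that's enough, follow the instructions below.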
In Spark 2.2 or later, use to_date with a Java SimpleDateFormat-compatible string:
df <- copy_to(sc, data.frame(date = c("01/01/2010")))
parsed <- df %>% mutate(date_parsed = to_date(date, "MM/dd/yyyy"))
parsed
# Source:   lazy query [?? x 2]
# Database: spark_connection
        date date_parsed
       <chr>       <chr>
1 01/15/2010  2010-01-15
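For comparison, here is a minimal local sketch of the same parse in base R, with no Spark connection involved. It is only meant to show how the format codes line up: the strptime-style "%m/%d/%Y" used in the question corresponds to the Java-style "MM/dd/yyyy" passed to to_date. The sample values and the variable names (dates_chr, parsed_local) are made up for illustration.

```r
# Local base R only; nothing here touches Spark or sparklyr.
# Sample values are hypothetical.
dates_chr <- c("01/15/2010", "12/31/2009")

# strptime-style codes: %m = month, %d = day, %Y = four-digit year.
# The Spark-side pattern for the same layout is "MM/dd/yyyy".
parsed_local <- as.Date(dates_chr, format = "%m/%d/%Y")

print(parsed_local)
print(class(parsed_local))
```

Locally the result is a real Date vector, which is what the question expected to see after the mutate call on the Spark table.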
Interestingly, the internal Spark object still uses DateType columns:
parsed %>% spark_dataframe()
<jobj[120]>
  class org.apache.spark.sql.Dataset
  [date: string, date_parsed: date]
For earlier versions, use unix_timestamp and cast (but watch for possible timezone problems):
df %>%
  mutate(date_parsed = sql(
    "CAST(CAST(unix_timestamp(date, 'MM/dd/yyyy') AS timestamp) AS date)"))
# Source:   lazy query [?? x 2]
# Database: spark_connection
        date date_parsed
       <chr>       <chr>
1 01/15/2010  2010-01-15
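The timezone caveat can be illustrated locally without Spark. The sketch below uses base R only and assumes a zone east of UTC (Europe/Berlin is a hypothetical choice, not something taken from the question). It shows how a timestamp at local midnight lands on the previous calendar day when the date is taken in UTC, which is one way a timestamp-to-date cast can end up off by one day.

```r
# Base R illustration of a timestamp-to-date shift; no Spark involved.
# Europe/Berlin is an assumed example zone (UTC+1 in January).
ts <- as.POSIXct("2010-01-01 00:00:00", tz = "Europe/Berlin")

# The same instant, converted to a calendar date in two different zones.
date_utc   <- as.Date(ts, tz = "UTC")            # instant is 23:00 UTC on the previous day
date_local <- as.Date(ts, tz = "Europe/Berlin")  # local midnight

print(date_utc)
print(date_local)
```

If dates produced on the Spark side look shifted by a day, comparing the Spark session timezone with the R session timezone is a reasonable first check.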
Edit: It looks like this problem has been resolved on current master (sparklyr_0.7.0-9105):
# Source:   lazy query [?? x 2]
# Database: spark_connection
        date date_parsed
       <chr>      <date>
1 01/01/2010  2009-12-31