How to change the column type from String to Date in DataFrames?


Question

I have a DataFrame in which two columns (C and D) are defined as string columns, but the data in them are actually dates. For example, column C holds dates such as "01-APR-2015" and column D holds dates such as "20150401". I want to change both to a date column type, but I haven't found a good way to do it. From searching Stack Overflow, what I need is to convert a string column to a Date column in a Spark SQL DataFrame. The date format can be "01-APR-2015". I looked at this post, but it had no information related to dates.

Recommended answer

Spark >= 2.2

You can use to_date:

import org.apache.spark.sql.functions.{to_date, to_timestamp}

df.select(to_date($"ts", "dd-MMM-yyyy").alias("date"))

and to_timestamp:

df.select(to_timestamp($"ts", "dd-MMM-yyyy").alias("timestamp"))

without an intermediate unix_timestamp call.
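Applied to the two columns from the question, this could look like the sketch below. The column names C and D and the sample values come from the question; the "yyyyMMdd" pattern for column D, the local SparkSession, and the name converted are assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date}

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("string-to-date")
  .getOrCreate()
import spark.implicits._

// Sample row mirroring the question: C uses "dd-MMM-yyyy", D uses "yyyyMMdd".
val df = Seq(("01-APR-2015", "20150401")).toDF("C", "D")

// Overwrite each string column with a DateType column (Spark >= 2.2).
val converted = df
  .withColumn("C", to_date(col("C"), "dd-MMM-yyyy"))
  .withColumn("D", to_date(col("D"), "yyyyMMdd"))

converted.printSchema()
```

Using withColumn with the existing column name replaces the column in place, so the rest of the schema stays untouched.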

Spark < 2.2

Since Spark 1.5 you can use the unix_timestamp function to parse the string to a long, cast it to timestamp, and truncate the result with to_date:

import org.apache.spark.sql.functions.{unix_timestamp, to_date}

val df = Seq((1L, "01-APR-2015")).toDF("id", "ts")

df.select(to_date(unix_timestamp(
  $"ts", "dd-MMM-yyyy"
).cast("timestamp")).alias("timestamp"))
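For the two columns in the question, the same unix_timestamp approach might be applied as in this sketch. The "yyyyMMdd" pattern for column D, the local SparkSession (available from Spark 2.0; older versions would need a SQLContext instead), and the helper name parseDate are assumptions, not part of the original answer.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, to_date, unix_timestamp}

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("string-to-date-legacy")
  .getOrCreate()
import spark.implicits._

// Sample row mirroring the question: C uses "dd-MMM-yyyy", D uses "yyyyMMdd".
val df = Seq(("01-APR-2015", "20150401")).toDF("C", "D")

// Hypothetical helper: parse to epoch seconds, cast to timestamp, truncate to a date.
def parseDate(c: Column, pattern: String): Column =
  to_date(unix_timestamp(c, pattern).cast("timestamp"))

val converted = df
  .withColumn("C", parseDate(col("C"), "dd-MMM-yyyy"))
  .withColumn("D", parseDate(col("D"), "yyyyMMdd"))

converted.printSchema()
```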

Note:

Depending on the Spark version, this may require some adjustments due to SPARK-11724:

Casting from integer types to timestamp treats the source int as being in millis. Casting from timestamp to integer types creates the result in seconds.

If you use an unpatched version, the unix_timestamp output requires multiplication by 1000.
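The arithmetic behind that factor of 1000 can be illustrated in plain Scala with java.time, independent of Spark. This is only a sketch of the seconds-versus-milliseconds mismatch, not Spark's own casting code; the value names are made up for the example, and the sample value assumes UTC.

```scala
import java.time.Instant

// Epoch seconds for 2015-04-01T00:00:00Z, the kind of value unix_timestamp yields.
val seconds = 1427846400L

// Correct reading: the number counts seconds since the epoch.
val asSeconds = Instant.ofEpochSecond(seconds)

// What an unpatched cast effectively does: read the same number as milliseconds.
val misreadAsMillis = Instant.ofEpochMilli(seconds)

// Multiplying by 1000 first makes the millisecond reading land on the right instant.
val scaledToMillis = Instant.ofEpochMilli(seconds * 1000L)

println(asSeconds)       // 2015-04-01T00:00:00Z
println(misreadAsMillis) // 1970-01-17T12:37:26.400Z
println(scaledToMillis)  // 2015-04-01T00:00:00Z
```

On a patched Spark version the multiplication should be left out, since it would push the result far into the future.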
