How to convert a unix timestamp column into a human-readable timestamp in PySpark?


Problem Description

I have a column containing unix-timestamp data, which Spark interprets as Long type, for example:

+---------------+
| my_timestamp  |
+---------------+
| 1584528257638 |
| 1586618807677 |
| 1585923477767 |
| 1583314882085 |
+---------------+

I'd like to convert it into a human-readable format, for example having something like:

+------------------------+
|      my_timestamp      |
+------------------------+
|2020-03-18 10:44:17.638 |
|2020-04-11 16:26:47.677 |
|2020-04-03 15:17:57.767 |
|2020-03-04 09:41:22.085 |
+------------------------+

How can I do that?

Recommended Answer

Since the timestamp column is in milliseconds, you only need to divide it by 1000 to get seconds and cast the result to TimestampType; that should do the trick:

from pyspark.sql.types import TimestampType
import pyspark.sql.functions as F

# Dividing by 1000 turns milliseconds into fractional seconds;
# casting that to TimestampType yields a timestamp that keeps
# the millisecond part.
df.select(
    (F.col("my_timestamp") / 1000).cast(TimestampType())
)
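
For completeness, here is a self-contained sketch of the same approach. Everything beyond df and my_timestamp is an assumption added for illustration: the SparkSession setup, the sample rows (copied from the question), and the Europe/London session timezone, which is what makes the rendered wall-clock values line up with the expected output above, since how a timestamp is displayed depends on spark.sql.session.timeZone.

from pyspark.sql import SparkSession
from pyspark.sql.types import TimestampType
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()
# Assumption: the question's expected output is consistent with the
# Europe/London timezone (GMT before 2020-03-29, BST after); pin it
# so show() reproduces those exact values.
spark.conf.set("spark.sql.session.timeZone", "Europe/London")

# Sample data copied from the question: milliseconds since the epoch.
df = spark.createDataFrame(
    [(1584528257638,), (1586618807677,), (1585923477767,), (1583314882085,)],
    ["my_timestamp"],
)

# Milliseconds -> fractional seconds -> TimestampType; the alias
# keeps the original column name in the result.
df.select(
    (F.col("my_timestamp") / 1000).cast(TimestampType()).alias("my_timestamp")
).show(truncate=False)
# +-----------------------+
# |my_timestamp           |
# +-----------------------+
# |2020-03-18 10:44:17.638|
# |2020-04-11 16:26:47.677|
# |2020-04-03 15:17:57.767|
# |2020-03-04 09:41:22.085|
# +-----------------------+

Casting through TimestampType preserves the millisecond fraction; F.from_unixtime, by contrast, formats whole seconds into a string and drops the sub-second part.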
