Timestamp changes format when writing to a CSV file in Spark


Problem description

I am trying to save a DataFrame that contains a timestamp column to a CSV file.

The problem is that this column's format changes once it is written to the CSV file. Here is the code I used:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, unix_timestamp}

    val spark = SparkSession.builder.master("local").appName("my-spark-app").getOrCreate()

    val df = spark.read.option("header", true).option("inferSchema", "true").csv("C:/Users/mhattabi/Desktop/dataTest2.csv")
    //val df = spark.read.option("header",true).option("inferSchema", "true").csv("C:\\dataSet.csv\\datasetTest.csv")
    // Cast every column except "time" to int so the aggregation functions apply.
    // DataFrames are immutable, so the result of each withColumn has to be kept.
    val numeric = df.columns.filter(_ != "time").foldLeft(df) { (acc, c) => acc.withColumn(c, col(c).cast("int")) }
    // Add a new timestamp column, rounded down to 5-minute (300 s) intervals
    val result2 = numeric.withColumn("new_time", ((unix_timestamp(col("time")) / 300).cast("long") * 300).cast("timestamp")).drop("time")
    // Average every remaining column within each 5-minute interval
    val finalresult = result2.groupBy("new_time").agg(result2.drop("new_time").columns.map(_ -> "mean").toMap).sort("new_time")
    finalresult.coalesce(1).write.option("header", true).csv("C:/mydata.csv")

When displayed via df.show, the timestamp appears in the correct format.

But in the CSV file it appears in a different format.

Recommended answer

Use the timestampFormat option to write the timestamp in the format you need. The dateFormat option only applies to date columns, while new_time is a timestamp column:

    finalresult.coalesce(1).write.option("header", true).option("timestampFormat", "yyyy-MM-dd HH:mm:ss").csv("C:/mydata.csv")

The same option works with a tab delimiter and an explicit escape character:

    finalresult.coalesce(1).write.format("csv").option("delimiter", "\t").option("header", true).option("timestampFormat", "yyyy-MM-dd HH:mm:ss").option("escape", "\\").save("C:/mydata.csv")
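
To make the effect of the pattern concrete, here is a minimal sketch in plain Scala with no Spark dependency. It floors a sample epoch second to a 5-minute boundary, as the new_time expression in the question does, and then renders the result with two patterns: an ISO 8601 style pattern of the kind Spark's CSV writer uses by default for timestamps, and the pattern from the answer. The object name, the helper names, the sample value, and the UTC offset are assumptions for illustration. Spark formats in the session time zone, and its exact default pattern differs between versions.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

object TimestampFormatSketch {
  // Floor epoch seconds to the start of their 5-minute (300 s) interval,
  // mirroring (unix_timestamp(col("time")) / 300).cast("long") * 300.
  def floorTo5Min(epochSeconds: Long): Long = (epochSeconds / 300) * 300

  // ISO 8601 style pattern (assumed here to resemble the writer's default)
  val isoStyle: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSXXX")
  // Pattern passed to the format option in the answer
  val plain: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Render epoch seconds at UTC with the given formatter
  def render(epochSeconds: Long, fmt: DateTimeFormatter): String =
    Instant.ofEpochSecond(epochSeconds).atOffset(ZoneOffset.UTC).format(fmt)

  def main(args: Array[String]): Unit = {
    val sample = 1486552367L          // hypothetical value: 2017-02-08 11:12:47 UTC
    val bucket = floorTo5Min(sample)  // 1486552200, i.e. 11:10:00 UTC
    println(render(bucket, isoStyle)) // 2017-02-08T11:10:00.000Z
    println(render(bucket, plain))    // 2017-02-08 11:10:00
  }
}
```

The first printed line has the shape that typically surprises people in the CSV output, and the second is what the explicit pattern yields.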
