How to force inferSchema for CSV to consider integers as dates (with "dateFormat" option)?
Question
I use Spark 2.2.0.

I am reading a CSV file as follows:
val dataFrame = spark.read.option("inferSchema", "true")
.option("header", true)
.option("dateFormat", "yyyyMMdd")
.csv(pathToCSVFile)
There is one date column in this file, and every record has the value 20171001 in that column.
The issue is that Spark infers the type of this column as integer rather than date. When I remove the "inferSchema" option, the column's type is string.
There are no null values, nor any wrongly formatted lines in this file.
What is the reason/solution for this issue?
Answer
If my understanding is correct, the code implies the following order of type inference (with the earlier types checked first):
- NullType
- IntegerType
- LongType
- DecimalType
- DoubleType
- TimestampType
- BooleanType
- StringType
With that, I think the issue is that 20171001 matches IntegerType before TimestampType is even considered (and TimestampType uses the timestampFormat option, not dateFormat).
One solution would be to define the schema and use it with the schema operator (of DataFrameReader), or to let Spark SQL infer the schema and use the cast operator.
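For the latter approach, a minimal sketch could look like the following (the column name date is an assumption here, as is the local SparkSession setup). Since to_date expects a string, the inferred integer column is cast first:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date}

val spark = SparkSession.builder().master("local[*]").getOrCreate()
val pathToCSVFile = "data.csv"  // placeholder path

// Let Spark infer the schema (the date column comes back as integer),
// then parse it into a proper date afterwards.
val raw = spark.read
  .option("inferSchema", "true")
  .option("header", "true")
  .csv(pathToCSVFile)

// Cast the integer to string, then parse with the yyyyMMdd pattern.
val withDate = raw.withColumn("date", to_date(col("date").cast("string"), "yyyyMMdd"))
```

The two-argument to_date(column, format) overload is available as of Spark 2.2.0, which matches the version used in the question.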
I'd choose the former if the number of fields is not high.
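A sketch of the former, an explicit schema (the field names id and date are assumptions; adjust them to the actual file). Declaring the column as DateType up front makes Spark apply the dateFormat option while parsing, instead of inferring an integer:

```scala
import org.apache.spark.sql.types.{StructType, StructField, DateType, StringType}

val schema = StructType(Seq(
  StructField("id", StringType, nullable = true),
  StructField("date", DateType, nullable = true)
))

val dataFrame = spark.read
  .schema(schema)          // skip inference entirely
  .option("header", "true")
  .option("dateFormat", "yyyyMMdd")
  .csv(pathToCSVFile)
```

With this, 20171001 is parsed directly into a date value and the column's type is date from the start.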