Spark 2.2 Illegal pattern component: XXX java.lang.IllegalArgumentException: Illegal pattern component: XXX
Problem description
I'm trying to upgrade from Spark 2.1 to 2.2. When I try to read a dataframe from, or write one to, a location (CSV or JSON), I get this error:
Illegal pattern component: XXX
java.lang.IllegalArgumentException: Illegal pattern component: XXX
at org.apache.commons.lang3.time.FastDatePrinter.parsePattern(FastDatePrinter.java:282)
at org.apache.commons.lang3.time.FastDatePrinter.init(FastDatePrinter.java:149)
at org.apache.commons.lang3.time.FastDatePrinter.<init>(FastDatePrinter.java:142)
at org.apache.commons.lang3.time.FastDateFormat.<init>(FastDateFormat.java:384)
at org.apache.commons.lang3.time.FastDateFormat.<init>(FastDateFormat.java:369)
at org.apache.commons.lang3.time.FastDateFormat$1.createInstance(FastDateFormat.java:91)
at org.apache.commons.lang3.time.FastDateFormat$1.createInstance(FastDateFormat.java:88)
at org.apache.commons.lang3.time.FormatCache.getInstance(FormatCache.java:82)
at org.apache.commons.lang3.time.FastDateFormat.getInstance(FastDateFormat.java:165)
at org.apache.spark.sql.catalyst.json.JSONOptions.<init>(JSONOptions.scala:81)
at org.apache.spark.sql.catalyst.json.JSONOptions.<init>(JSONOptions.scala:43)
at org.apache.spark.sql.execution.datasources.json.JsonFileFormat.inferSchema(JsonFileFormat.scala:53)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$7.apply(DataSource.scala:177)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$7.apply(DataSource.scala:177)
at scala.Option.orElse(Option.scala:289)
at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:176)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:366)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:333)
at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:279)
I am not setting a default value for dateFormat, so I don't understand where this pattern is coming from.
spark.createDataFrame(objects.map((o) => MyObject(t.source, t.table, o.partition, o.offset, d)))
.coalesce(1)
.write
.mode(SaveMode.Append)
.partitionBy("source", "table")
.json(path)
I still get this error:
import org.apache.spark.sql.{SaveMode, SparkSession}
// Assumed definition: Person was not shown in the original snippet; the fields follow the printed schema.
case class Person(name: String, age: Long)
val spark = SparkSession.builder.appName("Spark2.2Test").master("local").getOrCreate()
import spark.implicits._
val agesRows = List(Person("alice", 35), Person("bob", 10), Person("jill", 24))
val df = spark.createDataFrame(agesRows).toDF()
df.printSchema
df.show
df.write.mode(SaveMode.Overwrite).csv("my.csv")
Here is the schema:
root
 |-- name: string (nullable = true)
 |-- age: long (nullable = false)
Recommended answer
I found the answer.
The default for timestampFormat is yyyy-MM-dd'T'HH:mm:ss.SSSXXX, which FastDateFormat rejects here as an illegal pattern (the XXX component named in the exception). It needs to be set explicitly when you write the dataframe out.
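The stack trace shows the exception being raised inside FastDateFormat.getInstance, so one way to confirm the diagnosis without Spark is to hand the same pattern to FastDateFormat directly. This is a minimal sketch, assuming commons-lang3 is on the classpath; the object name PatternCheck and the helper accepts are hypothetical names introduced for illustration.

```scala
import org.apache.commons.lang3.time.FastDateFormat
import scala.util.Try

object PatternCheck {
  // Hypothetical helper: true when the FastDateFormat found on the
  // classpath can be constructed for the given pattern.
  def accepts(pattern: String): Boolean =
    Try(FastDateFormat.getInstance(pattern)).isSuccess

  def main(args: Array[String]): Unit = {
    // The default timestampFormat quoted in the answer.
    val sparkDefault = "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
    println(s"default pattern accepted: ${accepts(sparkDefault)}")
  }
}
```

If this prints false, the FastDateFormat in use cannot parse the XXX component, which matches the exception in the question. Any other candidate pattern can be passed to accepts in the same way before it is handed to Spark.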
The fix is to change XXX to ZZ, which still includes the timezone.
df.write
.option("timestampFormat", "yyyy/MM/dd HH:mm:ss ZZ")
.mode(SaveMode.Overwrite)
.csv("my.csv")
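The question mentions both reads and writes, and the stack trace itself comes from DataFrameReader.json, so the same option is probably needed on the read side as well. The following is a sketch under stated assumptions rather than a definitive fix: the object name, the roundTrip helper, the sample rows, and the temporary output path are all hypothetical, and the format string is the one used in the write example.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{SaveMode, SparkSession}

object TimestampFormatRoundTrip {
  // Same pattern as in the write example: ZZ instead of XXX.
  val timestampFormat = "yyyy/MM/dd HH:mm:ss ZZ"

  // Hypothetical helper: writes a tiny dataframe as JSON and reads it back,
  // passing timestampFormat explicitly in both directions.
  def roundTrip(spark: SparkSession, path: String): Long = {
    import spark.implicits._
    val df = Seq(("alice", 35L), ("bob", 10L)).toDF("name", "age")

    df.write
      .option("timestampFormat", timestampFormat)
      .mode(SaveMode.Overwrite)
      .json(path)

    spark.read
      .option("timestampFormat", timestampFormat)
      .json(path)
      .count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("Spark2.2Test").master("local").getOrCreate()
    // Temporary directory chosen only for this illustration.
    val path = Files.createTempDirectory("spark-ts").resolve("people").toString
    println(s"rows read back: ${roundTrip(spark, path)}")
    spark.stop()
  }
}
```

Setting the option on both the reader and the writer keeps Spark from falling back to the default pattern in either direction.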