Caused by: org.bson.BsonInvalidOperationException: Invalid state INITIAL


Problem description

There are several similar questions out there on the internet, but none of them has an answer.

I am using the following code to save Mongo data to Hive, but the exception shown at the end occurs. How can I work around this problem?

I am using:

  • spark-mongo-connector (Spark 2.1.0, Scala 2.11)

  • java-mongo-driver 3.10.2

import com.mongodb.spark.MongoSpark
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.StructType

object MongoConnector_Test {
  def main(args: Array[String]): Unit = {
    // Point the connector at collection db1.t1 and run locally on 4 cores.
    val conf = new SparkConf()
      .set("spark.mongodb.input.uri", "mongodb://user:pass@mongo1:123456/db1.t1")
      .setMaster("local[4]")
      .setAppName("MongoConnectorTest")
    val session = SparkSession.builder().config(conf).enableHiveSupport().getOrCreate()
    // Explicit schema: read every column as a string.
    val schema: StructType = new StructType().add("_id", "string").add("x", "string").add("y", "string").add("z", "string")
    val df = MongoSpark.read(session).schema(schema).load()
    // Write the result to a Hive table with a unique, timestamped name.
    df.write.saveAsTable("MongoConnector_Test" + System.currentTimeMillis())
  }
}

However, the following exception occurs:

Caused by: org.bson.BsonInvalidOperationException: Invalid state INITIAL
    at org.bson.json.StrictCharacterStreamJsonWriter.checkState(StrictCharacterStreamJsonWriter.java:395)
    at org.bson.json.StrictCharacterStreamJsonWriter.writeNull(StrictCharacterStreamJsonWriter.java:192)
    at org.bson.json.JsonNullConverter.convert(JsonNullConverter.java:24)
    at org.bson.json.JsonNullConverter.convert(JsonNullConverter.java:21)
    at org.bson.json.JsonWriter.doWriteNull(JsonWriter.java:206)
    at org.bson.AbstractBsonWriter.writeNull(AbstractBsonWriter.java:557)
    at org.bson.codecs.BsonNullCodec.encode(BsonNullCodec.java:38)
    at org.bson.codecs.BsonNullCodec.encode(BsonNullCodec.java:28)
    at org.bson.codecs.EncoderContext.encodeWithChildContext(EncoderContext.java:91)
    at org.bson.codecs.BsonValueCodec.encode(BsonValueCodec.java:62)
    at com.mongodb.spark.sql.BsonValueToJson$.apply(BsonValueToJson.scala:29)
    at com.mongodb.spark.sql.MapFunctions$.bsonValueToString(MapFunctions.scala:103)
    at com.mongodb.spark.sql.MapFunctions$.com$mongodb$spark$sql$MapFunctions$$convertToDataType(MapFunctions.scala:78)
    at com.mongodb.spark.sql.MapFunctions$$anonfun$3.apply(MapFunctions.scala:39)
    at com.mongodb.spark.sql.MapFunctions$$anonfun$3.apply(MapFunctions.scala:37)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at com.mongodb.spark.sql.MapFunctions$.documentToRow(MapFunctions.scala:37)
    at com.mongodb.spark.sql.MongoRelation$$anonfun$buildScan$2.apply(MongoRelation.scala:45)
    at com.mongodb.spark.sql.MongoRelation$$anonfun$buildScan$2.apply(MongoRelation.scala:45)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:243)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:190)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:188)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1341)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:193)
    ... 8 more

Recommended answer

In case you still have this issue: suppose you have the following two documents in the Mongo collection. They produce this error because the sub-fields of the otherDetails struct (isPrivate, isExternalUser) are stored as booleans in one document and as strings in the other; change both documents to use strings, or both to use booleans, and the read works. On top of that, a field may be present in some documents and absent in others (here, isInternal is missing from the second document).

{
    "_id" : ObjectId("5aa78d90d169ed325063b06d"),
    "Name" : "Kailash Test",
    "EmpId" : 1234567,
    "company" : "test.com",
    "otherDetails" : {
        "isPrivate" : false,
        "isInternal" : false,
        "isExternalUser" : true
    }
}
{
    "_id" : ObjectId("5aa78d90d169ed123456789d"),
    "Name" : "Kailash Test2",
    "EmpId" : 1234567,
    "company" : "test.com",
    "otherDetails" : {
        "isPrivate" : "false",
        "isExternalUser" : "true"
    }
}
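
One way to apply that fix is a one-off normalization pass over the collection before Spark reads it. Below is a minimal sketch using the java-mongo-driver the question already depends on; the database/collection (db1.t1) and field names are assumptions taken from the question and the sample documents above, and the URI uses the standard port in place of the question's placeholder, so adjust all of these to your deployment. It rewrites every boolean-typed otherDetails sub-field as a string.

import com.mongodb.client.MongoClients
import com.mongodb.client.model.{Filters, Updates}
import org.bson.Document

// One-off migration: store the otherDetails flags as strings everywhere,
// so that each field has a single BSON type across the collection.
object NormalizeOtherDetails {
  def main(args: Array[String]): Unit = {
    // Placeholder credentials/host mirroring the question's URI; adjust as needed.
    val client = MongoClients.create("mongodb://user:pass@mongo1:27017/db1")
    val coll = client.getDatabase("db1").getCollection("t1")

    val flagFields = Seq("otherDetails.isPrivate",
                         "otherDetails.isInternal",
                         "otherDetails.isExternalUser")

    for (field <- flagFields) {
      // Only touch documents where this field is currently stored as a boolean.
      val cursor = coll.find(Filters.`type`(field, "bool")).iterator()
      while (cursor.hasNext) {
        val doc = cursor.next()
        val key = field.stripPrefix("otherDetails.")
        val boolValue = doc.get("otherDetails").asInstanceOf[Document].getBoolean(key)
        coll.updateOne(Filters.eq("_id", doc.get("_id")),
                       Updates.set(field, String.valueOf(boolValue)))
      }
      cursor.close()
    }
    client.close()
  }
}

After the pass, each of these fields has exactly one BSON type across the whole collection, which is what the connector's schema conversion needs. Converting the string-typed documents to booleans instead would work equally well; what matters is picking one type per field.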

