How to resolve com.mongodb.spark.exceptions.MongoTypeConversionException: Cannot cast... Java Spark
Problem description
Hi, I am new to Java Spark and have been looking for a solution for a couple of days.
I am loading MongoDB data into a Hive table; however, saveAsTable fails with this error:
com.mongodb.spark.exceptions.MongoTypeConversionException: Cannot cast STRING into a StructType(StructField(oid,StringType,true)) (value: BsonString{value='54d3e8aeda556106feba7fa2'})
I've tried increasing the sampleSize and different mongo-spark-connector versions, but none of the solutions worked.
I can't figure out what the root cause is, or what gaps in between still need to be closed.
The most confusing part is that I have similar sets of data going through the same flow without issue.
The MongoDB data schema is a nested struct and array:
root
|-- sample: struct (nullable = true)
| |-- parent: struct (nullable = true)
| | |-- expanded: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- distance: integer (nullable = true)
| | | | |-- id: struct (nullable = true)
| | | | | |-- oid: string (nullable = true)
| | | | |-- keys: array (nullable = true)
| | | | | |-- element: string (containsNull = true)
| | | | |-- name: string (nullable = true)
| | | | |-- parent_id: array (nullable = true)
| | | | | |-- element: struct (containsNull = true)
| | | | | | |-- oid: string (nullable = true)
| | | | |-- type: string (nullable = true)
| | |-- id: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- oid: string (nullable = true)
Sample data
"sample": {
"expanded": [
{
"distance": 0,
"type": "domain",
"id": "54d3e17b5cf737074d4065b0",
"parent_id": [
"54d3e1775cf737074d406599"
],
"name": "level2"
},
{
"distance": 1,
"type": "domain",
"id": "54d3e1775cf737074d406599",
"name": "level1"
}
],
"id": [
"54d3e17b5cf737074d4065b0"
]
}
Sample code
public static void main(final String[] args) throws InterruptedException {
    // spark session read mongodb
    SparkSession mongo_spark = SparkSession.builder()
            .master("local")
            .appName("MongoSparkConnectorIntro")
            .config("mongo_spark.master", "local")
            .config("spark.mongodb.input.uri", "mongodb://localhost:27017/test_db.test_collection")
            .enableHiveSupport()
            .getOrCreate();

    // Create a JavaSparkContext using the SparkSession's SparkContext object
    JavaSparkContext jsc = new JavaSparkContext(mongo_spark.sparkContext());

    // Load data and infer schema, disregard toDF() name as it returns Dataset
    Dataset<Row> implicitDS = MongoSpark.load(jsc).toDF();
    implicitDS.printSchema();
    implicitDS.show();

    // createOrReplaceTempView to see if the data being read
    // implicitDS.createOrReplaceTempView("my_table");
    // implicitDS.printSchema();
    // implicitDS.show();

    // saveAsTable
    implicitDS.write().saveAsTable("my_table");
    mongo_spark.sql("SELECT * FROM my_table limit 1").show();

    mongo_spark.stop();
}
If anyone has some thoughts, I would very much appreciate it. Thanks.
Answer
After I increased the sample size appropriately, this problem no longer occurred.
How to configure the Java Spark SparkSession sample size
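For reference, a minimal sketch of what raising the sample size can look like when building the SparkSession, assuming the connector's spark.mongodb.input.sampleSize read option; the value 50000 and the class name SampleSizeExample are illustrative assumptions, not taken from the original answer:

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import com.mongodb.spark.MongoSpark;

public class SampleSizeExample {
    public static void main(final String[] args) {
        // Build the session with a larger sampleSize so schema inference scans
        // enough documents to settle on one consistent type per field.
        SparkSession mongo_spark = SparkSession.builder()
                .master("local")
                .appName("MongoSparkConnectorIntro")
                .config("spark.mongodb.input.uri", "mongodb://localhost:27017/test_db.test_collection")
                .config("spark.mongodb.input.sampleSize", 50000)   // assumed value; tune to the collection size
                .enableHiveSupport()
                .getOrCreate();

        JavaSparkContext jsc = new JavaSparkContext(mongo_spark.sparkContext());

        // Same flow as the question: infer the schema, then persist to Hive.
        Dataset<Row> implicitDS = MongoSpark.load(jsc).toDF();
        implicitDS.write().saveAsTable("my_table");

        mongo_spark.stop();
    }
}

The likely reason this helps is that the connector infers the schema from a sample of documents; if the sample misses the documents where a field (such as id) is stored as a different BSON type, the inferred StructType will not match every record and the cast fails at write time, so sampling more documents lets the inferred schema agree with the actual data.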