Apache Avro as a Built-in Data Source in Apache Spark 2.4


Question

I recently read this article and tried out the example by running:

val usersDF = spark.read.format("avro")
                        .load("examples/src/main/resources/users.avro")

But this gives me an error when I try to run it:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Failed to find data source: avro. Avro is built-in but external data source module since Spark 2.4. Please deploy the application as per the deployment section of "Apache Avro Data Source Guide".;
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:647)

Recommended answer

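After reading the Apache Avro Data Source Guide, I figured out that build.sbt needs a new dependency. As the error message says, the Avro data source has shipped as an external module (spark-avro) since Spark 2.4, so it is not on the classpath by default.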

// Use the same version as the other Spark dependencies in the build
val sparkVersion = "2.4.0"
libraryDependencies += "org.apache.spark" %% "spark-avro" % sparkVersion

After that, everything worked.
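
For reference, here is a minimal sketch of a complete program that performs the read from the question once spark-avro is on the classpath. The object name AvroReadExample and the local[*] master are assumptions for illustration only. The file path is the one from the question and assumes the program is run from the root of a Spark distribution.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name; any application entry point works the same way.
object AvroReadExample {
  def main(args: Array[String]): Unit = {
    // local[*] is an assumption for running on a single machine.
    val spark = SparkSession.builder()
      .appName("AvroReadExample")
      .master("local[*]")
      .getOrCreate()

    // The short name "avro" resolves only when the spark-avro module
    // is on the classpath; otherwise lookupDataSource throws the
    // AnalysisException shown in the question.
    val usersDF = spark.read
      .format("avro")
      .load("examples/src/main/resources/users.avro")

    usersDF.printSchema()
    usersDF.show()

    spark.stop()
  }
}
```

When the job is launched with spark-submit or spark-shell rather than built with sbt, the deployment section of the guide describes supplying the same module through the --packages option instead.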
