Apache Avro as a Built-in Data Source in Apache Spark 2.4
Problem description
I recently read this article and tried out its example:
val usersDF = spark.read.format("avro")
  .load("examples/src/main/resources/users.avro")
Running it fails with the following error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Failed to find data source: avro. Avro is built-in but external data source module since Spark 2.4. Please deploy the application as per the deployment section of "Apache Avro Data Source Guide".;
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:647)
Recommended answer
After reading the Apache Avro Data Source Guide, I realized that build.sbt needs to be updated with a new dependency:
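// spark-avro is a separate module since Spark 2.4; keep its version in sync with Spark
val sparkVersion = "2.4.0"
libraryDependencies += "org.apache.spark" %% "spark-avro" % sparkVersion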
After that, everything worked.
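For context, here is a sketch of what a complete build.sbt might look like with this fix in place. The project name and the Scala version are assumptions for illustration and do not come from the original question. The Scala version in particular should match the one your Spark distribution was built with.

```scala
// build.sbt: a minimal sketch, not the asker's actual build file
name := "avro-example"        // hypothetical project name
scalaVersion := "2.11.12"     // assumed; must match the Scala version of your Spark build

val sparkVersion = "2.4.0"

libraryDependencies ++= Seq(
  // Core Spark SQL API (SparkSession, DataFrame reader/writer)
  "org.apache.spark" %% "spark-sql"  % sparkVersion,
  // Avro data source, shipped as a separate module since Spark 2.4
  "org.apache.spark" %% "spark-avro" % sparkVersion
)
```

If the application is launched with spark-submit or spark-shell instead of being run from sbt, the deployment section of the guide describes supplying the same module through the --packages option.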