Apache Zeppelin 0.6.1: Run Spark 2.0 Twitter Stream App

Problem description

I have a cluster with Spark 2.0 and Zeppelin 0.6.1 installed. Since the class TwitterUtils.scala has been moved from the Spark project to Apache Bahir, I can no longer use TwitterUtils in my Zeppelin notebook.

Here are the snippets from my notebook:

Dependency loading:

%dep
z.reset
z.load("org.apache.bahir:spark-streaming-twitter_2.11:2.0.0")

DepInterpreter(%dep) deprecated. Remove dependencies and repositories through GUI interpreter menu instead.
DepInterpreter(%dep) deprecated. Load dependency through GUI interpreter menu instead.
res1: org.apache.zeppelin.dep.Dependency = org.apache.zeppelin.dep.Dependency@4793109a

And the Spark part:

import org.apache.spark.streaming.twitter
import org.apache.spark.streaming._
import org.apache.spark.storage.StorageLevel
import scala.io.Source
import scala.collection.mutable.HashMap
import java.io.File
import org.apache.log4j.Logger
import org.apache.log4j.Level
import sys.process.stringSeqToProcess
import org.apache.spark.SparkConf

// ********************************* Configures the Oauth Credentials for accessing Twitter ****************************
def configureTwitterCredentials(apiKey: String, apiSecret: String, accessToken: String, accessTokenSecret: String) {...}

// ***************************************** Configure Twitter credentials ********************************************
val apiKey = ...
val apiSecret = ...
val accessToken = ...
val accessTokenSecret = ...
configureTwitterCredentials(apiKey, apiSecret, accessToken, accessTokenSecret)

//  ************************************************* The logic itself *************************************************
val ssc = new StreamingContext(sc, Seconds(2))
val tweets = TwitterUtils.createStream(ssc, None)
val twt = tweets.window(Seconds(60))

When I try to run the Spark part of the notebook after importing the dependency, I get the following exception:

<console>:44: error: object twitter is not a member of package org.apache.spark.streaming
   import org.apache.spark.streaming.twitter

What am I doing wrong here? Bahir documentation also uses the import org.apache.spark.streaming.twitter._ command, see http://bahir.apache.org/docs/spark/2.0.0/spark-streaming-twitter/

Solution

Well, %dep is not exactly stable, and since it is deprecated anyway, why not use a supported method? If you don't want to modify either the Spark or the Zeppelin configuration files, you can add dependencies to the interpreter configuration (I omitted properties for clarity):
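
The illustration of the interpreter settings that accompanied this answer is not reproduced here. As a rough sketch only, assuming the Bahir artifact from the question (org.apache.bahir:spark-streaming-twitter_2.11:2.0.0) has been entered as a dependency of the Spark interpreter through the Interpreter menu and the interpreter has been restarted, the Spark paragraph might then look like this. Note the wildcard import, which matches the Bahir documentation cited in the question; a bare import of the package does not bring TwitterUtils into scope.

```scala
// Assumption: org.apache.bahir:spark-streaming-twitter_2.11:2.0.0 was added as an
// artifact in the Spark interpreter's dependency settings (GUI), not via %dep,
// and the interpreter was restarted afterwards.
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter._ // wildcard import makes TwitterUtils visible

// `sc` is the SparkContext that Zeppelin provides in Spark paragraphs.
val ssc = new StreamingContext(sc, Seconds(2))

// Passing None makes twitter4j read the OAuth credentials from the
// twitter4j.oauth.* system properties, which configureTwitterCredentials
// from the question is presumed to set.
val tweets = TwitterUtils.createStream(ssc, None)
val twt = tweets.window(Seconds(60))
```

If the import still fails after this, it is worth checking that the artifact's Scala version suffix (_2.11 here) matches the Scala version of the Spark build in use.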
