How to use TwitterUtils in Spark shell?


Problem description

I'm trying to use TwitterUtils in the Spark shell (where it is not available by default).

I've added the following to spark-env.sh:

SPARK_CLASSPATH="/disk.b/spark-master-2014-07-28/external/twitter/target/spark-streaming-twitter_2.10-1.1.0-SNAPSHOT.jar"

I can now execute

import org.apache.spark.streaming.twitter._
import org.apache.spark.streaming.StreamingContext._

without an error in the shell, which would not be possible without adding the JAR to the classpath ("error: object twitter is not a member of package org.apache.spark.streaming"). However, I get an error when executing this in the Spark shell:

scala> val ssc = new StreamingContext(sc, Seconds(1))
ssc: org.apache.spark.streaming.StreamingContext =
org.apache.spark.streaming.StreamingContext@6e78177b

scala> val tweets = TwitterUtils.createStream(ssc, "twitter.txt")
error: bad symbolic reference. A signature in TwitterUtils.class refers to
term twitter4j in package <root> which is not available.
It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling
TwitterUtils.class.

What am I missing? Do I have to import another jar?

Solution

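Yep, you need the Twitter4J JARs in addition to the spark-streaming-twitter one you already have. Specifically, the Spark devs suggest using Twitter4J version 3.0.3 (see http://apache-spark-user-list.1001560.n3.nabble.com/What-version-of-twitter4j-should-I-use-with-Spark-Streaming-td9352.html).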

After you download the correct JARs, you'll want to pass them to the shell via the --jars flag. I think you can also do this via SPARK_CLASSPATH as you've done.
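Since --jars expects a comma-separated list and a classpath variable such as SPARK_CLASSPATH is colon-separated, a small helper can build either form from a directory of JARs. This is only a sketch: join_jars is a hypothetical name, not part of Spark, and it assumes bash and JAR paths without spaces.

```shell
#!/bin/bash
# Hypothetical helper (not part of Spark): join every *.jar in a directory
# with a separator. Usage: join_jars DIR [SEP]   (SEP defaults to a comma)
join_jars() {
  local dir="$1" sep="${2:-,}" out="" jar
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue        # skip the literal glob when no JARs exist
    out="${out:+$out$sep}$jar"       # prepend the separator only after the first entry
  done
  printf '%s' "$out"
}

# Example use (paths are taken from the script in this answer):
#   spark-shell --jars "$(join_jars /root/spark/lib/twitter4j)"
#   SPARK_CLASSPATH="$(join_jars /root/spark/lib/twitter4j :)"
```

Building the list with a loop avoids parsing the output of ls, so no stray spaces or newlines end up in the argument.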

Here's how I did it on a Spark EC2 cluster:

#!/bin/bash
cd /root/spark/lib
mkdir -p twitter4j

# Get the Spark Streaming JAR.
curl -O "http://search.maven.org/remotecontent?filepath=org/apache/spark/spark-streaming-twitter_2.10/1.0.0/spark-streaming-twitter_2.10-1.0.0.jar"

# Get the Twitter4J JARs. Check out http://twitter4j.org/archive/ for other versions.
TWITTER4J_SOURCE=twitter4j-3.0.3.zip
curl -O "http://twitter4j.org/archive/$TWITTER4J_SOURCE"
unzip -j ./$TWITTER4J_SOURCE "lib/*.jar" -d twitter4j/
rm $TWITTER4J_SOURCE

cd
# Point the shell to these JARs and go!
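# ls -m separates entries with ", ", so strip spaces and newlines to get a plain comma-separated list.
TWITTER4J_JARS=$(ls -m /root/spark/lib/twitter4j/*.jar | tr -d ' \n')
/root/spark/bin/spark-shell --jars "/root/spark/lib/spark-streaming-twitter_2.10-1.0.0.jar,$TWITTER4J_JARS"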
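If a download fails, a missing JAR can surface in the shell as a confusing classpath error like the one in the question. One way to catch that earlier is to check the list before launching. The sketch below assumes bash; check_jars is a hypothetical helper, not a Spark command, and it expects the same comma-separated format that --jars takes.

```shell
#!/bin/bash
# Hypothetical helper (not part of Spark): verify that every entry of a
# comma-separated JAR list exists as a regular file.
# Prints each missing path to stderr and returns 1 if any entry is missing.
check_jars() {
  local list="$1" missing=0 jar
  local IFS=','                      # split the list on commas inside this function only
  for jar in $list; do
    if [ ! -f "$jar" ]; then
      echo "missing JAR: $jar" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use before starting the shell (variable name is illustrative):
#   check_jars "$ALL_JARS" && /root/spark/bin/spark-shell --jars "$ALL_JARS"
```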
