Unable to download data from Twitter through Flume


Problem description


bin/flume-ng agent -n TwitterAgent  --conf ./conf/ -f conf/flume-twitter.conf -Dflume.root.logger=DEBUG,console

When I run the above command, it generates the following error:

2016-05-06 13:33:31,357 (Twitter Stream consumer-1[Establishing connection]) [INFO - twitter4j.internal.logging.SLF4JLogger.info(SLF4JLogger.java:83)] 404:The URI requested is invalid or the resource requested, such as a user, does not exist. Unknown URL. See Twitter Streaming API documentation at http://dev.twitter.com/pages/streaming_api

This is my flume-twitter.conf file, located in the flume/conf folder:

TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = jtlmThaz307pCCQtlw9lvrrOq
TwitterAgent.sources.Twitter.consumerSecret = oaGCt6OaUas13Ji5NTnPN6TFjdSKtsAUQdq4ZhAq0BFn9jgHPU
TwitterAgent.sources.Twitter.accessToken = 921523328-xxY9nrWijDSVC77iK40eRNVmRIopvLXovpoxBnDs
TwitterAgent.sources.Twitter.accessTokenSecret = fbtuDENfBNxTooPD0EEgEo15Pg51cxNQa1CochI56gqSO
TwitterAgent.sources.Twitter.keywords = WT20,hadoop,election,sports, cricket,Big data,IPL2016,Panamaleaks,Pollingday

TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://HadoopMaster:9000/user/flume/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeformat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 600

TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100

Solution

Try replacing your flume-sources-1.x-SNAPSHOT.jar with the jar file downloaded from this link.

Twitter broke its old APIs a few days ago, so the old jar file will no longer work. You can download the modified jar from the link given above.

P.S. I am getting results with this method.
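
The jar replacement described in the answer could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name replace_jar, the download location, and the $FLUME_HOME/lib directory are hypothetical and do not come from the original answer, so point it at whichever directory your flume-sources-1.x-SNAPSHOT.jar actually lives in.

```shell
#!/bin/sh
# Sketch: swap the Twitter source jar used by Flume, keeping a backup.
# All paths are assumptions; adjust them to your installation.

# replace_jar NEW_JAR LIB_DIR
#   NEW_JAR: the jar downloaded from the link in the answer
#   LIB_DIR: the directory Flume loads flume-sources-1.x-SNAPSHOT.jar from
replace_jar() {
    new_jar=$1
    lib_dir=$2
    old_jar="$lib_dir/flume-sources-1.x-SNAPSHOT.jar"

    if [ ! -f "$new_jar" ]; then
        echo "new jar not found: $new_jar" >&2
        return 1
    fi

    # Keep the old jar under a name that does not end in .jar, so that a
    # classpath wildcard such as lib/* should not load it next to the new one.
    if [ -f "$old_jar" ]; then
        mv "$old_jar" "$old_jar.bak"
    fi

    cp "$new_jar" "$old_jar"
}

# Example call (hypothetical paths):
# replace_jar "$HOME/Downloads/flume-sources-1.x-SNAPSHOT.jar" "$FLUME_HOME/lib"
```

After the swap, restart the agent with the same bin/flume-ng command so that the JVM loads the new jar.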
