A Python script fails to import the kafka library while running inside Docker

Problem description

I have the following Python script that pulls tweets from Twitter and sends them to a Kafka topic. The script runs perfectly on its own, but when I try to run it inside a Docker container, it fails to import the kafka library with "SyntaxError: invalid syntax".

Here is the content of the Python script (twitter_app.py):

import socket
import sys
import requests
import requests_oauthlib
import json
import kafka
from kafka import KafkaProducer
import time
from kafka import SimpleProducer
from kafka import KafkaClient

###################################################
# My own twitter access tokens
####################################################
ACCESS_TOKEN = '28778811-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
ACCESS_SECRET = 'HBGjTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
CONSUMER_KEY = '#################################'
CONSUMER_SECRET = '############################################'

my_auth = requests_oauthlib.OAuth1(CONSUMER_KEY, CONSUMER_SECRET,ACCESS_TOKEN, ACCESS_SECRET)

####################################################
# Kafka Producer
####################################################
twitter_topic="twitter_topic"
client = KafkaClient("10.142.0.2:9092")
producer = SimpleProducer(client)
#producer = kafka.KafkaProducer(bootstrap_servers='10.128.0.2:9092')

def get_tweets():
    print("#########################get_tweets called################################")
    url = 'https://stream.twitter.com/1.1/statuses/filter.json'
    #query_data = [('language', 'en'), ('locations', '-130,-20,100,50'),('track' ,'#')]
    #query_data = [('language', 'en'), ('locations', '-3.7834,40.3735,-3.6233,40.4702'),('track','#')]
    query_data = [('language', 'en'), ('locations', '-3.7834,40.3735,-3.6233,40.4702'),('track','Madrid')]
    query_url = url + '?' + '&'.join([str(t[0]) + '=' + str(t[1]) for t in query_data])
    #print("Query url is", query_url)
    response = requests.get(query_url, auth=my_auth, stream=True)
    print(query_url, response)
    return response

def send_tweets_to_kafka(http_resp):
    print("########################send_tweets_to_kafka called#################################")
    for line in http_resp.iter_lines():
        print("reading tweets")
        try:
            full_tweet = json.loads(line)
            tweet_text = full_tweet['text']
            print("Tweet Text: " + tweet_text)
            print ("------------------------------------------")
            tweet_text = tweet_text + '\n'
            producer.send_messages(twitter_topic, tweet_text.encode())
            #producer.send(twitter_topic, tweet_text.encode())
            time.sleep(0.2)
        except:
            print("Error received")
            e = sys.exc_info()[0]
            print("Error: %s" % e)
    print("Done reading tweets")

##############
# Actual Execution starts here
###############
resp = get_tweets()
send_tweets_to_kafka(resp)

When I run this script inside a Docker container, it fails with the following error:

Traceback (most recent call last):
  File "twitter_app.py", line 6, in <module>
    import kafka
  File "/usr/local/lib/python3.7/site-packages/kafka/__init__.py", line 23, in <module>
    from kafka.producer import KafkaProducer
  File "/usr/local/lib/python3.7/site-packages/kafka/producer/__init__.py", line 4, in <module>
    from .simple import SimpleProducer
  File "/usr/local/lib/python3.7/site-packages/kafka/producer/simple.py", line 54
    return '<SimpleProducer batch=%s>' % self.async
                                                  ^
SyntaxError: invalid syntax

For reference, here are the contents of the Dockerfile (note that the same Dockerfile worked perfectly fine with a simple script that did not use kafka):

FROM python:3
MAINTAINER kamal.nandan@<myemailservice>

RUN apt-get update
RUN apt-get install -y python3
RUN pip install requests
RUN pip install requests_oauthlib
RUN pip install kafka

ADD twitter_app.py /
CMD python3 twitter_app.py

I have been fighting with this for the past few days, but I haven't been able to figure out the issue. Any help would be much appreciated. Thanks in advance.

Recommended answer

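Because of an incompatible change, async is a reserved keyword in Python 3.7 (the version shown in your traceback), so the attribute access self.async in kafka/producer/simple.py is now a syntax error.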
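The reserved-keyword behaviour can be checked without installing kafka at all. The following is a minimal sketch, not part of the original answer: it only compiles (never evaluates) the expression from line 54 of the traceback. The helper name parses and the replacement attribute name async_send are assumptions chosen for illustration.

```python
import keyword
import sys


def parses(source: str) -> bool:
    """Return True if `source` compiles as an expression on this interpreter."""
    try:
        # compile() only parses the text; `self` is never looked up.
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# The expression from kafka/producer/simple.py, line 54 in the traceback.
old_attribute = "'<SimpleProducer batch=%s>' % self.async"
# Hypothetical rename, only to show that a non-keyword attribute name parses.
renamed_attribute = "'<SimpleProducer batch=%s>' % self.async_send"

print("Python", sys.version_info[:2])
print("async is a keyword:", keyword.iskeyword("async"))
print("old expression parses:", parses(old_attribute))
print("renamed expression parses:", parses(renamed_attribute))
```

On Python 3.7 or later the old expression should be reported as not parsing, which is the same SyntaxError the container raises when it imports kafka.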

The solution is to keep using Python 3.6 until the library is adapted to the new version (there is an already closed issue about this). For example:

FROM python:3.6
MAINTAINER kamal.nandan@<myemailservice>

RUN pip install requests requests_oauthlib kafka

ADD twitter_app.py /
CMD python3 twitter_app.py

(I took the liberty of trimming the Dockerfile.)
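Because a floating tag such as python:3 can move to a newer interpreter on the next image pull, it may help to fail early with a readable message instead of a SyntaxError from deep inside the library. This is only a sketch under the assumption that the installed kafka package still contains self.async; the names LAST_COMPATIBLE, interpreter_is_compatible and require_compatible_python are hypothetical and not part of any library.

```python
import sys

# Assumption: the installed kafka package still uses `self.async`,
# which only parses on Python versions before 3.7.
LAST_COMPATIBLE = (3, 6)


def interpreter_is_compatible(version=None) -> bool:
    """Return True if the major.minor version is at or below LAST_COMPATIBLE."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor <= LAST_COMPATIBLE


def require_compatible_python() -> None:
    """Exit with a clear message when the interpreter is too new."""
    if not interpreter_is_compatible():
        sys.exit(
            "This script needs Python %d.%d or older for the installed kafka "
            "package; found %d.%d."
            % (LAST_COMPATIBLE + tuple(sys.version_info[:2]))
        )


# Report the result for the interpreter running this sketch.
print("compatible interpreter:", interpreter_is_compatible())
```

In twitter_app.py, require_compatible_python() would have to be called before the import kafka line, since the SyntaxError is raised at import time.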
