A Python script fails to import the kafka library while running inside Docker


Problem description

I have the following Python script that pulls tweets from Twitter and sends them to a Kafka topic. The script runs perfectly, but when I try to run it inside a Docker container, it fails to import the kafka library. It says "SyntaxError: invalid syntax".

Following is the content of the Python script (twitter_app.py):

import socket
import sys
import requests
import requests_oauthlib
import json
import kafka
from kafka import KafkaProducer
import time
from kafka import SimpleProducer
from kafka import KafkaClient

###################################################
# My own twitter access tokens
####################################################
ACCESS_TOKEN = '28778811-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
ACCESS_SECRET = 'HBGjTXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
CONSUMER_KEY = '#################################'
CONSUMER_SECRET = '############################################'

my_auth = requests_oauthlib.OAuth1(CONSUMER_KEY, CONSUMER_SECRET,ACCESS_TOKEN, ACCESS_SECRET)

####################################################
# Kafka Producer
####################################################
twitter_topic="twitter_topic"
client = KafkaClient("10.142.0.2:9092")
producer = SimpleProducer(client)
#producer = kafka.KafkaProducer(bootstrap_servers='10.128.0.2:9092')

def get_tweets():
    print("#########################get_tweets called################################")
    url = 'https://stream.twitter.com/1.1/statuses/filter.json'
    #query_data = [('language', 'en'), ('locations', '-130,-20,100,50'),('track' ,'#')]
    #query_data = [('language', 'en'), ('locations', '-3.7834,40.3735,-3.6233,40.4702'),('track','#')]
    query_data = [('language', 'en'), ('locations', '-3.7834,40.3735,-3.6233,40.4702'),('track','Madrid')]
    query_url = url + '?' + '&'.join([str(t[0]) + '=' + str(t[1]) for t in query_data])
    #print("Query url is", query_url)
    response = requests.get(query_url, auth=my_auth, stream=True)
    print(query_url, response)
    return response

def send_tweets_to_kafka(http_resp):
    print("########################send_tweets_to_kafka called#################################")
    for line in http_resp.iter_lines():
        print("reading tweets")
        try:
            full_tweet = json.loads(line)
            tweet_text = full_tweet['text']
            print("Tweet Text: " + tweet_text)
            print ("------------------------------------------")
            tweet_text = tweet_text + '\n'
            producer.send_messages(twitter_topic, tweet_text.encode())
            #producer.send(twitter_topic, tweet_text.encode())
            time.sleep(0.2)
        except:
            print("Error received")
            e = sys.exc_info()[0]
            print("Error: %s" % e)
    print("Done reading tweets")

##############
# Actual Execution starts here
###############
resp = get_tweets()
send_tweets_to_kafka(resp)

However, when I run this script inside a Docker container, it fails with the following error:

Traceback (most recent call last):
  File "twitter_app.py", line 6, in <module>
    import kafka
  File "/usr/local/lib/python3.7/site-packages/kafka/__init__.py", line 23, in <module>
    from kafka.producer import KafkaProducer
  File "/usr/local/lib/python3.7/site-packages/kafka/producer/__init__.py", line 4, in <module>
    from .simple import SimpleProducer
  File "/usr/local/lib/python3.7/site-packages/kafka/producer/simple.py", line 54
    return '<SimpleProducer batch=%s>' % self.async
                                                  ^
SyntaxError: invalid syntax

For reference, here are the contents of the Dockerfile. (Note that the same Dockerfile worked perfectly fine with a simple script that did not use kafka.)

FROM python:3
MAINTAINER kamal.nandan@<myemailservice>

RUN apt-get update
RUN apt-get install -y python3
RUN pip install requests
RUN pip install requests_oauthlib
RUN pip install kafka

ADD twitter_app.py /
CMD python3 twitter_app.py

I have been fighting with this for the past few days, but I haven't been able to figure out the issue. Any help would be much appreciated. Thanks in advance.

Recommended answer

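The error occurs on Python 3.7 and later because of an incompatible language change: async became a reserved keyword in that version, so the attribute access self.async in kafka/producer/simple.py no longer parses. The python:3 image tag resolved to Python 3.7 here, as the site-packages path in the traceback shows.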
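The keyword change can be reproduced without installing kafka at all. The following is a minimal sketch, not part of the original answer: the class body is a stand-in built around the single line quoted in the traceback, and the helper names is_reserved and compiles are hypothetical.

```python
import keyword
import sys

# Stand-in source built around the line quoted in the traceback
# (kafka/producer/simple.py, line 54). The surrounding class is illustrative.
SNIPPET = (
    "class SimpleProducer:\n"
    "    def __repr__(self):\n"
    "        return '<SimpleProducer batch=%s>' % self.async\n"
)


def is_reserved(name):
    """Report whether the running interpreter treats `name` as a keyword."""
    return keyword.iskeyword(name)


def compiles(source):
    """Return True if `source` parses on the running interpreter."""
    try:
        compile(source, "<sketch>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("Python version:", sys.version_info[:2])
    print("async is reserved:", is_reserved("async"))
    print("original line parses:", compiles(SNIPPET))
    # Renaming the attribute (the new name here is only an example) avoids the clash.
    print("renamed line parses:", compiles(SNIPPET.replace("self.async", "self.async_send")))
```

Running this under both python:3.6 and python:3.7 images should show the difference in behavior between the two interpreters.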

The solution is to keep using Python 3.6 until the library is adapted to the new version (there is an already closed issue about this). Pin the base image accordingly:

FROM python:3.6
MAINTAINER kamal.nandan@<myemailservice>

RUN pip install requests requests_oauthlib kafka

ADD twitter_app.py /
CMD python3 twitter_app.py

(I took the liberty of trimming the Dockerfile.)
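If the image tag might drift again, a guard at the top of the script can turn the confusing SyntaxError into an explicit message. This is only a sketch under the assumption that the installed kafka package still contains self.async; the names FIRST_INCOMPATIBLE, supports_legacy_kafka and require_legacy_interpreter are hypothetical and not part of any library.

```python
import sys

# First Python release in which `async` is a reserved keyword.
FIRST_INCOMPATIBLE = (3, 7)


def supports_legacy_kafka(version_info=None):
    """Return True if the interpreter can still parse `self.async`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) < FIRST_INCOMPATIBLE


def require_legacy_interpreter(version_info=None):
    """Raise a readable error before the legacy kafka import is attempted."""
    if not supports_legacy_kafka(version_info):
        raise RuntimeError(
            "This kafka release uses `async` as an attribute name and needs "
            "Python 3.6 or older; use a python:3.6 base image."
        )


if __name__ == "__main__":
    # In twitter_app.py, require_legacy_interpreter() would be called
    # before `import kafka`; here we only report the result.
    print("legacy kafka importable:", supports_legacy_kafka())
```

The check compares only the major and minor version, since the keyword change is tied to the 3.7 release rather than to any patch level.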
