PySpark Kafka Error: Missing application resource


Question

The error below is triggered when I add the following dependency to the code:

'--packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0,org.apache.spark:spark-streaming-kafka-0-8-assembly_2.11:2.1.1'

Here is the code:

from pyspark.sql import SparkSession, Row
from pyspark.context import SparkContext
from kafka import KafkaConsumer
import os

os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0,org.apache.spark:spark-streaming-kafka-0-8-assembly_2.11:2.1.1'


sc = SparkContext.getOrCreate()
spark = SparkSession(sc)

df = spark \
  .read \
  .format("kafka") \
  .option("kafka.bootstrap.servers", "localhost:9092") \
  .option("subscribe", "Jim_Topic") \
  .load()
df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

Here is the error:

Error: Missing application resource.

Usage: spark-submit [options] [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]

Recommended answer

You also need to provide the name of your Python file, because spark-submit expects an application resource after its options.

os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0,org.apache.spark:spark-streaming-kafka-0-8-assembly_2.11:2.1.1 your_python_file.py'
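
To make the shape of that value explicit, here is a minimal sketch that assembles it from a list of package coordinates plus a trailing application resource. The helper name `build_submit_args` is hypothetical and not part of PySpark. One assumption goes beyond the answer above: when the script is started with plain `python` rather than `spark-submit`, a commonly used trailing token is `pyspark-shell`, which is the value PySpark falls back to when `PYSPARK_SUBMIT_ARGS` is unset. If you follow the answer literally, pass your file name as `resource` instead.

```python
import os


def build_submit_args(packages, resource="pyspark-shell"):
    """Return a PYSPARK_SUBMIT_ARGS value: a --packages list followed by the resource.

    `resource` is the token spark-submit treats as the application resource.
    The default "pyspark-shell" is an assumption for scripts run with plain python.
    """
    if not packages:
        raise ValueError("at least one package coordinate is required")
    return "--packages {} {}".format(",".join(packages), resource)


# Coordinates copied from the question.
PACKAGES = [
    "org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0",
    "org.apache.spark:spark-streaming-kafka-0-8-assembly_2.11:2.1.1",
]

# Set this before the first SparkContext is created; it is read when the JVM is launched.
os.environ["PYSPARK_SUBMIT_ARGS"] = build_submit_args(PACKAGES)
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

The key point is ordering: the options come first and the resource comes last, separated by a space. Without that last token, spark-submit prints the usage message shown in the question.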

Alternatively, a nicer way would be:

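from pyspark import SparkConf, SparkContext
# spark.jars: comma-separated JAR paths added to the driver and executor classpaths
conf = SparkConf().set("spark.jars", "/path/to/your/jar")
sc = SparkContext(conf=conf)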
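
If more than one JAR is needed, `spark.jars` accepts a comma-separated list. The sketch below shows one way to build that setting and apply it. The function names `jar_conf_pairs` and `make_context` and the example paths are hypothetical. `pyspark` is imported lazily inside `make_context`, so the pure-Python part can be checked without a Spark installation.

```python
def jar_conf_pairs(jar_paths):
    """Return (key, value) pairs suitable for SparkConf.setAll."""
    if not jar_paths:
        raise ValueError("at least one JAR path is required")
    return [("spark.jars", ",".join(jar_paths))]


def make_context(jar_paths):
    """Create (or reuse) a SparkContext with the given JARs on the classpath."""
    # Imported here so that the helper above does not require pyspark.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAll(jar_conf_pairs(jar_paths))
    return SparkContext.getOrCreate(conf)


# Hypothetical paths, for illustration only.
print(jar_conf_pairs(["/path/to/first.jar", "/path/to/second.jar"]))
```

Calling `make_context([...])` replaces the two-line snippet above. Note that `getOrCreate` returns an already running context if there is one, in which case the new JAR setting does not take effect.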
