Kafka Stream to Spark Stream python


Problem description

We have a Kafka stream which uses Avro. I need to connect it to Spark Streaming. I use the code below, as Lev G suggested.

kvs = KafkaUtils.createDirectStream(ssc, [topic], {"metadata.broker.list": brokers}, valueDecoder=MessageSerializer.decode_message) 
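As a side note on what a `valueDecoder` such as `MessageSerializer.decode_message` has to do: messages produced with a Confluent-style Avro serializer usually carry a small binary header before the Avro payload. The sketch below is illustrative only, assuming the Confluent wire format (magic byte `0x00` plus a big-endian 4-byte schema registry id); the helper name `parse_confluent_header` is hypothetical, not part of any library.

```python
import struct

def parse_confluent_header(raw_bytes):
    """Split a Confluent wire-format Kafka message into (schema_id, payload).

    Illustrative sketch only: a decoder like MessageSerializer.decode_message
    typically strips a 5-byte header (magic byte 0x00 + big-endian 4-byte
    schema registry id) before handing the payload to an Avro reader.
    """
    if len(raw_bytes) < 5 or raw_bytes[0] != 0:
        raise ValueError("not a Confluent wire-format message")
    schema_id = struct.unpack(">I", raw_bytes[1:5])[0]
    return schema_id, raw_bytes[5:]

# Example: header for schema id 42 followed by the (Avro-encoded) payload
schema_id, payload = parse_confluent_header(b"\x00\x00\x00\x00\x2a" + b"avro-bytes")
```

The actual Avro decoding of `payload` then requires fetching the writer schema for `schema_id` from the schema registry, which the real decoder handles for you.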

I got the error below when I executed it through spark-submit.

2018-10-09 10:49:27 WARN YarnSchedulerBackend$YarnSchedulerEndpoint:66 - Requesting driver to remove executor 12 for reason Container marked as failed: container_1537396420651_0008_01_000013 on host: server_name. Exit status: 1. Diagnostics: [2018-10-09 10:49:25.810]Exception from container-launch. Container id: container_1537396420651_0008_01_000013 Exit code: 1

[2018-10-09 10:49:25.810]

[2018-10-09 10:49:25.811]Container exited with a non-zero exit code 1. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err :

Last 4096 bytes of stderr :

Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000d5580000, 702545920, 0) failed; error='Cannot allocate memory' (errno=12)

[2018-10-09 10:49:25.822]

[2018-10-09 10:49:25.822]Container exited with a non-zero exit code 1. Error file: prelaunch.err.

Last 4096 bytes of prelaunch.err : Last 4096 bytes of stderr :

Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000d5580000, 702545920, 0) failed; error='Cannot allocate memory' (errno=12)

I used the command below.

spark-submit --master yarn --py-files ${BIG_DATA_LIBS}v3io-py.zip --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.0 --jars ${BIG_DATA_LIBS}v3io-hcfs_2.11.jar,${BIG_DATA_LIBS}v3io-spark2-object-dataframe_2.11.jar,${BIG_DATA_LIBS}v3io-spark2-streaming_2.11.jar ${APP_PATH}/${SCRIPT_PATH}/kafka_to_spark_stream.py

All variables are exported correctly. What is this error?

Recommended answer

Could it be that you don't allocate enough memory on the driver/executors to handle the stream?
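The `os::commit_memory ... Cannot allocate memory (errno=12)` line in the log does point at the JVM failing to reserve heap inside the YARN container. One way to act on this, assuming the job runs on YARN with default container sizes, is to set the memory and parallelism flags explicitly on the existing command; the values below are placeholders to tune for your cluster, not recommendations:

```shell
spark-submit --master yarn \
  --driver-memory 2g \
  --executor-memory 2g \
  --executor-cores 1 \
  --num-executors 2 \
  --py-files ${BIG_DATA_LIBS}v3io-py.zip \
  --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.0 \
  --jars ${BIG_DATA_LIBS}v3io-hcfs_2.11.jar,${BIG_DATA_LIBS}v3io-spark2-object-dataframe_2.11.jar,${BIG_DATA_LIBS}v3io-spark2-streaming_2.11.jar \
  ${APP_PATH}/${SCRIPT_PATH}/kafka_to_spark_stream.py
```

Note that the requested executor memory (plus overhead) must also fit within YARN's per-container maximum, or the container will still be killed.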
