Spark - How to create a variable that is different for each executor context?
Question
My Spark application launches several executors. I have several partitions that get spread over my executors.
When using map() on these partitions, I want to use a MongoDB connection (MongoDB Java Driver) to query more data, process it, and return it as the output of the map() function.
I want to create one connection per executor. Each partition should then access this executor-local variable and use it to query the data.
Establishing a connection for each partition is probably not a good idea. Broadcasting the connection won't work either because it is not serializable (I think?).
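One common workaround for the serialization problem is to broadcast not the connection itself but a small serializable holder whose client field is marked transient, so it is dropped during serialization and rebuilt lazily on each executor JVM. A minimal sketch of that idea, where FakeClient is a stand-in for a real com.mongodb.MongoClient and all names are illustrative:

```java
import java.io.Serializable;

// Sketch of an executor-local lazy connection. The holder is serializable,
// but the client field is transient, so it never travels over the wire:
// each executor JVM rebuilds its own client on first use.
class LazyClientHolder implements Serializable {
    private final String uri;
    private transient FakeClient client; // null after deserialization

    LazyClientHolder(String uri) { this.uri = uri; }

    // One client per holder per JVM, created lazily on first access.
    synchronized FakeClient get() {
        if (client == null) {
            client = new FakeClient(uri);
        }
        return client;
    }

    // Stand-in for com.mongodb.MongoClient, keeping the sketch self-contained.
    static class FakeClient {
        final String uri;
        FakeClient(String uri) { this.uri = uri; }
    }
}
```

On the driver you would broadcast the holder; inside map() each task calls get(), and all tasks running in the same executor JVM share the single lazily-created client.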
To summarize:
- How can I create a variable that is different for each executor context?
Answer
You should use the MongoConnector.
It handles creating collections and is backed by a cache that efficiently handles the shutdown of any MongoClients. It is serializable, so it can be broadcast, and it can take options, a ReadConfig, or the Spark context to configure where to connect to.
MongoConnector uses the loan pattern to handle reference management of the underlying connection to MongoDB, and allows access at the MongoClient, MongoDatabase, or MongoCollection level.
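The loan pattern the answer refers to can be sketched as follows. Note that withClientDo, Connector, and FakeClient here are illustrative stand-ins, not the real mongo-spark-connector API: the point is that the connector acquires the client, "loans" it to the caller's function, and guarantees release afterwards.

```java
import java.util.function.Function;

// Illustrative loan pattern: the connector owns the client's lifecycle
// and hands it to the caller's code only for the duration of the call.
class Connector {
    private final String uri;

    Connector(String uri) { this.uri = uri; }

    <T> T withClientDo(Function<FakeClient, T> code) {
        FakeClient client = new FakeClient(uri); // acquire
        try {
            return code.apply(client);           // loan to the caller
        } finally {
            client.close();                      // always released
        }
    }

    // Stand-in for a real MongoClient, keeping the sketch self-contained.
    static class FakeClient {
        final String uri;
        boolean closed = false;
        FakeClient(String uri) { this.uri = uri; }
        void close() { closed = true; }
    }
}
```

Because the caller never holds the client outside the loaned scope, the connector can cache and shut down clients safely behind the scenes, which is what makes it safe to broadcast.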