Connecting Spark to Multiple Mongo Collections


Problem Description

I have the following MongoDB collections: employee and details.

Now I have a requirement where I have to get documents from both collections into Spark to analyze the data.

I tried the code below, but it does not seem to work:

import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

SparkConf conf = new SparkConf().setAppName("DBConnection").setMaster("local[*]")
                .set("spark.app.id", "MongoSparkExample")
                .set("spark.mongodb.input.uri", "mongodb://localhost/Emp.employee")
                .set("spark.executor.memory", "6g");

SparkSession session = SparkSession.builder().appName("Member Log")
                .config(conf).getOrCreate();

SparkConf dailyconf = new SparkConf().setAppName("DBConnection").setMaster("local[*]")
                .set("spark.app.id", "Mongo Two Example")
                .set("spark.mongodb.input.uri", "mongodb://localhost/Emp.details");

// getOrCreate() returns the session created above, so the input URI
// in dailyconf never takes effect.
SparkSession mongosession = SparkSession.builder().appName("Daily Log")
                .config(dailyconf).getOrCreate();

Any pointers would be highly appreciated.

Solution

I fixed this by adding the code below. The second SparkSession.builder().getOrCreate() call returns the session that already exists, so the second input URI is ignored; instead, the existing session's context is reused with a ReadConfig that overrides the target collection:

import com.mongodb.spark.MongoSpark;
import com.mongodb.spark.config.ReadConfig;
import org.apache.spark.api.java.JavaSparkContext;
import java.util.*;

// Reuse the running session's context; the ReadConfig override redirects
// this read to the "details" collection.
JavaSparkContext newcontext = new JavaSparkContext(session.sparkContext());
Map<String, String> readOverrides = new HashMap<String, String>();
readOverrides.put("collection", "details");
readOverrides.put("readPreference.name", "secondaryPreferred");
ReadConfig readConfig = ReadConfig.create(newcontext).withOptions(readOverrides);
MongoSpark.load(newcontext, readConfig);
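
For completeness, here is a minimal sketch of how both collections could then be pulled into DataFrames and joined for analysis. It assumes the setup above; the join key empId is a hypothetical field name, since the actual document schema is not shown in the question.

import com.mongodb.spark.MongoSpark;
import com.mongodb.spark.config.ReadConfig;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import java.util.Collections;

// The default collection ("employee") comes from spark.mongodb.input.uri.
Dataset<Row> employees = MongoSpark.load(newcontext).toDF();

// The second collection is read via a per-read override of the same context.
ReadConfig detailsConfig = ReadConfig.create(newcontext)
        .withOptions(Collections.singletonMap("collection", "details"));
Dataset<Row> details = MongoSpark.load(newcontext, detailsConfig).toDF();

// Join on a shared key ("empId" is hypothetical) and inspect the result.
employees.join(details, "empId").show();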
