How can Spark speed up bulk loading into JanusGraph?


Question

I need to load a large number of vertices and edges from another data store into JanusGraph with a Cassandra backend. I have read the documentation on bulk loading and on configuring Spark (https://docs.janusgraph.org/advanced-topics/bulk-loading/ and https://docs.janusgraph.org/advanced-topics/hadoop/).

It is clear how to configure JanusGraph to use Spark, but I am still not sure how to actually use Spark after that, or whether Spark can help speed up inserts into the graph.

Please share some use cases and code examples that use Hadoop MapReduce or Spark to speed up bulk loading data into JanusGraph (Java or Python preferred). Any help is welcome!

Recommended answer

I recently worked on a proof-of-concept project that bulk loads data into JanusGraph using Apache Spark. We got pretty good performance loading data with Spark. The setup and sample code are provided in the articles below.

https://medium.com/@nitinpoddar/bulk-loading-data-into-janusgraph-ace7d146af05

https://medium.com/@nitinpoddar/bulk-loading-data-into-janusgraph-part-2-ca946db26582
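For readers who want a feel for the general shape before opening the articles, here is a minimal sketch of one common pattern. It is not taken from the linked articles: split the input into partitions, let each partition open its own graph connection, and commit in fixed-size batches instead of once per element. Everything here is an assumption made for illustration. `InMemoryGraph` is a hypothetical stand-in for a real graph connection (it is not a JanusGraph API), and `load_partition`, `batched`, and `batch_size` are made-up names. The sketch is plain Python so it runs without a cluster.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List


def batched(rows: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


class InMemoryGraph:
    """Hypothetical stand-in for a graph connection; NOT a JanusGraph API."""

    def __init__(self) -> None:
        self.pending: List[Dict] = []   # added but not yet committed
        self.vertices: List[Dict] = []  # committed vertices
        self.commits = 0

    def add_vertex(self, props: Dict) -> None:
        self.pending.append(props)

    def commit(self) -> None:
        self.vertices.extend(self.pending)
        self.pending.clear()
        self.commits += 1

    def close(self) -> None:
        pass


def load_partition(
    rows: Iterable[Dict],
    open_graph: Callable[[], InMemoryGraph],
    batch_size: int = 1000,
) -> int:
    """Load one partition of rows, committing once per batch.

    One connection is opened per partition, not per row, so the
    connection cost is paid once for the whole partition.
    """
    graph = open_graph()
    loaded = 0
    try:
        for chunk in batched(rows, batch_size):
            for row in chunk:
                graph.add_vertex(row)
            graph.commit()  # one commit per batch instead of per vertex
            loaded += len(chunk)
    finally:
        graph.close()
    return loaded


if __name__ == "__main__":
    graph = InMemoryGraph()
    rows = ({"id": i, "name": f"v{i}"} for i in range(2500))
    print(load_partition(rows, lambda: graph, batch_size=1000))  # prints 2500
    print(graph.commits)  # prints 3 (batches of 1000, 1000, 500)
```

In a real PySpark job, a function shaped like `load_partition` would typically be handed to `rdd.foreachPartition(...)`, with `open_graph` replaced by whatever opens a real connection to the graph on the executor. Any speed-up then comes from many partitions writing in parallel, so the useful degree of parallelism depends on how much write load the Cassandra cluster can absorb.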
