Copy Solr HDFS Data to another Cluster

Problem description

I have a SolrCloud (v4.10) installation that sits on top of Cloudera (CDH 5.4.2) HDFS, with 3 Solr instances, each hosting a shard of each core. I am looking for a way to incrementally copy the Solr data from our production cluster to our development cluster. There are 3 cores, but I am only interested in copying one of them.

I have tried to use Solr replication (backup and restore), but that doesn't seem to load anything into the dev cluster.

http://host:8983/solr/core/replication?command=backup&location=/solr_transfer&name=core-name
http://host:8983/solr/core/replication?command=restore&location=/solr_transfer&name=core-name

I also tried to snapshot the /solr directory in the HDFS prod cluster and use hadoop distcp to copy the files, but the Solr indexer deletes some of the files, so the distcp job fails.

hadoop distcp hftp://prod:50070/solr/* hdfs://dev:8020/solr/

Can anyone help me here?

Solution

After a lot of trying, this is the solution we worked out:
- Initialise Solr in the second environment with all the collections in the same way as the primary.
- Take a snapshot of HDFS.
- Use hadoop hdfs -cp to copy the data up to the checkpoint.

After the first run the copy job will be quick, as you are only copying the increments.
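
The snapshot and copy steps above can be sketched as a small shell script. This is only an illustration under stated assumptions, not the exact commands the answer used: the directory `/solr/core-name`, the destination `hdfs://dev:8020/solr/core-name`, the snapshot naming scheme, and the `run` and `snapshot_path` helpers are all hypothetical. It also swaps in `hadoop distcp -update` for the copy step, since `-update` skips files that already exist unchanged on the destination; the answer itself names a `-cp` command. The script assumes it is run on the production cluster by a user allowed to manage snapshots, and it only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of "snapshot, then copy from the snapshot". Every path, host and
# name below is an assumption for illustration, not taken from the answer.
set -eu

SRC_DIR="${SRC_DIR:-/solr/core-name}"                   # hypothetical: HDFS dir of the one core to copy
DEST_URI="${DEST_URI:-hdfs://dev:8020/solr/core-name}"  # hypothetical: target dir on the dev cluster
DRY_RUN="${DRY_RUN:-1}"                                 # 1 = print commands only, 0 = execute them

# Print a command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Build the read-only path of a named snapshot inside a snapshottable directory.
snapshot_path() {
  echo "${1%/}/.snapshot/$2"
}

copy_snapshot() {
  name="$1"
  # Admin step: mark the directory as snapshottable.
  run hdfs dfsadmin -allowSnapshot "$SRC_DIR"
  # Freeze a point-in-time view, so files the indexer deletes later
  # are still present in the copy source.
  run hdfs dfs -createSnapshot "$SRC_DIR" "$name"
  # Copy from the frozen view; -update skips files already copied unchanged.
  run hadoop distcp -update "$(snapshot_path "$SRC_DIR" "$name")" "$DEST_URI"
}

# Default snapshot name is date-based (an assumed convention).
copy_snapshot "${1:-sync-$(date +%Y%m%d)}"
```

Reading from the `.snapshot` path rather than the live directory is what avoids the failure described in the question, where files disappeared while the copy was running. Old snapshots keep deleted blocks alive, so they would need to be removed with `hdfs dfs -deleteSnapshot` once they are no longer required.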
