MarkLogic Cluster - Configure Forest with all documents


Problem description

We are working with MarkLogic 9.0.8.2.

We are setting up a MarkLogic cluster (3 VMs) on Azure and, as per our failover design, want to have 3 forests (one per node) in Azure Blob.

I finished the setup, and when I started ingestion I found that the documents are distributed across the 3 forests rather than all being stored in each forest.

For example, I ingested 30000 records and each forest contains 10000 records.

What I need is for every forest to hold all 30000 records.

Is there any configuration (at the database or forest level) I need to achieve this?

Solution

Failover in MarkLogic does not work the same way as in some other NoSQL document databases, which may keep a copy of every document on each host.

The clustered nature of MarkLogic distributes the documents across the hosts to provide a balance of availability and resource consumption. For failover protection, you must create additional forests on each host and attach them to your existing forests as replicas. This ensures availability should any 1 of the 3 hosts fail.

Here is a sample forest layout:

Host 1:    primary_forest_01     replica_forest_03
Host 2:    primary_forest_02     replica_forest_01
Host 3:    primary_forest_03     replica_forest_02
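
To see what this layout means for the 30000 documents from the question, here is a minimal Python sketch. It is not MarkLogic code: it only models the sample layout as plain data, it assumes the even 10000-per-forest split reported in the question, and the names `LAYOUT`, `docs_per_host`, and `available_docs` are made up for illustration.

```python
# Model of the sample layout: host -> forests stored on that host.
# "primary_forest_NN" and "replica_forest_NN" hold the same documents.
LAYOUT = {
    "host1": ["primary_forest_01", "replica_forest_03"],
    "host2": ["primary_forest_02", "replica_forest_01"],
    "host3": ["primary_forest_03", "replica_forest_02"],
}

# Assumption: documents are split evenly, as observed in the question.
DOCS_PER_FOREST = 10000


def forest_number(forest_name):
    """Return the trailing number, e.g. '03' for 'replica_forest_03'."""
    return forest_name.rsplit("_", 1)[1]


def docs_per_host(layout):
    """Documents physically stored on each host (primary plus replica copies)."""
    return {host: len(forests) * DOCS_PER_FOREST for host, forests in layout.items()}


def available_docs(layout, failed_host):
    """Distinct documents still reachable when one host is down."""
    surviving = set()
    for host, forests in layout.items():
        if host != failed_host:
            surviving.update(forest_number(f) for f in forests)
    return len(surviving) * DOCS_PER_FOREST


if __name__ == "__main__":
    print(docs_per_host(LAYOUT))
    for host in LAYOUT:
        print(host, "down ->", available_docs(LAYOUT, host), "documents available")
```

In this model each host stores 20000 of the 30000 documents rather than all of them, yet every document stays available when any single host goes down.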

The replica forest must be on a different host than the primary forest, and if there are multiple forests per host, they should be striped across hosts to best balance out resource consumption when failed over.
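
The striping rule can be sketched as a small planning function. This is only an illustration in Python under stated assumptions: `plan_replicas` is a hypothetical helper, the forest names follow the naming used in the sample layout, and the round-robin offset is one simple way to stripe, not something prescribed by MarkLogic. It produces a plan on paper only; creating the forests and attaching replicas still has to be done through the MarkLogic admin tooling.

```python
def plan_replicas(hosts, forests_per_host):
    """Return {host: {"primary": [...], "replica": [...]}} with replicas striped."""
    n = len(hosts)
    if n < 2:
        raise ValueError("replicas need at least two hosts")
    plan = {h: {"primary": [], "replica": []} for h in hosts}
    for i, host in enumerate(hosts):
        for j in range(forests_per_host):
            number = i * forests_per_host + j + 1
            plan[host]["primary"].append(f"primary_forest_{number:02d}")
            # The offset cycles through 1..n-1, so a replica never lands on
            # the host of its own primary, and one host's replicas are
            # spread over the remaining hosts.
            offset = 1 + (j % (n - 1))
            target = hosts[(i + offset) % n]
            plan[target]["replica"].append(f"replica_forest_{number:02d}")
    return plan


if __name__ == "__main__":
    for host, forests in plan_replicas(["host1", "host2", "host3"], 2).items():
        print(host, forests["primary"], forests["replica"])
```

With one forest per host this reproduces the sample layout above. With two forests per host, the two primaries on a failed host fail over to two different surviving hosts instead of doubling the load on one of them.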

It's also important to note that for HA, you need replicas configured for the system databases as well.

So there is no database setting that puts all the documents on every host, because that is not the way MarkLogic is designed to work. The Scalability, Availability and Failover Guide is very informative, and in this case, the High Availability of Data Nodes with Failover section is particularly relevant. I also highly recommend checking out the free training that MarkLogic offers.
