Should I create multiple document types or multiple indexes?

Question

We host many websites for businesses. Each business has a number of document types that it may want indexed and searchable through Elasticsearch (ES).

A business normally has fewer than 20 document types, and each type may hold fewer than 100k documents (usually far fewer).

I'm not sure how I should set up the data for these websites. Should I put them into separate indices, or should I put them all into the same index with different document types? Or is there another approach?

Or should I even go as far as indexing small and medium sites differently? What worst-case scenarios should I be prepared for if I plan to grow to 50K sites?

Recommended answer

If you create one index with several mapping types, you take on a significant constraint: a field with the same name in two different mapping types must not have two different data types. For example, within the same index you can't have a field named blablaCount that is a long in one mapping type and a double in another.
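To make the constraint concrete, here is a minimal sketch of a check you could run before merging several mapping types into one index. It does not talk to Elasticsearch; the flat field-name-to-datatype dictionaries and the type names invoice and product are assumptions made up for illustration, and only blablaCount comes from the answer.

```python
from collections import defaultdict


def find_field_conflicts(mappings):
    """Return {field: {type_name: datatype}} for every field whose
    datatype differs between mapping types.

    `mappings` is a simplified, hypothetical structure:
    {mapping_type_name: {field_name: datatype}}.
    """
    seen = defaultdict(dict)
    for type_name, fields in mappings.items():
        for field, datatype in fields.items():
            seen[field][type_name] = datatype
    # Keep only fields declared with more than one distinct datatype.
    return {
        field: by_type
        for field, by_type in seen.items()
        if len(set(by_type.values())) > 1
    }


# Hypothetical mapping types that would share one index.
mappings = {
    "invoice": {"blablaCount": "long", "title": "string"},
    "product": {"blablaCount": "double", "title": "string"},
}

conflicts = find_field_conflicts(mappings)
print(conflicts)
```

Here blablaCount is reported because it is a long in one type and a double in the other, while title is not, since both types declare it the same way.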

Your mileage may vary, but since ES 2.0 and its major mapping refactoring, the usual recommendation is to use several indices with one mapping type per index.

What I would do is create several indices with one mapping/document type per index, then group all indices belonging to a given business under an alias. That way, if you need to query all indices of a given business, you can simply query that business's alias.
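As a rough sketch of the alias approach, the function below builds a request body of the shape accepted by the Elasticsearch `_aliases` endpoint (a list of add actions). The naming scheme business-<id>-<type>, the business id acme, and the document type names are assumptions for illustration only; sending the body to a cluster is left out.

```python
import json


def alias_actions(business_id, doc_types):
    """Build an `_aliases` request body that groups one index per
    document type under a single per-business alias.

    The index/alias naming scheme is hypothetical.
    """
    alias = f"business-{business_id}"
    actions = [
        {"add": {"index": f"{alias}-{doc_type}", "alias": alias}}
        for doc_type in doc_types
    ]
    return {"actions": actions}


# Hypothetical business with two document types.
body = alias_actions("acme", ["articles", "products"])
print(json.dumps(body, indent=2))
```

Once such an alias exists, a search against business-acme would cover every index attached to it, so callers would not need to know the individual index names.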

Another option is to put all documents of all businesses in the same set of indices and discriminate between businesses with a term query on a businessId field, or even by routing on the businessId.
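A minimal sketch of this shared-index option follows. It only builds the search body and the query-string parameters; the helper name business_search and the sample match query are hypothetical. It assumes businessId is indexed as an exact, non-analyzed value so that the term query matches it verbatim, and that documents were indexed with the same routing value that is used at search time.

```python
def business_search(business_id, user_query):
    """Return (body, params) for a search restricted to one business.

    The term filter keeps other businesses' documents out of the results;
    the routing parameter limits the search to the shard(s) holding that
    business's documents.
    """
    body = {
        "query": {
            "bool": {
                "must": user_query,
                "filter": [{"term": {"businessId": business_id}}],
            }
        }
    }
    params = {"routing": business_id}
    return body, params


# Hypothetical full-text query supplied by the website.
body, params = business_search("acme", {"match": {"title": "invoice"}})
print(body)
print(params)
```

Keeping the term filter even when routing is used matters, because several businesses can be routed to the same shard.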

However, in your case, since each business doesn't have that many documents, creating a full set of indices for each business might be a waste of resources. I'd probably go with the second option: create one set of indices, each with its own mapping/document type, and store all documents from all businesses in those indices.
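The resource argument can be illustrated with a back-of-the-envelope count, using the figures from the question (50K sites, up to 20 document types each). The one-primary-shard-per-index setting, and the idea that all businesses draw from the same 20 or so document types, are assumptions; adjust them to your own setup.

```python
def indices_per_business_layout(businesses, types_per_business):
    """Option 1: every business gets its own index per document type."""
    return businesses * types_per_business


def indices_shared_layout(distinct_types):
    """Option 2: one shared index per document type for all businesses."""
    return distinct_types


BUSINESSES = 50_000           # growth target from the question
TYPES_PER_BUSINESS = 20       # upper bound from the question
PRIMARY_SHARDS_PER_INDEX = 1  # assumption; depends on index settings

per_business = indices_per_business_layout(BUSINESSES, TYPES_PER_BUSINESS)
shared = indices_shared_layout(TYPES_PER_BUSINESS)

print("per-business layout:", per_business, "indices,",
      per_business * PRIMARY_SHARDS_PER_INDEX, "primary shards")
print("shared layout:", shared, "indices,",
      shared * PRIMARY_SHARDS_PER_INDEX, "primary shards")
```

Under these assumptions the per-business layout grows to a million indices in the worst case, while the shared layout stays at a few dozen regardless of how many sites are added.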
