What is an index in Elasticsearch?

Question

What is an index in Elasticsearch? Does one application have multiple indexes or just one?

Let's say you built a system for a car manufacturer. It deals with people, cars, spare parts, etc. Do you have one index named manufacturer, or do you have one index for people, one for cars and a third for spare parts? Could someone explain?

Recommended answer

Good question, and the answer is a lot more nuanced than one might expect. You can use indices for several different purposes.

The easiest and most familiar layout mirrors what you would expect from a relational database. You can (very roughly) think of an index as a database.

  • MySQL => Databases => Tables => Rows/Columns
  • ElasticSearch => Indices => Types => Documents with Properties

An ElasticSearch cluster can contain multiple Indices (databases), which in turn contain multiple Types (tables). These types hold multiple Documents (rows), and each document has Properties (columns).

So in your car manufacturing scenario, you may have a SubaruFactory index. Within this index, you have three different types:

  • People
  • Cars
  • Spare Parts

Each type then contains documents that correspond to that type. For example, a Subaru Impreza doc lives inside the Cars type, and that doc contains all the details about that particular car.
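To make the analogy concrete, here is a minimal sketch that models the index/type/document/property hierarchy as nested Python dictionaries. It is only a mental model, not how ElasticSearch stores data; the property names (`model`, `doors`) and the `get_document` helper are made up for illustration, and the index name is written in lowercase because ElasticSearch rejects uppercase index names.

```python
# Toy model of the hierarchy: index -> type -> document -> properties.
# All property names and values below are hypothetical.
cluster = {
    "subarufactory": {                # index (roughly a database)
        "People": {},                 # type (roughly a table)
        "Cars": {
            "SubaruImprezza": {       # document (roughly a row)
                "model": "Impreza",   # property (roughly a column)
                "doors": 4,
            },
        },
        "Spare_Parts": {},
    },
}


def get_document(cluster: dict, index: str, doc_type: str, doc_id: str) -> dict:
    """Walk index -> type -> document, like addressing a row in a table."""
    return cluster[index][doc_type][doc_id]


print(get_document(cluster, "subarufactory", "Cars", "SubaruImprezza"))
```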

Search queries take the form: http://localhost:9200/[index]/[type]/[operation]

So to retrieve the Subaru document, I may do this:

  # Index names must be lowercase, so the SubaruFactory index is addressed as subarufactory
  $ curl -XGET localhost:9200/subarufactory/Cars/SubaruImprezza
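As a rough illustration of how such a document URL is assembled, the sketch below builds the [index]/[type]/[id] path in Python. The `document_url` helper and `BASE_URL` constant are hypothetical names, and the path shape follows the type-based layout described in this answer; later Elasticsearch releases removed mapping types, so treat it as specific to the versions the answer targets.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:9200"  # assumed local node, as in the curl example


def document_url(index: str, doc_type: str, doc_id: str) -> str:
    """Build an [index]/[type]/[id] URL for fetching a single document."""
    # Index names must be lowercase, so normalise the index name first.
    parts = [index.lower(), doc_type, doc_id]
    # Percent-encode each path segment so spaces or slashes cannot break the path.
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


print(document_url("SubaruFactory", "Cars", "SubaruImprezza"))
```

Passing the resulting string to any HTTP client with a GET request corresponds to the curl call above.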

Now, the reality is that Indices/Types are much more flexible than the Database/Table abstractions we are used to in RDBMSs. They can be considered convenient data organization mechanisms, with added performance benefits depending on how you set up your data.

To demonstrate a radically different approach, a lot of people use ElasticSearch for logging. A standard layout is to assign a new index to each day. Your list of indices may look like this:

  • logs-2013-02-22
  • logs-2013-02-21
  • logs-2013-02-20

ElasticSearch allows you to query multiple indices at the same time, so it isn't a problem to do:

  $ curl -G 'localhost:9200/logs-2013-02-22,logs-2013-02-21/Errors/_search' --data-urlencode 'q="Error Message"'

This searches the logs from the last two days at the same time. The layout suits the nature of logs: most logs are never looked at, and they are organized in a linear flow of time. Making one index per day of logs is more logical, and it can make searching faster because a query only has to touch the indices for the days it covers.
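The daily-index layout can be sketched in a few lines of Python: generate the index names for the last N days, then join them with commas into one search URL. The helper names `daily_indices` and `search_url` are hypothetical, the `logs-` prefix and `Errors` type come from the example above, and the URL shape assumes the type-based layout used throughout this answer.

```python
from datetime import date, timedelta
from typing import List
from urllib.parse import quote


def daily_indices(last_day: date, days: int, prefix: str = "logs-") -> List[str]:
    """Return one index name per day for `days` days ending at `last_day`, newest first."""
    return [prefix + (last_day - timedelta(days=n)).isoformat() for n in range(days)]


def search_url(indices: List[str], doc_type: str, query: str,
               base_url: str = "http://localhost:9200") -> str:
    """Build a URI-search URL that targets several indices at once."""
    # Comma-separated index names tell ElasticSearch to search all of them.
    return f"{base_url}/{','.join(indices)}/{doc_type}/_search?q={quote(query)}"


recent = daily_indices(date(2013, 2, 22), 2)
print(recent)
print(search_url(recent, "Errors", '"Error Message"'))
```

Widening the search window is then just a matter of raising `days`, while older indices are left untouched.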

Another radically different approach is to create an index per user. Imagine you have a social networking site, and each user has a large amount of random data. You can create a single index for each user. Your structure may look like:

  • Zach's Index
    • Hobbies Type
    • Friends Type
    • Pictures Type
  • Another user's Index
    • Hobbies Type
    • Friends Type
    • Pictures Type

Notice how this setup could easily be done in a traditional RDBMS fashion (e.g. a "Users" index, with hobbies/friends/pictures as types). All users would then be thrown into a single, giant index.

Instead, it sometimes makes sense to split data apart for data organization and performance reasons. In this scenario, we are assuming each user has a lot of data, and we want to keep each user's data separate. ElasticSearch has no problem letting us create an index per user.
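If you go the index-per-user route, you need some rule for turning a user name into a valid index name. The sketch below shows one possible rule; the `user-` prefix, the `user_index_name` helper and the slug scheme are all assumptions made for illustration, not anything ElasticSearch prescribes beyond requiring lowercase index names.

```python
import re


def user_index_name(username: str, prefix: str = "user-") -> str:
    """Derive a lowercase, URL-friendly index name for one user (hypothetical scheme)."""
    # Lowercase first, then collapse anything outside a-z, 0-9, "_" and "-" into "-".
    slug = re.sub(r"[^a-z0-9_-]+", "-", username.lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot derive an index name from {username!r}")
    return prefix + slug


print(user_index_name("Zach"))
print(user_index_name("Fred Smith"))
```

A real system would also have to guarantee that two different users never map to the same index name, which this sketch does not attempt.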
