How reliable is Elasticsearch as a primary datastore against factors like write loss and data availability?


Question

I am working on a project that requires a generic dashboard where users can do different kinds of grouping, filtering and drill-down on different fields. For this we are looking for a search store that allows slicing and dicing of the data.

There will be multiple sources of data, all of which will be stored in the search store. Some pre-computation may be required on the source data, which can be done by an intermediate component.

I have looked through several blogs to understand whether ES can also be used reliably as a primary datastore. It mostly depends on the use case. Some information about our use case:

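  • Around 300 million records each year, at 1-2 KB per record.
  • Assuming we store one year of data, we are at about 300 GB today, but this could reach 400-500 GB as the data grows.
  • We are not yet sure how we will push data, but roughly it could reach ~2-3 million records per 5 minutes.
  • Search request volume is low, but the queries are complex and may search data from the last 6 weeks to 6 months.
  • Documents will be indexed on almost all of their fields.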
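As a rough sanity check on the figures in the list above, the arithmetic can be sketched in a few lines of Python. This is only a back-of-envelope estimate under assumptions that are not in the question: decimal units (1 KB = 1000 bytes, 1 GB = 10**9 bytes), raw document size only (no replicas, no index overhead), and hypothetical names chosen for illustration.

```python
# Back-of-envelope sizing from the figures quoted in the question.
# Assumptions (not from the question): decimal units, raw document size only
# (no replicas, no index overhead). All names here are illustrative.

RECORDS_PER_YEAR = 300_000_000
RECORD_SIZE_BYTES = (1_000, 2_000)       # 1-2 KB per record (low, high)
BATCH_RECORDS = (2_000_000, 3_000_000)   # 2-3 million records per push (low, high)
BATCH_WINDOW_SECONDS = 5 * 60            # one push window is 5 minutes


def raw_size_gb(records, record_bytes):
    """Raw data volume in GB for a number of records of a given size."""
    return records * record_bytes / 1e9


def docs_per_second(batch_records, window_seconds):
    """Average indexing rate needed to absorb one batch within its window."""
    return batch_records / window_seconds


# Raw yearly volume at 1 KB and at 2 KB per record.
yearly_gb = tuple(raw_size_gb(RECORDS_PER_YEAR, b) for b in RECORD_SIZE_BYTES)

# Indexing rate during a push window, for 2 million and 3 million records.
peak_rate = tuple(docs_per_second(n, BATCH_WINDOW_SECONDS) for n in BATCH_RECORDS)

# Number of full 5-minute windows that would already cover a whole year of data.
windows_for_year = tuple(RECORDS_PER_YEAR / n for n in BATCH_RECORDS)

print(f"raw yearly volume: {yearly_gb[0]:.0f}-{yearly_gb[1]:.0f} GB")
print(f"rate during a push: {peak_rate[0]:.0f}-{peak_rate[1]:.0f} docs/s")
print(f"full windows per year of data: {windows_for_year[1]:.0f}-{windows_for_year[0]:.0f}")
```

Under these assumptions the raw volume comes out at 300-600 GB per year, and 100-150 full push windows would already account for a whole year of records. That suggests the "2-3 million records per 5 minutes" figure describes occasional bursts rather than a sustained rate, which matters when judging how much write loss during a burst would be tolerable.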

Some blogs say that it is reliable enough to use as a primary data store -

And some blogs say that ES has a few limitations -

Has anyone used Elasticsearch as the sole source of truth for their data, without a primary store such as PostgreSQL, DynamoDB or RDS? I have read that ES has certain issues, such as split brain and index corruption, that can lead to data loss. So I would like to know whether anyone has used ES this way and run into trouble with their data.

Thanks.

Recommended answer

Short answer: it depends on your use case, but you probably don't want to use it as a primary store.

Longer answer: You should really understand all of the possible issues that can come up around resiliency and data loss. Elastic has some great documentation of these issues, which you should understand before using it as a primary data store. In addition, Aphyr's post on the topic is a good resource.

If you understand the risks you are taking and you believe that those risks are acceptable (e.g. because a small amount of data loss is not a problem for your application), then you should feel free to go ahead and try it.
