Using ElasticSearch as primary source for part of my DB


Question

I have seen many similar questions on this topic (including this one, which talks about how ElasticSearch version 6 has overcome many of its limitations as a primary data store), but I am still not clear on the following:

I am creating an online shopping website and I am using MySQL as my DB.

This is a simplified version of my DB (a User can post a Product on the website for sale).

I am learning about ElasticSearch and I want to use it to search the products on my website. I don't need User and ProductReview to be searched, only the Product table.

I can think of 2 solutions to achieve this:

  1. Periodically copy the Product table from MySQL to ES
  2. Keep User and ProductReview in MySQL and Product in ES

As far as I know, if I use option 1, then I can use go-mysql-elasticsearch to sync ES with MySQL. Is this a good solution?

I am more tempted to use option 2, as it is easier and I don't need to worry about data synchronization. What concerns me about this option is:

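  • Is ES reliable enough to be used as the primary source of data?
  • If at some point I have to modify the Product table structure, would I be able to do so without deleting and recreating the Product index?
  • With MySQL, I usually take a backup of the Prod DB and restore it on the Test DB. Would it still be possible to back up and restore from Prod to Test using ES?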

I have no experience with ES/NoSQL and would appreciate any advice.

Recommended answer

Let me start by stating that Elasticsearch is NOT a database, in the strict sense of the term, and should ideally not be used as such. However, nothing prevents you from doing it (and many people are doing it) and according to the good folks at Elastic, they won't ever strive to try and make ES a real database. The main goal of ES is to be a fast and reliable search and analytics engine, period.

If you can, you should always keep another primary source of truth from which you can easily (re-)build your ES indices anytime if something goes south.

In your case, option 1 seems to be the way to go since all you want to do is allow users to search your products, so there's no point in syncing the other tables into ES.
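To make option 1 a bit more concrete, here is a minimal sketch of the "copy Product rows into ES" step. It only builds the newline-delimited JSON body that the ES `_bulk` endpoint expects; fetching rows from MySQL and sending the HTTP request are left out. The function name `build_bulk_body`, the index name `products`, and the `id`/`name`/`price` columns are assumptions made for illustration, not something taken from your schema, and older ES versions may additionally need a `_type` in the action line.

```python
import json


def build_bulk_body(rows, index_name="products"):
    """Turn MySQL Product rows (dicts) into an ES _bulk request body.

    Each row yields two NDJSON lines: an "index" action line carrying the
    document id, followed by the document source itself.
    """
    lines = []
    for row in rows:
        doc = dict(row)          # copy so the caller's row is not mutated
        doc_id = doc.pop("id")   # assumed primary-key column, reused as the ES _id
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows, shaped like what a MySQL cursor returning dicts might give.
rows = [
    {"id": 1, "name": "Desk lamp", "price": 19.9},
    {"id": 2, "name": "Office chair", "price": 89.0},
]
body = build_bulk_body(rows)
```

Because the MySQL primary key is reused as the document `_id`, running the copy again overwrites existing documents instead of duplicating them, which is what makes a periodic re-sync safe to repeat.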

Option 2 sounds appealing, but only if you decide to go only with ES, which you really shouldn't if you want to rely on transactions (ES doesn't have transactional support). Another thing you need to know is that if you only have your data in ES and your index gets corrupted for some reason (during an upgrade, a bug in ES, a bug in your code, etc), your data is gone and your business will suffer.

So to answer your questions more precisely:

  1. ES can be reliable as a primary source of truth provided you throw enough effort and money into the game. However, you probably don't have millions of products and users (yet), so running an HA cluster with a minimum of three nodes to search a few thousand products with a few fields doesn't seem like a good spend.

  2. When your products table changes, it is easy to reindex the table into ES (or even to sync it in real time), and if you have a few thousand products, it can go fast enough that you don't really have to worry about it. If the sync fails for some reason, you can run the process again without wasting too much time. With the zero-downtime alias technique, you can do it without impacting your users.

  3. ES also provides snapshot/restore capabilities so that you can take a snapshot of PROD and install it in your TEST cluster with a single REST call.
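As a rough illustration of the zero-downtime alias technique: the application always searches through an alias, you build a fresh index next to the live one, and then you repoint the alias in a single atomic `POST /_aliases` request. The sketch below only builds that request body. The names `products`, `products_v1`, `products_v2`, and the helper `alias_swap_actions` are hypothetical.

```python
def alias_swap_actions(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves `alias` to `new_index`.

    Both actions are sent in one request, so ES applies them atomically and
    searches through the alias never see a moment without an index.
    """
    actions = []
    if old_index is not None:
        # Detach the alias from the index that is currently serving searches.
        actions.append({"remove": {"index": old_index, "alias": alias}})
    # Attach the alias to the freshly built index.
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


# Hypothetical names: the site searches "products"; "products_v2" was just
# rebuilt from MySQL and should replace "products_v1".
swap_body = alias_swap_actions("products", "products_v1", "products_v2")
```

Once the swap has gone through and you are happy with the new index, the old one (`products_v1` here) can be deleted.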
