ElasticSearch usage with MySQL


Problem description

I'm using ElasticSearch for the search component of a site. The data that is being indexed and eventually searched is the same data that is being saved in a MySQL DB.

My approach to this is to add/delete/modify data in the index when the corresponding CRUD MySQL operation happens.

For instance, a create operation looks something like this:

public function savePost(Request $request) {
    //Firstly, create the object and save it to MySQL
    $post = new Post();
    $post->title = $request->title;
    $post->body = $request->body;
    //...
    //and so on
    $post->save();

    //Secondly, index this new data:
    $elasticSearchClient = ClientBuilder::create()->build();

    $params = [
        'index' => 'some_index_elasticsearch',
        'id' =>  $post->id,
        'type' => 'post',
        'timestamp' => time(),
        'body' => [
            'id' => $post->id,
            'title' => $post->title,
            'body' => $post->body,
            //... and so on
        ],
    ];

    $elasticSearchClient->index($params);

}

If the data is deleted or updated in MySQL, I'd just delete it from the index or update it there accordingly.
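As a rough illustration of those two paths, the sketch below only builds the parameter arrays that would be handed to the client's `update()` and `delete()` methods in elasticsearch-php. The helper names `buildUpdateParams` and `buildDeleteParams` are made up for this example, and the `type` key is left out on the assumption that a recent Elasticsearch version without mapping types is in use.

```php
<?php

// Hypothetical helpers: they only build request arrays, so they run without a cluster.

/** Parameters for a partial update of an already indexed post. */
function buildUpdateParams(string $index, int $id, array $changedFields): array
{
    return [
        'index' => $index,
        'id'    => $id,
        // A partial update sends only the changed fields under "doc".
        'body'  => ['doc' => $changedFields],
    ];
}

/** Parameters for removing a post from the index. */
function buildDeleteParams(string $index, int $id): array
{
    return ['index' => $index, 'id' => $id];
}

$update = buildUpdateParams('some_index_elasticsearch', 42, ['title' => 'New title']);
$delete = buildDeleteParams('some_index_elasticsearch', 42);

// With a real client these would be sent after the MySQL write succeeds:
// $elasticSearchClient->update($update);
// $elasticSearchClient->delete($delete);
```

Keeping the array construction separate from the client call makes it easy to unit-test the mapping from a MySQL row to an index document.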

Is this the right approach to using MySQL with ElasticSearch (or any comparable technology such as Sphinx)? Or would you recommend a better approach that uses MySQL as more of a data source for ElasticSearch? (That really isn't happening here, because ElasticSearch and MySQL don't interact at all.)

I'm using https://github.com/elastic/elasticsearch-php to interact with ElasticSearch if it makes any difference.

Just to clarify: this approach does work so far - I'm just not sure if it is the right way, or if anyone can see problems that I may run into with this way of doing things.

Solution

There is no "right way" to use Elasticsearch. "Right" is relative, so the "right way" is whichever way supports your use case(s). Elasticsearch isn't limited to one specific use case; it serves a growing number of them.

The case you describe is a perfectly valid one: indexing in ES whatever content you hold in an RDBMS such as MySQL, and making sure the indexed content stays in sync with the primary source of truth.

One difficult thing in your use case that you need to keep in mind is that you have to guarantee that MySQL and ES are always 1:1 in sync, and that's not necessarily easy to do, for various reasons:

  • What happens if you need to bring ES down for maintenance, but your app has to stay up for whatever reason?
  • What happens if there's an issue in ES and a document doesn't get indexed/updated/deleted? (Remember, there's no transactional support.)
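One possible way to soften both scenarios is to treat a failed index call as deferred work rather than as a failed save. The following is only a sketch under stated assumptions: `SearchIndexer`, `UnavailableIndexer`, `PendingSyncQueue` and `indexOrDefer` are hypothetical names, and the in-memory queue stands in for whatever durable store (a MySQL table, a message queue) you would actually use.

```php
<?php

// Hypothetical abstraction over the search client so the sketch runs without a cluster.
interface SearchIndexer
{
    /** @throws RuntimeException when the document cannot be indexed */
    public function index(array $params): void;
}

// Stand-in that simulates Elasticsearch being down for maintenance.
final class UnavailableIndexer implements SearchIndexer
{
    public function index(array $params): void
    {
        throw new RuntimeException('search cluster unavailable');
    }
}

// Keeps failed operations so a background job can replay them later.
// In-memory only here; a real version would need durable storage (assumption).
final class PendingSyncQueue
{
    private array $pending = [];

    public function push(array $params): void
    {
        $this->pending[] = $params;
    }

    public function all(): array
    {
        return $this->pending;
    }
}

/** Tries to index; on failure keeps the request for a later retry. */
function indexOrDefer(SearchIndexer $indexer, PendingSyncQueue $queue, array $params): bool
{
    try {
        $indexer->index($params);
        return true;
    } catch (RuntimeException $e) {
        $queue->push($params);
        return false;
    }
}

$queue = new PendingSyncQueue();
$ok = indexOrDefer(new UnavailableIndexer(), $queue, [
    'index' => 'some_index_elasticsearch',
    'id'    => 1,
    'body'  => ['id' => 1, 'title' => 'Hello', 'body' => 'World'],
]);
```

The save to MySQL still succeeds while ES is unavailable, and the queued entries tell a retry job exactly which documents are out of date.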

There are other ways to sync MySQL and ES that are less brittle, e.g. by using the binlog.
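To give a feel for the binlog route, the sketch below turns a list of row-change events into a request body in the newline-delimited JSON format used by the Elasticsearch bulk API. The event shape (`type` plus `row`) and the function name `changesToBulkBody` are assumptions made for this example; a real binlog reader delivers its own structure, and reading the binlog itself is not shown.

```php
<?php

/**
 * Converts row-change events into a bulk request body (NDJSON).
 * The event shape ['type' => ..., 'row' => [...]] is hypothetical.
 */
function changesToBulkBody(string $index, array $events): string
{
    $lines = [];
    foreach ($events as $event) {
        $id = $event['row']['id'];
        if ($event['type'] === 'delete') {
            // A delete consists of an action line only.
            $lines[] = json_encode(['delete' => ['_index' => $index, '_id' => $id]]);
            continue;
        }
        // Inserts and updates both become "index", which overwrites by id.
        $lines[] = json_encode(['index' => ['_index' => $index, '_id' => $id]]);
        $lines[] = json_encode($event['row']);
    }
    // The bulk format is newline-delimited and must end with a newline.
    return $lines === [] ? '' : implode("\n", $lines) . "\n";
}

$body = changesToBulkBody('some_index_elasticsearch', [
    ['type' => 'insert', 'row' => ['id' => 1, 'title' => 'A', 'body' => 'first']],
    ['type' => 'delete', 'row' => ['id' => 2]],
]);
```

Because the events are derived from what MySQL actually committed, the application code no longer has to remember to call the index on every write path.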

You need to ask yourself those questions and figure out a strategy to mitigate those potential problems, because I can assure you they (and others) will definitely arise.

To sum up, there's no problem with your architecture; thousands of companies do the exact same thing. However, you need a fallback for when your sync plan goes south.
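One simple piece of such a plan is a periodic reconciliation job. The sketch below assumes the two id lists have already been fetched from MySQL and from the index (that part is omitted), and the function name `diffIds` is hypothetical. It only detects documents that are missing or left over, not documents whose content has drifted.

```php
<?php

/**
 * Compares the ids in MySQL with the ids in the index and reports the drift.
 * Both inputs are plain arrays of ids; fetching them is out of scope here.
 */
function diffIds(array $mysqlIds, array $indexedIds): array
{
    return [
        // In MySQL but not in the index: these need to be (re)indexed.
        'missing' => array_values(array_diff($mysqlIds, $indexedIds)),
        // In the index but no longer in MySQL: these need to be deleted.
        'stale'   => array_values(array_diff($indexedIds, $mysqlIds)),
    ];
}

$drift = diffIds([1, 2, 3, 4], [2, 3, 5]);
// $drift['missing'] is [1, 4] and $drift['stale'] is [5]
```

For large tables the comparison would have to be done in batches, or replaced by a full reindex from MySQL into a fresh index.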
