How to set up an ElasticSearch index structure with multiple entity bindings


Question


I recently started integrating ElasticSearch (ES) into a legacy e-commerce application written in PHP on top of MySQL. I am completely new to all of this. Reading the docs is fine, but I really need advice from somebody with experience.

From the ES documentation I was able to set up a new cluster. I also found out that rivers are deprecated and should be replaced, so I replaced them with Logstash and the JDBC MySQL connector.

At this point I have:

  • ElasticSearch
  • Logstash
  • JDBC MySQL driver
  • MySQL server

The database structure of the application is far from optimal and very hard to replace, but I'd like to replicate it into the ES index in the best possible way.

Database structure:

Products

+-------------------------------+-------+--------+
|              Id               | Title | Price  |
+-------------------------------+-------+--------+
| 00c8234d71c4e94f725cd432ebc04 | Alpha | 589,00 |
| 018357657529fef056cf396626812 | Beta  | 355,00 |
| 01a2c32ceeff0fc6b7dd4fc4302ab | Gamma | 0,00   |
+-------------------------------+-------+--------+

Flags

+------------+-------------+
|     Id     |    Title    |
+------------+-------------+
| sellout    | Sellout     |
| discount   | Discount    |
| topproduct | Top Product |
+------------+-------------+

flagsProducts (n:m pivot)

+------+-------------------------------+------------+------------+
|  Id  |           ProductId           |   FlagId   | ExternalId |
+------+-------------------------------+------------+------------+
| 1552 | 00c8234d71c4e94f725cd432ebc04 | sellout    | NULL       |
| 2845 | 00c8234d71c4e94f725cd432ebc04 | topproduct | NULL       |
| 9689 | 018357657529fef056cf396626812 | discount   | NULL       |
| 4841 | 01a2c32ceeff0fc6b7dd4fc4302ab | discount   | NULL       |
+------+-------------------------------+------------+------------+

Those string IDs are a complete disaster, but I have to live with them for now. At first I thought I should index Products into ES as a flat structure, but how should I handle the multiple entity bindings?

Recommended answer

That's a great start!

I would definitely flatten it all out (i.e. denormalize) and come up with product documents that look like the ones below. That way you get rid of the N:M relationship between products and flags by simply creating a flags array on each product, which also makes those flags easier to query.

{
   "id": "00c8234d71c4e94f725cd432ebc04",
   "title": "Alpha",
   "price": 589.0,
   "flags": ["Sellout", "Top Product"]
}
{
   "id": "018357657529fef056cf396626812",
   "title": "Beta",
   "price": 355.0,
   "flags": ["Discount"]
}
{
   "id": "01a2c32ceeff0fc6b7dd4fc4302ab",
   "title": "Gamma",
   "price": 0.0,
   "flags": ["Discount"]
}

The product mapping type would look like this:

PUT products
{
    "mappings": {
        "product": {
            "properties": {
                "id": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "title": {
                    "type": "string"
                },
                "price": {
                    "type": "double",
                    "null_value": 0.0
                },
                "flags": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}
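
To illustrate why the flattened flags array is easy to query: because flags is mapped as not_analyzed, each flag title is indexed as a single exact term. A minimal sketch of an exact-match search, assuming the products index and the sample documents above (the choice of "Discount" is only for illustration):

```
GET products/_search
{
    "query": {
        "term": {
            "flags": "Discount"
        }
    }
}
```

With the three sample documents, this should match Beta and Gamma. Note that the match is case-sensitive, since the value is not analyzed.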

Since you already have the Logstash jdbc input, all you're missing is the proper SQL query to fetch the products and their associated flags.

  SELECT p.Id as id, p.Title as title, p.Price as price, GROUP_CONCAT(f.Title) as flags
    FROM Products p
    JOIN flagsProducts fp ON fp.ProductId = p.Id
    JOIN Flags f ON fp.FlagId = f.id
GROUP BY p.Id

Which would get you rows like these:

+-------------------------------+-------+-------+---------------------+
| id                            | title | price | flags               |
+-------------------------------+-------+-------+---------------------+
| 00c8234d71c4e94f725cd432ebc04 | Alpha |   589 | Sellout,Top Product |
| 018357657529fef056cf396626812 | Beta  |   355 | Discount            |
| 01a2c32ceeff0fc6b7dd4fc4302ab | Gamma |     0 | Discount            |
+-------------------------------+-------+-------+---------------------+

Using Logstash filters you can then split the flags field into an array, and you're good to go.
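
As a minimal sketch of that last step, assuming the jdbc input already runs the query above, a mutate filter can split the comma-separated flags column into an array. The host address and the use of the id column as the document ID are assumptions for illustration and are not part of the original answer.

```
filter {
  mutate {
    # Turn "Sellout,Top Product" into ["Sellout", "Top Product"]
    split => { "flags" => "," }
  }
}

output {
  elasticsearch {
    # Assumed local node; adjust to your cluster
    hosts => ["localhost:9200"]
    index => "products"
    # Assumption: reuse the MySQL product Id so that re-runs
    # overwrite existing documents instead of duplicating them
    document_id => "%{id}"
  }
}
```

Splitting on a comma assumes that no flag title contains one. If that cannot be guaranteed, GROUP_CONCAT accepts a SEPARATOR clause (for example `GROUP_CONCAT(f.Title SEPARATOR '|')`), and the same delimiter can then be used in the split.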
