如何使用多个实体绑定来设置ElasticSearch索引结构 [英] How to setup ElasticSearch index structure with multiple entity bindings

查看:167
本文介绍了如何使用多个实体绑定来设置ElasticSearch索引结构的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述



最近我开始使用ElasticSearch(ES)实现到使用MySQL编写的旧版电子商务应用程序中。我完全是全新的,所有这些东西,阅读文档是好的,但我真的需要有经验的人来告诉我。



从ES文档我能够设置一个新的集群,我还发现,河流被弃用,应该被替换,所以我用Logstash和JDBC MySQL连接器替换它们。



在这一点上我有: / p>


  • ElasticSearch

  • Logstash

  • JDBC MySQL驱动程序

  • MySQL服务器



应用程序的数据库结构并不是最佳的,很难替代,但是我想以最好的方式将其复制到ES索引中。



数据库结构:



产品

  + --------------- ---------------- + ------- + -------- + 
| Id |标题|价格|
+ ------------------------------- + ------- + ----- --- +
| 00c8234d71c4e94f725cd432ebc04 |阿尔法| 589,00 |
| 018357657529fef056cf396626812 | Beta | 355,00 |
| 01a2c32ceeff0fc6b7dd4fc4302ab |伽玛| 0,00 |
+ ------------------------------- + ------- + ----- --- +

标志

  + ------------ + ------------- + 
| Id |标题|
+ ------------ + ------------- +
|出售| Sellout |
|折扣|折扣|
| topproduct |热门产品|
+ ------------ + ------------- +

flags产品(n:m pivot)

  ------ + ------------------------------- + ----------- -  + ------------ + 
| Id | ProductId | FlagId | ExternalId |
+ ------ + ------------------------------- + ------ ------ + ------------ +
| 1552 | 00c8234d71c4e94f725cd432ebc04 |出售|空|
| 2845 | 00c8234d71c4e94f725cd432ebc04 | topproduct | NULL |
| 9689 | 018357657529fef056cf396626812 |折扣| NULL |
| 4841 | 01a2c32ceeff0fc6b7dd4fc4302ab |折扣| NULL |
+ ------ + ------------------------------- + ------ ------ + ------------ +

那些字符串ID是一个完整的灾难(但我现在必须处理它们)。起初我以为我应该对ES的产品索引进行平面结构,但是多个实体绑定呢?

解决方案

很好的开始!



我一定会把它全部弄平(即 denormalize ),并找出如下所示的产品文档。这样,您可以通过为每个产品创建一个 flags 数组来消除产品和标志之间的N:M关系。因此更容易查询这些标志。

  {
id:00c8234d71c4e94f725cd432ebc04,
title:Alpha,
price:589.0,
flags:[Sellout,Top Product]
}
{
id:018357657529fef056cf396626812,
title:Beta,
price:355.0,
flags:[Discount]
}
{
id:01a2c32ceeff0fc6b7dd4fc4302ab,
title:Gamma,
price:0.0,
flags ]
}

产品映射类型如下所示:

  PUT products 
{
mappings:{
product:{
properties :{
id:{
type:string,
index:not_analyzed
},
title:{
type:string
},
price:{
type:double,
null_value:0.0
},
flags:{
type:string,
index :not_analyzed
}
}
}
}
}

由于您已经有logstash jdbc 输入,所有您缺少的是正确的SQL查询来获取产品和关联的标志。 / p>

  SELECT p.Id as id,p.Title as title,p.Price as price,GROUP_CONCAT(f.Title)as flags 
FROM产品p
JOIN flags产品fp ON fp.ProductId = p.Id
JOIN标志f ON fp.FlagId = f.id
GROUP BY p.Id

哪些会让你像这样的行:

  + ------------------------------- + ------- +  - ----- + --------------------- + 
| id |标题|价格|标志|
+ ------------------------------- + ------- + ----- - + --------------------- +
| 00c8234d71c4e94f725cd432ebc04 |阿尔法| 589 | Sellout,热门产品|
| 018357657529fef056cf396626812 | Beta | 355 |折扣|
| 01a2c32ceeff0fc6b7dd4fc4302ab |伽玛| 0 |折扣|
+ ------------------------------- + ------- + ----- - + --------------------- +

使用Logstash过滤器,您可以将标志分割成数组,你很好去。



recently I started working on ElasticSearch (ES) implementation into legacy e-commerce app written in PHP using MySQL. I am completely new to all this stuff and reading the docs is fine, yet I really need somebody with experience to advise me.

From the ES documentation I was able to setup a new cluster and I also found out that rivers are deprecated and should be replaced, so I replaced them with Logstash and JDBC MySQL connector.

At this point I have:

  • ElasticSearch
  • Logstash
  • JDBC MySQL driver
  • MySQL server

The database structure of the application is not really optimal and is very hard to replace, but I'd like to replicate it into the ES index in the best possible way.

DB Structure:

Products

+-------------------------------+-------+--------+
|              Id               | Title | Price  |
+-------------------------------+-------+--------+
| 00c8234d71c4e94f725cd432ebc04 | Alpha | 589,00 |
| 018357657529fef056cf396626812 | Beta  | 355,00 |
| 01a2c32ceeff0fc6b7dd4fc4302ab | Gamma | 0,00   |
+-------------------------------+-------+--------+

Flags

+------------+-------------+
|     Id     |    Title    |
+------------+-------------+
| sellout    | Sellout     |
| discount   | Discount    |
| topproduct | Top Product |
+------------+-------------+

flagsProducts (n:m pivot)

+------+-------------------------------+------------+------------+
|  Id  |           ProductId           |   FlagId   | ExternalId |
+------+-------------------------------+------------+------------+
| 1552 | 00c8234d71c4e94f725cd432ebc04 | sellout    | NULL       |
| 2845 | 00c8234d71c4e94f725cd432ebc04 | topproduct | NULL       |
| 9689 | 018357657529fef056cf396626812 | discount   | NULL       |
| 4841 | 01a2c32ceeff0fc6b7dd4fc4302ab | discount   | NULL       |
+------+-------------------------------+------------+------------+

Those string IDs are a complete disaster (but I have to deal with them now). At first I thought I should do a flat structure of Products index to ES, but what about multiple entity bindings?

解决方案

That's a great start!

I would definitely flatten it all out (i.e. denormalize) and come up with product documents that look like the one below. That way you get rid of the N:M relationship between products and flags by simply creating a flags array for each product. It will thus be easier to query those flags.

{
   "id": "00c8234d71c4e94f725cd432ebc04",
   "title": "Alpha",
   "price": 589.0,
   "flags": ["Sellout", "Top Product"]
}
{
   "id": "018357657529fef056cf396626812",
   "title": "Beta",
   "price": 355.0,
   "flags": ["Discount"]
}
{
   "id": "01a2c32ceeff0fc6b7dd4fc4302ab",
   "title": "Gamma",
   "price": 0.0,
   "flags": ["Discount"]
}

The product mapping type would look like this:

PUT products
{
    "mappings": {
        "product": {
            "properties": {
                "id": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "title": {
                    "type": "string"
                },
                "price": {
                    "type": "double",
                    "null_value": 0.0
                },
                "flags": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}

Since you have the logstash jdbc input already, all you're missing is the proper SQL query to fetch the products and associated flags.

  SELECT p.Id as id, p.Title as title, p.Price as price, GROUP_CONCAT(f.Title) as flags
    FROM Products p
    JOIN flagsProducts fp ON fp.ProductId = p.Id
    JOIN Flags f ON fp.FlagId = f.id
GROUP BY p.Id

Which would get you rows like these:

+-------------------------------+-------+-------+---------------------+
| id                            | title | price | flags               |
+-------------------------------+-------+-------+---------------------+
| 00c8234d71c4e94f725cd432ebc04 | Alpha |   589 | Sellout,Top product |
| 018357657529fef056cf396626812 | Beta  |   355 | Discount            |
| 01a2c32ceeff0fc6b7dd4fc4302ab | Gamma |     0 | Discount            |
+-------------------------------+-------+-------+---------------------+

Using Logstash filters you can then split the flags into an array and you're good to go.

这篇关于如何使用多个实体绑定来设置ElasticSearch索引结构的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆